Commit 757ac29

Test that M(XY) <= M(X) + M(Y) if X <= Y for most probes
1 parent f4e385e commit 757ac29

2 files changed: +69 -29 lines changed

docs/Measures-of-disorder.md

Lines changed: 8 additions & 2 deletions
@@ -144,7 +144,7 @@ Measures of disorder are pretty formalized, so the names of the functions in the
 Computes the number of elements in a sequence that aren't followed by the same element in the sorted sequence.

 Our implementation is slightly different from the original description in *Sublinear merging and natural mergesort* by S. Carlsson, C. Levcopoulos and O. Petersson:
-* It doesn't add 1 to the general result, thus returning 0 when $X$ is sorted - therefore respecting the Mannila definition of a MOP.
+* It doesn't add 1 to the general result, thus returning 0 when $X$ is sorted and respecting Mannila's first criterion for what makes a measure of presortedness (though this change might be responsible for the breakage of criterion 4).
 * It explicitly handles *equivalent elements*, while the original formal definition makes it difficult.

 | Complexity | Memory | Iterators |
@@ -153,6 +153,8 @@ Our implementation is slightly different from the original description in *Subli

 `max_for_size`: $|X| - 1$ when $X$ is sorted in reverse order.

+**Note:** `probe::block` does not respect Mannila's criterion 4: $Block(\langle 1, 0 \rangle) = 1$ and $Block(\langle 2, 3 \rangle) = 0$, but $Block(\langle 1, 0, 2, 3 \rangle) = 2$.
+
 ### *Dis*

 ```cpp
@@ -285,6 +287,8 @@ The measure of disorder is slightly different from its original description in [

 `max_for_size`: $\frac{|X| + 1}{2} - 1$ when $X$ is a sequence of elements that are alternatively greater then lesser than their previous neighbour.

+**Note:** `probe::mono` does not respect Mannila's criterion 4: $Mono(\langle 1, 2, 3, 4, 5 \rangle) = 0$ and $Mono(\langle 10, 9, 8, 7, 6 \rangle) = 0$, but $Mono(\langle 1, 2, 3, 4, 5, 10, 9, 8, 7, 6 \rangle) = 1$.
+
 ### *Osc*

 ```cpp
@@ -299,7 +303,9 @@ Computes the *Oscillation* measure described by C. Levcopoulos and O. Petersson

 `max_for_size`: $\frac{|X|(|X| - 2) - 1}{2}$ when the values in $X$ are strongly oscillating.

-**Note:** *Osc* does not respect Mannila's criterion 5: $Osc(\langle 2, 4, 1, 3, 1, 3 \rangle) \not \le |\langle 4, 1, 3, 1, 3 \rangle| + Osc(\langle 4, 1, 3, 1, 3 \rangle)$, though it is possible that it only happens when equivalent elements are involved.
+**Note:** *Osc* does not respect Mannila's criterion 4: $Osc(\langle 0 \rangle) = 0$ and $Osc(\langle 3, 2, 1 \rangle) = 0$, but $Osc(\langle 0, 3, 2, 1 \rangle) = 2$.
+
+**Note²:** *Osc* does not respect Mannila's criterion 5: $Osc(\langle 2, 4, 1, 3, 1, 3 \rangle) \not \le |\langle 4, 1, 3, 1, 3 \rangle| + Osc(\langle 4, 1, 3, 1, 3 \rangle)$, though it is possible that it only happens when equivalent elements are involved.

 ### *Rem*
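
The `probe::block` counterexample recorded in the documentation change above can be checked by hand. The following is a minimal sketch, not cpp-sort's actual implementation: `block_sketch` is a hypothetical helper that assumes all elements are distinct and counts the adjacent pairs whose second element is not the immediate successor of the first in the sorted sequence.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not cpp-sort's probe::block).
// Assumption: all elements of x are distinct.
// Counts positions i where x[i + 1] is not the element that directly
// follows x[i] in the sorted version of x.
std::size_t block_sketch(const std::vector<int>& x)
{
    if (x.size() < 2) {
        return 0;
    }

    std::vector<int> sorted = x;
    std::sort(sorted.begin(), sorted.end());

    std::size_t breaks = 0;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        // Find x[i] in the sorted copy, then look at its successor there
        auto it = std::lower_bound(sorted.begin(), sorted.end(), x[i]);
        auto next = it + 1;
        if (next == sorted.end() || *next != x[i + 1]) {
            ++breaks;
        }
    }
    return breaks;
}
```

Under these assumptions, `block_sketch` gives 1 for $\langle 1, 0 \rangle$, 0 for $\langle 2, 3 \rangle$ and 2 for $\langle 1, 0, 2, 3 \rangle$, so the value for the concatenation exceeds the sum of the two parts even though every element of the first part is smaller than every element of the second.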

tests/probes/every_probe_common.cpp

Lines changed: 61 additions & 27 deletions
@@ -179,6 +179,66 @@ TEMPLATE_TEST_CASE( "test M(subsequence(X)) <= M(X) for most probes M", "[probe]
     });
 }

+namespace
+{
+    // Split a sequence into two consequent subsequences X and Y,
+    // with all elements of X being not greater than all elements of Y,
+    // the split point is chosen at random
+
+    auto split_in_two(std::vector<int>& sequence)
+        -> std::vector<int>::iterator
+    {
+        using diff_t = std::vector<int>::difference_type;
+        using param_t = std::uniform_int_distribution<diff_t>::param_type;
+
+        auto size = static_cast<diff_t>(sequence.size());
+        std::uniform_int_distribution<diff_t> dist;
+        auto x_begin = sequence.begin();
+        auto y_begin = x_begin + dist(hasard::engine(), param_t{0, size});
+        std::nth_element(x_begin, y_begin, sequence.end());
+        return y_begin;
+    }
+}
+
+TEMPLATE_TEST_CASE( "test M(XY) <= M(X) + M(Y) if X <= Y for most probes M", "[probe]",
+                    decltype(cppsort::probe::dis),
+                    decltype(cppsort::probe::enc),
+                    decltype(cppsort::probe::exc),
+                    decltype(cppsort::probe::max),
+                    decltype(cppsort::probe::sus) )
+{
+    // Fourth property formalized by Mannila
+    // Ensure that the disorder found in the concatenation of two sequences is not
+    // greater than the sum of the individual sequences' disorders when all elements
+    // of the second sequence are greater than all elements of the first sequence
+
+    // Note: some measures do not appear here because we test a stronger bound
+    //       instead (see the next test)
+
+    rc::prop("M(XY) ≤ M(X) + M(Y) if X ≤ Y", [](std::vector<int> sequence) {
+        auto y_begin = split_in_two(sequence);
+        std::decay_t<TestType> measure;
+        return measure(sequence) <= measure(sequence.begin(), y_begin) + measure(y_begin, sequence.end());
+    });
+}
+
+TEMPLATE_TEST_CASE( "test M(XY) = M(X) + M(Y) if X <= Y for some probes M", "[probe]",
+                    decltype(cppsort::probe::ham),
+                    decltype(cppsort::probe::inv),
+                    decltype(cppsort::probe::rem),
+                    decltype(cppsort::probe::runs),
+                    decltype(cppsort::probe::spear) )
+{
+    // Property formalized by Estivill-Castro in *Sorting and Measures of Disorder*
+    // It is a stronger bound on Mannila's fourth property that some measures satisfy
+
+    rc::prop("M(XY) = M(X) + M(Y) if X ≤ Y", [](std::vector<int> sequence) {
+        auto y_begin = split_in_two(sequence);
+        std::decay_t<TestType> measure;
+        return measure(sequence) == measure(sequence.begin(), y_begin) + measure(y_begin, sequence.end());
+    });
+}
+
 TEMPLATE_TEST_CASE( "test M(aX) <= |X| + M(X) for most probes M", "[probe]",
                     decltype(cppsort::probe::block),
                     decltype(cppsort::probe::dis),
@@ -192,7 +252,7 @@ TEMPLATE_TEST_CASE( "test M(aX) <= |X| + M(X) for most probes M", "[probe]",
                     decltype(cppsort::probe::sus) )
 {
     // Fifth property formalized by Mannila
-    // The following probes don't satisfy it: ham, osc, spear
+    // The following probes don't satisfy it: Ham, Osc, Spear

     rc::prop("M(⟨a⟩X) ≤ |X| + M(X)", [](const std::vector<int>& sequence) {
         std::decay_t<TestType> measure;
@@ -307,29 +367,3 @@ TEMPLATE_TEST_CASE( "test monotonicity", "[probe]",
             : disorder_wyz <= disorder_wxz;
     });
 }
-
-TEMPLATE_TEST_CASE( "test M(XY) = M(X) + M(Y) if X <= Y for most probes M", "[probe]",
-                    decltype(cppsort::probe::ham),
-                    decltype(cppsort::probe::inv),
-                    decltype(cppsort::probe::rem),
-                    decltype(cppsort::probe::runs),
-                    decltype(cppsort::probe::spear) )
-{
-    // Property formalized by Estivill-Castro in *Sorting and Measures of Disorder*
-    // Not all measures of presortedness satisfy it, but a lot do
-
-    rc::prop("M(XY) = M(X) + M(Y) if X ≤ Y", [](std::vector<int> sequence) {
-        using diff_t = std::vector<int>::difference_type;
-        using param_t = std::uniform_int_distribution<diff_t>::param_type;
-
-        // Split the sequence into two consequent subsequences X and Y
-        auto size = static_cast<diff_t>(sequence.size());
-        std::uniform_int_distribution<diff_t> dist;
-        auto x_begin = sequence.begin();
-        auto y_begin = x_begin + dist(hasard::engine(), param_t{0, size});
-        std::nth_element(x_begin, y_begin, sequence.end());
-
-        std::decay_t<TestType> measure;
-        return measure(sequence) == measure(x_begin, y_begin) + measure(y_begin, sequence.end());
-    });
-}
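
The stronger property tested for *Inv* in this diff, $M(XY) = M(X) + M(Y)$ when $X \le Y$, can be illustrated without cpp-sort or RapidCheck. The sketch below is only an illustration under stated assumptions: `count_inversions` and `inv_is_additive` are hypothetical names, the inversion count is a naive quadratic one rather than `cppsort::probe::inv`, and the split position is passed in explicitly instead of being drawn at random.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: naive O(n^2) count of pairs (i, j) with i < j
// and *j < *i over the range [first, last).
template<typename Iterator>
std::size_t count_inversions(Iterator first, Iterator last)
{
    std::size_t count = 0;
    for (auto it = first; it != last; ++it) {
        for (auto jt = std::next(it); jt != last; ++jt) {
            if (*jt < *it) {
                ++count;
            }
        }
    }
    return count;
}

// Hypothetical check mirroring the idea of split_in_two: rearrange the
// sequence so that every element before the split is not greater than
// every element from the split onwards, then compare Inv(XY) with
// Inv(X) + Inv(Y).
bool inv_is_additive(std::vector<int> sequence, std::size_t split)
{
    split = std::min(split, sequence.size());
    auto y_begin = sequence.begin() + static_cast<std::ptrdiff_t>(split);
    if (y_begin != sequence.end()) {
        // After this call, [begin, y_begin) <= *y_begin <= [y_begin, end)
        std::nth_element(sequence.begin(), y_begin, sequence.end());
    }

    auto whole = count_inversions(sequence.begin(), sequence.end());
    auto left = count_inversions(sequence.begin(), y_begin);
    auto right = count_inversions(y_begin, sequence.end());
    return whole == left + right;
}
```

Because `std::nth_element` leaves no element of $Y$ smaller than an element of $X$, no inversion can straddle the split point, which is why the equality holds for *Inv* regardless of where the split falls.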
