Re: [boost] [SORT] Parallel Algorithms

27 Mar 2015

      Hi Francisco,

On Wed, Mar 25, 2015 at 3:57 PM Francisco José Tapia <fjtapia@gmail.com>
wrote:
...
TEST OBJECTS
The object used in the test ( Reburcio is a word of an exercise from my
first pascal book, and don't have any sense) In the construction , copy
construction and operator =, copy all the elements with a loop. The only
one operation which use only the first element is the comparison.
I had done a new comparison, and compare the sum of all the elements of the
object. The time are similar, you must add 5% in the small objects to 15%
in the big objects.
I'm not precisely sure of your meaning, but I suggested modification of the
test object to use default operators where possible, to randomize the
entire contents of the object (so that not all integers in the array are
the same), and to use the entire contents of the object for comparison.  I
believe that these suggested changes make it more reflective of the
conventional large-object comparison case.  Most of the time, only the
first integer will be compared, but in equal cases the entire array will be
compared.  I don't like feeding overly simplified objects into benchmarks,
because compilers can make some surprising optimizations.
...
TIMSORT
The Tim Sort code is not mine.
Agreed.  I think we can drop it from this comparison, as your algorithm is
clearly much faster.  I mentioned it because it is used with other
languages, and responded when somebody else on the email group suggested I
allow it.  I'm not asking you for an implementation.
Timsort uses N/2 additional memory.
...
DESIGN PHASE
Now , we are in the design phase. Nothing is fixed. We can comment and
change anything.
The specification is simple, provide sort algorithms, stable and not
stable, for to run with 1 thread or with any number of threads, without any
other external library ( as Open MP , TBB, CilkPlus..). Only with the C++
standard.
If you want to discuss about this, we can do. We admit ideas, suggestions,
opinions. Any help is welcome. The final decision about if the code is
included or not in the boost sort library, depend of Steven, the creator of
the library.
After the design phase, with the decisions taken and the interfaces fixed.
We prepare the final version, redesign a test system, test in deep and
create the documentation.
I suggest deciding what you want to try to include.  Getting a competitive
parallel unstable sort to tbb will be hard, and I'm skeptical of including
a library that just replicates what tbb provides for easy use.
...
I tried to improve, I would like to have more time, more equipment, and
better documentation on order to find the best options, but I don't have,
and I do this in my free time after my work and my family.
I'm in the same situation with regards to available time.  I think many
boost participants are.
...
I designed 6 stable sort algorithms, for to select the fast (internally
named smart_merge_sort, about 10% faster than the GCC version). I prepared
a version of Sample Sort, only with threads and atomic variables. In the
benchmarks have similar performance than the TBB version, use one half of
the auxiliary memory used by TBB and is exception safe.
I like your parallel stable sorts.  They look like the most promising part
of your proposed library additions.  Indirect sorting is also interesting.
I just want reliable large-object and string comparison numbers for them
before diving in further.

Have you looked at integrating your testing with the approach in the Boost
Sort library?  The randomgen binary generates files with different types of
random number distributions that can test a variety of scenarios, and the
tune.pl script controls automated performance testing.
...
In my opinion, the parallel stable sort is reasonably good for to be
included in the final version. The important case, which shows the quality
of the algorithm, is when the data are unsorted. I will try to implement
the detection of the sorted elements , like in TBB, but I am not sure about
the results. Th internal loops of the algorithm are delicate, and a small
change , can produce big changes in the performance, as happened in other
algorithms
Is there a way you can do a fast initial pass, checking for already-sorted
data?  That's what Spreadsort does (at each level of the sort).

Re: [boost] [SORT] Parallel Algorithms

Steven Ross