[sorting] What are the most complicated sorting problems?

Hi. Since people want the sorting library to have something that the standard library sorts do not offer, I would like to know the most complicated sorting situations that user have encountered (like sort by name, and then first three digits of phone number, and if the sum of the first three digits of the phone number is less than 10, sort by the second 4 digits of the phone number). I ask this because radixsort and radix quicksort are able to do those sorts, but I need input from users to add features. Also, even if you do not have any complicated sorts, ideas in general would help.

Sam Schetterer wrote:
Hi. Since people want the sorting library to have something that the standard library sorts do not offer, I would like to know the most complicated sorting situations that user have encountered (like sort by name, and then first three digits of phone number, and if the sum of the first three digits of the phone number is less than 10, sort by the second 4 digits of the phone number). I ask this because radixsort and radix quicksort are able to do those sorts, but I need input from users to add features. Also, even if you do not have any complicated sorts, ideas in general would help. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Functionality that i would find useful (roughly sorted form most to least important): * provide an easy way to sort lists of pointers (without having to make a custom '<' operator each time).* Easy way to specify different comparison operators for various subfields. ie. in stead of one complex '<' operator, specify one for each subfield and then define sortorder (which subfield first, ...) on sorttime. * provide a good set of default '<' operators. at least: - simple types (strings, numbers, ...) - pointers to simple types - pairs of the above * Specify ascending vs. descending sorting. Trivial criteria really, but reduces complexity (or number) of the '<' operators. * return the n highest/lowest only. (optimized, ie: not full sort and then cutting solution) A complicated sort example i came across: * The datatype is a struct of vectors of strings * The criterium is the same dataype. * sort according to how good the vectors of strings correlate in the list values and the criterium (many same values). Another sort example from a different domain: Take a set of points in 3d space. sort them (anti-)clockwise around a given point. sort them according to the length along a path. Both of these would need to associate a "sort value" with each point in a first pass to be able to sort efficient. This can be left to the user, but could also be handled by a library since the rest (establishing a 1-1 relationship of "sort value" and actual value and sorting) is the same in each application. regards, Michael

* provide an easy way to sort lists of pointers (without having to make a custom '<' operator each time).
this is already in boost: some_iterator begin, end; std::sort(boost::indirect_iterator(begin), boost::indirect_iterator(end)); * Easy way to specify different
comparison operators for various subfields. ie. in stead of one complex '<' operator, specify one for each subfield and then define sortorder (which subfield first, ...) on sorttime.
This is an intriguing possibility, what about some intrusive design, where you can tag class members as sort keys?
* provide a good set of default '<' operators. at least: - simple types (strings, numbers, ...) - pointers to simple types - pairs of the above
Simple types and pairs already have operator<, and pointers are handled by indirect_iterator.
* Specify ascending vs. descending sorting. Trivial criteria really, but reduces complexity (or number) of the '<' operators.
std::sort(begin, end, std::greater<type>());
* return the n highest/lowest only. (optimized, ie: not full sort and then cutting solution)
std::partial_sort -Lewis

On 3/19/07, Lewis Hyatt <lhyatt@princeton.edu> wrote:
* provide an easy way to sort lists of pointers (without having to make a custom '<' operator each time).
this is already in boost:
some_iterator begin, end; std::sort(boost::indirect_iterator(begin), boost::indirect_iterator(end));
or by using Boost.Lambda: sort(begin, end, *_1 < *_2);
* Easy way to specify different
comparison operators for various subfields. ie. in stead of one complex '<' operator, specify one for each subfield and then define sortorder (which subfield first, ...) on sorttime.
This is an intriguing possibility, what about some intrusive design, where you can tag class members as sort keys?
Or a simple lambda again? sort(begin, end, bind(&some_type::some_field, _1) < bind(&some_type::some_field, _2)); This also automagically handles containers of (potentially smart) pointers. With recent boost versions, even plain bind can be used with relational operators. You can also use this syntax with containers of pointers sort(begin, end, _1->*&some_type::some_field) < _2->*&some_type::some_field); With containers to plain types you have to do, sort(begin, end, (&_1)->*&some_type::some_field) < (&_2)->*&some_type::some_field); While ->* is almost idea, I tend to use an explicit bind. gpd

Or a simple lambda again?
sort(begin, end, bind(&some_type::some_field, _1) < bind(&some_type::some_field, _2));
Yes that is a good point. I wonder if it might be worth encapsulating the idea of multi-sorting on several different members, though. I bet it would be possible to generate the necessary lambda expressions automatically... -Lewis

Lewis Hyatt wrote:
* provide an easy way to sort lists of pointers (without having to make a custom '<' operator each time).
this is already in boost:
some_iterator begin, end; std::sort(boost::indirect_iterator(begin), boost::indirect_iterator(end));
nice, didn't know that one :)
* Easy way to specify different
comparison operators for various subfields. ie. in stead of one complex '<' operator, specify one for each subfield and then define sortorder (which subfield first, ...) on sorttime.
This is an intriguing possibility, what about some intrusive design, where you can tag class members as sort keys?
I'm fine with intrusive, but actually i would prefer an adapter class. something like (simplified interface for readability): template<typename SourceClass,typename FieldType1,typename fieldType2, ...> class sort_field_adapter { const FieldType1& extractField1(const SourceClass& source); const FieldType1& extractField2(const SourceClass& source); ... }; rationale: This could be used to create sort keys "on the fly" (for example, sort by sum of member a+b). I am not sure if it is possible to define an easy, generic adapter interface due to the unknown number and types of the fields.
* provide a good set of default '<' operators. at least: - simple types (strings, numbers, ...) - pointers to simple types - pairs of the above
Simple types and pairs already have operator<, and pointers are handled by indirect_iterator.
is it possible to use indirect_iterator (or similar) if only one half of a pair is a pointer ?
* Specify ascending vs. descending sorting. Trivial criteria really, but reduces complexity (or number) of the '<' operators.
std::sort(begin, end, std::greater<type>());
ofc, but what about your custom < operators, functors. You would need to supply a second, inverted version. Michael
participants (5)
-
Giovanni Piero Deretta
-
Lewis Hyatt
-
Michael Lacher
-
Sam Schetterer
-
Yuriy Koblents-Mishke