In Sean Parent's CppCon keynote last week, he went through the process of
implementing an algorithm he called "sort_subrange", which takes four
iterators describing two ranges, one a subset of the other, and sorts the
subrange - as if you had sorted the entire range. (See
https://www.youtube.com/watch?v=sWgDk-o-6ZE, starting at about 31:30).
I have implemented this in Boost.Algorithm (crediting Sean) - called
partition_sort.
However, while I was doing this, I thought of a different algorithm; one
that gathered all the elements into the subrange as if the entire range was
sorted, but didn't actually do the sorting.
I've implemented that one, too - but I'm having a bit of trouble coming up
with a name.
I've used "partition_subrange", but that not that clear.
Sean has suggested "elements_in_subrange" and "elements_within_subrange".
Here's the declaration:
template
On 2015-10-01 16:24, Marshall Clow wrote:
In Sean Parent's CppCon keynote last week, he went through the process of implementing an algorithm he called "sort_subrange", which takes four iterators describing two ranges, one a subset of the other, and sorts the subrange - as if you had sorted the entire range. (See https://www.youtube.com/watch?v=sWgDk-o-6ZE, starting at about 31:30).
I have implemented this in Boost.Algorithm (crediting Sean) - called partition_sort.
However, while I was doing this, I thought of a different algorithm; one that gathered all the elements into the subrange as if the entire range was sorted, but didn't actually do the sorting.
I've implemented that one, too - but I'm having a bit of trouble coming up with a name.
I've used "partition_subrange", but that not that clear. Sean has suggested "elements_in_subrange" and "elements_within_subrange".
Here's the declaration:
template
void partition_subrange ( Iterator first, Iterator last, Iterator sub_first, Iterator sub_last, Pred p) /// \brief Gather the elements of the subrange [sub_first, sub_last) that is /// inside the range [first, last) as if you had sorted the entire range.
Any suggestions?
Assuming that the elements in [first, sub_first) are also less or equal to the element at sub_first, this is quite similar to nth_element, isn't it? Then you could call it mth_to_nth_element Not too smooth, but it should be clear to anyone who knows about nth_element. Best, Roland
On Thu, Oct 1, 2015 at 9:50 AM, Roland Bock
On 2015-10-01 16:24, Marshall Clow wrote:
I've used "partition_subrange", but that not that clear. Sean has suggested "elements_in_subrange" and "elements_within_subrange".
Here's the declaration:
template
void partition_subrange ( Iterator first, Iterator last, Iterator sub_first, Iterator sub_last, Pred p) /// \brief Gather the elements of the subrange [sub_first, sub_last) that is /// inside the range [first, last) as if you had sorted the entire range.
Any suggestions?
Assuming that the elements in [first, sub_first) are also less or equal to the element at sub_first, this is quite similar to nth_element, isn't it? Then you could call it
mth_to_nth_element
Not too smooth, but it should be clear to anyone who knows about nth_element.
+1 for associating the name of the new function with nth_element(). Zach
On Thu, Oct 1, 2015 at 4:50 AM, Roland Bock
On 2015-10-01 16:24, Marshall Clow wrote:
In Sean Parent's CppCon keynote last week, he went through the process of implementing an algorithm he called "sort_subrange", which takes four iterators describing two ranges, one a subset of the other, and sorts the subrange - as if you had sorted the entire range. (See https://www.youtube.com/watch?v=sWgDk-o-6ZE, starting at about 31:30).
I have implemented this in Boost.Algorithm (crediting Sean) - called partition_sort.
However, while I was doing this, I thought of a different algorithm; one that gathered all the elements into the subrange as if the entire range was sorted, but didn't actually do the sorting.
I've implemented that one, too - but I'm having a bit of trouble coming up with a name.
I've used "partition_subrange", but that not that clear. Sean has suggested "elements_in_subrange" and "elements_within_subrange".
Here's the declaration:
template
void partition_subrange ( Iterator first, Iterator last, Iterator sub_first, Iterator sub_last, Pred p) /// \brief Gather the elements of the subrange [sub_first, sub_last) that is /// inside the range [first, last) as if you had sorted the entire range.
Any suggestions?
Assuming that the elements in [first, sub_first) are also less or equal to the element at sub_first, this is quite similar to nth_element, isn't it? Then you could call it
mth_to_nth_element
Not too smooth, but it should be clear to anyone who knows about nth_element.
Yes, it effectively partitions the sequence into three parts. (Hmm... maybe I should call it "Gaul"). All the values "less" than sub_first, then the values "between" sub_first and sub_last, and then the values "after" sub_last. -- Marshall
On Oct 1, 2015, at 10:24 AM, Marshall Clow
However, while I was doing this, I thought of a different algorithm; one that gathered all the elements into the subrange as if the entire range was sorted, but didn't actually do the sorting.
I've implemented that one, too - but I'm having a bit of trouble coming up with a name.
I've used "partition_subrange", but that not that clear. Sean has suggested "elements_in_subrange" and "elements_within_subrange".
Any suggestions?
Given sort_subrange(), partition_subrange() is clear to me. Josh
Marshall Clow
In Sean Parent's CppCon keynote last week, he went through the process of implementing an algorithm he called "sort_subrange", which takes four iterators describing two ranges, one a subset of the other, and sorts the subrange - as if you had sorted the entire range. (See https://www.youtube.com/watch?v=sWgDk-o-6ZE, starting at about 31:30).
I have implemented this in Boost.Algorithm (crediting Sean) - called partition_sort.
However, while I was doing this, I thought of a different algorithm; one that gathered all the elements into the subrange as if the entire range was sorted, but didn't actually do the sorting.
I've implemented that one, too - but I'm having a bit of trouble coming up with a name.
I've used "partition_subrange", but that not that clear. Sean has suggested "elements_in_subrange" and "elements_within_subrange".
The names you mention, being nouns rather than verbs, don't convey the notion that the algorithm performs a mutating action on the sequence. Some ideas: partition_gather gather_partition partition_group group_partition where "partition" can be used as an object (suffix) or a verb qualifier (prefix) Joaquín M López Muñoz Telefónica
Sorry for top posting. I blame my portable device, manufacturer unnamed.
'range' is wrong and shouldn't be used, unless somehow qualified.
The things passed in are ranges, pairs of iterators are ranges, but
sort_subrange is using 'range' to mean the after-the-fact range. ie the
range that would be there after sorting. Yes it turns out to be the same
iterators, but it's confusing. I would think sort_subrange would only sort
the elements within the subrange (verb object - sort subrange) not sort
things from the total range _into_ the subrange.
So 'into' (or similar) might help. ie sort_into_subrange for the first
one. (just as a starting idea)
The second could be gather_into_subrange. Leaves you wondering what
criteria is used to select the gathering, but you'll check the docs instead
guessing incorrectly. (NOT understanding is better than MISunderstanding
when it comes to names)
I do like gather. And the association with nth element.
To be clear, the leftover elements on the left and right aren't even
partitioned? ie elements on the left could be greater than some elements on
the right?
Sent from my portable Analytical Engine
------------------------------
*From:* "Joaquín M López Munoz"
In Sean Parent's CppCon keynote last week, he went through the process of implementing an algorithm he called "sort_subrange", which takes four iterators describing two ranges, one a subset of the other, and sorts the subrange - as if you had sorted the entire range. (See https://www.youtube.com/watch?v=sWgDk-o-6ZE, starting at about 31:30).
I have implemented this in Boost.Algorithm (crediting Sean) - called partition_sort.
However, while I was doing this, I thought of a different algorithm; one that gathered all the elements into the subrange as if the entire range was sorted, but didn't actually do the sorting.
I've implemented that one, too - but I'm having a bit of trouble coming up with a name.
I've used "partition_subrange", but that not that clear. Sean has suggested "elements_in_subrange" and "elements_within_subrange".
The names you mention, being nouns rather than verbs, don't convey the notion that the algorithm performs a mutating action on the sequence. Some ideas: partition_gather gather_partition partition_group group_partition where "partition" can be used as an object (suffix) or a verb qualifier (prefix) Joaquín M López Muñoz Telefónica _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On October 1, 2015 10:29:57 PM EDT, Gottlob Frege
'range' is wrong and shouldn't be used, unless somehow qualified. The things passed in are ranges, pairs of iterators are ranges, but sort_subrange is using 'range' to mean the after-the-fact range. ie the range that would be there after sorting. Yes it turns out to be the same iterators, but it's confusing.
I had begun a reply with similar complaints, but never finished it, so I'll jump on your bandwagon instead. I, too, found the naming confusing. I find the interface really odd, too. I've never seen an algorithm that uses an iterator range that refers to a portion of an as-if-sorted range. I suppose there are real use cases or Sean wouldn't have worked on the algorithm he presented, but it's still an odd interface.
I would think sort_subrange would only sort the elements within the subrange (verb object - sort subrange) not sort things from the total range _into_ the subrange.
Right
So 'into' (or similar) might help. ie sort_into_subrange for the first one. (just as a starting idea)
That's better, but it still fails to capture the behavior well. The algorithm acts as if the entire range is sorted, providing a view into a subrange that is actually sorted. Perhaps "view" should be in the name: view_sorted_subrange?
The second could be gather_into_subrange. Leaves you wondering what criteria is used to select the gathering, but you'll check the docs instead guessing incorrectly.
I also think "gather" should be part of the name of the second algorithm.
(NOT understanding is better than MISunderstanding when it comes to names)
Indeed
I do like gather. And the association with nth element.
The problem with nth_element is that no index is given, but that algorithm is already established, so building on that name could help. However, nth_element also specifies the order of the elements to the left and right of the nth.
To be clear, the leftover elements on the left and right aren't even partitioned? ie elements on the left could be greater than some elements on the right?
If that were specified, then the mth_to_nth_element name has merit for Sean's algorithm, but Marshall's is still left without a good name. Finding similar names that vary only on the use of "sort" vs. "gather", for example, would be helpful. ___ Rob (Sent from my portable computation engine with which I am not forced to top post)
If that were specified, then the mth_to_nth_element name has merit for
Sean's
algorithm, but Marshall's is still left without a good name. Finding similar names that vary only on the use of "sort" vs. "gather", for example, would be helpful.
One problem with both "gather" and "into" is that they suggest picking or inserting additional elements, whereas the algorithm just rearranges values. Rearranging to some criterion is sorting IMO. One problem with "into" is that it implies insertion, and therefore also picking up additional elements. I think the discussion so far underestimates how much information is in the argument names. If you have to provide a range and a subrange it is clear that the function is not just sorting the subrange. Some possible alternatives for mth_to_nth_element are: - stratum (as in stratification) - league - echelon An alternative for subrange could be segment: sort_to_league(range, segment); // Sean's function sort_to_unsorted_league( range, segment); // Marshall's function or sort_to_sorted_stratum(range, segment); // Sean's function sort_to_stratum (range, segment); // Marshall's function or sorted_stratify (range, segment) ; // Sean's function stratify (range, segment) ; // Marshall's function
On Fri, Oct 2, 2015 at 6:10 AM, alex
If that were specified, then the mth_to_nth_element name has merit for
Sean's
algorithm, but Marshall's is still left without a good name. Finding similar names that vary only on the use of "sort" vs. "gather", for example, would be helpful.
One problem with both "gather" and "into" is that they suggest picking or inserting additional elements, whereas the algorithm just rearranges values. Rearranging to some criterion is sorting IMO. One problem with "into" is that it implies insertion, and therefore also picking up additional elements.
meh, I don't think that's a big issue. 'into' is just a clue, I don't think it leads to MISunderstanding, just not full understanding. however...
I think the discussion so far underestimates how much information is in the argument names. If you have to provide a range and a subrange it is clear that the function is not just sorting the subrange.
Some possible alternatives for mth_to_nth_element are: - stratum (as in stratification) - league - echelon
An alternative for subrange could be segment:
sort_to_league(range, segment); // Sean's function sort_to_unsorted_league( range, segment); // Marshall's function
or
sort_to_sorted_stratum(range, segment); // Sean's function sort_to_stratum (range, segment); // Marshall's function
or
sorted_stratify (range, segment) ; // Sean's function stratify (range, segment) ; // Marshall's function
I really like stratify. I just mentioned this yesterday in another thread - at some point you just need to co-opt a (hopefully related) term, and give it defined meaning in programming. I can see strata (and variations) becoming a term used in a number of places. sorted_stratify() still doesn't seem quite right to me, however. Once a person understands stratify, they then try to understand what a _sorted_ stratify would be, and that leads to: - wasn't stratify already sorting (in a course-grained way) - does it sort all 3 strata (and then isn't that just sort()?) midsort_stratify() ? stratify_subsort() ? :-( getting a bit wordy and confusing Tony
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
sorted_stratify (range, segment) ; // Sean's function stratify (range, segment) ; // Marshall's function
I really like stratify.
I can see strata (and variations) becoming a term used in a number of
Me too :) even though I think that term would require a more generic interface than just allowing 3 strata, why not 2, 4 or n? You could just supply n ranges and arrange the first stratum into the first range, the second stratum into the second range, etc. places.
sorted_stratify() still doesn't seem quite right to me, [...] - wasn't stratify already sorting (in a course-grained way) - does it sort all 3 strata (and then isn't that just sort()?)
midsort_stratify() ? stratify_subsort() ?
How about using the singular stratum to take away (reduce) that confusion sort_stratum(range, segment); // Sean's function stratify(ranges) ; // Marshall's function
On 10/02/2015 05:00 PM, Gottlob Frege wrote:
I really like stratify.
I find "stratify" to be too grandiose a term. In hierarchy theory there is an important distinction between hierachical levels (or layers) and strata. Elements at one level are simply aggregations of elements from the underlying level. Elements at one stratum are completely different from elements of the underlying stratum (e.g. we humans operate quite differently in the biosphere stratum than planets in the solar system stratum.) For that reason I would prefer "layered_sort" (or "sort_layer") over "stratified_sort" (or "sort_stratum".)
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Bjorn Reese Sent: 02 October 2015 17:24 To: boost@lists.boost.org Subject: Re: [boost] Painting a bikeshed ...
On 10/02/2015 05:00 PM, Gottlob Frege wrote:
I really like stratify.
I find "stratify" to be too grandiose a term.
In hierarchy theory there is an important distinction between hierachical levels (or layers) and strata
Ok, but in Boost the words strata / stratum and stratification are never used as far as I can tell. google("site:www.boost.org stratum strata stratification") The word layer is used a lot, but for various purposes. Also a basic introduction to hierarchy theory does not include these words either: http://www.isss.org/hierarchy.htm
On Fri, Oct 2, 2015 at 4:55 AM, Rob Stewart
On October 1, 2015 10:29:57 PM EDT, Gottlob Frege
wrote: 'range' is wrong and shouldn't be used, unless somehow qualified. The things passed in are ranges, pairs of iterators are ranges, but sort_subrange is using 'range' to mean the after-the-fact range. ie the range that would be there after sorting. Yes it turns out to be the same iterators, but it's confusing.
I had begun a reply with similar complaints, but never finished it, so I'll jump on your bandwagon instead.
I, too, found the naming confusing. I find the interface really odd, too. I've never seen an algorithm that uses an iterator range that refers to a portion of an as-if-sorted range. I suppose there are real use cases or Sean wouldn't have worked on the algorithm he presented, but it's still an odd interface.
I would think sort_subrange would only sort the elements within the subrange (verb object - sort subrange) not sort things from the total range _into_ the subrange.
Right
So 'into' (or similar) might help. ie sort_into_subrange for the first one. (just as a starting idea)
That's better, but it still fails to capture the behavior well. The algorithm acts as if the entire range is sorted, providing a view into a subrange that is actually sorted. Perhaps "view" should be in the name: view_sorted_subrange?
The second could be gather_into_subrange. Leaves you wondering what criteria is used to select the gathering, but you'll check the docs instead guessing incorrectly.
I also think "gather" should be part of the name of the second algorithm.
(NOT understanding is better than MISunderstanding when it comes to names)
Indeed
I do like gather. And the association with nth element.
The problem with nth_element is that no index is given, but that algorithm is already established, so building on that name could help. However, nth_element also specifies the order of the elements to the left and right of the nth.
To be clear, the leftover elements on the left and right aren't even partitioned? ie elements on the left could be greater than some elements on the right?
If that were specified, then the mth_to_nth_element name has merit for Sean's algorithm, but Marshall's is still left without a good name. Finding similar names that vary only on the use of "sort" vs. "gather", for example, would be helpful.
After listening to the talk, IIUC, everything on the left is less than everything in the middle, is less than everything on the right. (And thanks Marshall, I didn't need sleep last night. Sean's talk, followed by Chandler's, followed by... Or I could blame youtube for automatically starting the next video. But I choose Marshall.) Tony
___ Rob
(Sent from my portable computation engine with which I am not forced to top post)
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On Fri, Oct 2, 2015 at 4:49 AM, Gottlob Frege
On Fri, Oct 2, 2015 at 4:55 AM, Rob Stewart
wrote: On October 1, 2015 10:29:57 PM EDT, Gottlob Frege < gottlobfrege@gmail.com> wrote:
To be clear, the leftover elements on the left and right aren't even partitioned? ie elements on the left could be greater than some elements on the right?
If that were specified, then the mth_to_nth_element name has merit for Sean's algorithm, but Marshall's is still left without a good name. Finding similar names that vary only on the use of "sort" vs. "gather", for example, would be helpful.
After listening to the talk, IIUC, everything on the left is less than everything in the middle, is less than everything on the right.
Yes, that is correct.
(And thanks Marshall, I didn't need sleep last night. Sean's talk, followed by Chandler's, followed by... Or I could blame youtube for automatically starting the next video. But I choose Marshall.)
:-) -- Marshall
On 2015-10-01 10:24, Marshall Clow wrote:
In Sean Parent's CppCon keynote last week, he went through the process of implementing an algorithm he called "sort_subrange", which takes four iterators describing two ranges, one a subset of the other, and sorts the subrange - as if you had sorted the entire range. (See https://www.youtube.com/watch?v=sWgDk-o-6ZE, starting at about 31:30).
I have implemented this in Boost.Algorithm (crediting Sean) - called partition_sort.
However, while I was doing this, I thought of a different algorithm; one that gathered all the elements into the subrange as if the entire range was sorted, but didn't actually do the sorting.
I've implemented that one, too - but I'm having a bit of trouble coming up with a name.
I've used "partition_subrange", but that not that clear. Sean has suggested "elements_in_subrange" and "elements_within_subrange".
Here's the declaration:
template
void partition_subrange ( Iterator first, Iterator last, Iterator sub_first, Iterator sub_last, Pred p)
Not sure if this helps with the naming, but I find the order of arguments here unintuitive. I feel that it ought to follow nth_element and have void partition_subrange ( Iterator first, Iterator sub_first, Iterator sub_last, Iterator last, Pred p) and, having done that, I immediately feel like it ought to be variadic and support arbitrarily many points to fix in sorted order void partition_subrange ( Iterator first, Iterator... nth, Iterator last, Pred p) or even void partition_subrange ( Iterator first, Range<Iterator> nth, Iterator last, Pred p) This is actually a thing I have wanted to do, for example when one wishes to distribute data amongst several processors such that each gets a contiguous (in sorted order) chunk of the data. In the past I have simply sorted, but this would be better (especially when the number of processors is small), because you could partition the data in this way, then distribute it (and then sort each subrange in parallel if you needed it sorted). If you did chose to go down one of these routes then I guess a name like nth_elements or, to emphasize the plurality, multiple_nth_elements, might work. Or block_sort, or something along that line. (Of course, for this algorithm to be efficient you need to first partition on the point closest to the centre of the range, which makes it a little trickier to implement) John Bytheway
-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of John Bytheway Sent: 03 October 2015 12:40 To: boost@lists.boost.org Subject: Re: [boost] Painting a bikeshed ...
On 2015-10-01 10:24, Marshall Clow wrote:
In Sean Parent's CppCon keynote last week, he went through the process of implementing an algorithm he called "sort_subrange", which takes four iterators describing two ranges, one a subset of the other, and sorts the subrange - as if you had sorted the entire range. (See https://www.youtube.com/watch?v=sWgDk-o-6ZE, starting at about 31:30).
I have implemented this in Boost.Algorithm (crediting Sean) - called partition_sort.
However, while I was doing this, I thought of a different algorithm; one that gathered all the elements into the subrange as if the entire range was sorted, but didn't actually do the sorting.
[...]
and, having done that, I immediately feel like it ought to be variadic and support arbitrarily many points to fix in sorted order
void partition_subrange ( Iterator first, Iterator... nth, Iterator last, Pred p)
or even
void partition_subrange ( Iterator first, Range<Iterator> nth, Iterator last, Pred p)
+1 You can loosen the requirements by not requiring all of the partitions to be subranges of a master range. void partion_ranges(Ranges... r) For instance three vectors {1, 4, 6} , {2, 5} and {3} could be partitioned to {1, 3, 2} , {5, 4} and {6}. Of course, the function does not need to have the word ranges in it. void partion (Ranges... r) And I still think stratify is more expressive...
participants (11)
-
alex
-
Bjorn Reese
-
Gottlob Frege
-
Joaquín M López Munoz
-
John Bytheway
-
Josh Juran
-
Marshall Clow
-
Rob Stewart
-
Roland Bock
-
Seth
-
Zach Laine