Partial Sort and Partition with Ranges [was: Boostcon Keynote available]

2009/8/3 Marshall Clow <mclow.lists@gmail.com>:
Thank you, everyone, for your patience. More videos from Boostcon will be going up there in the near future.
Thanks, Marshall, I'd been looking forward to that. I see that the talk presents a two-range form of partialSort, which I don't see in D's standard library[1]. Does anyone know the history there? On a related note, how would I, using ranges, sort only the first half of a partition? It appears I can only get a range for the second half[2]. In code, I'm looking for the equivalent of sort(b, partition(b, e, pred)); [1] http://www.digitalmars.com/d/2.0/phobos/std_algorithm.html#partialSort [2] http://www.digitalmars.com/d/2.0/phobos/std_algorithm.html#partition

Scott McMurray wrote:
2009/8/3 Marshall Clow <mclow.lists@gmail.com>:
Thank you, everyone, for your patience. More videos from Boostcon will be going up there in the near future.
Thanks, Marshall, I'd been looking forward to that.
I see that the talk presents a two-range form of partialSort, which I don't see in D's standard library[1]. Does anyone know the history there?
It's just a version un-synchronization. I have partialSort in my tree but haven't committed it yet. I expect to do that soon.
On a related note, how would I, using ranges, sort only the first half of a partition? It appears I can only get a range for the second half[2]. In code, I'm looking for the equivalent of sort(b, partition(b, e, pred));
I made partial sorting available for random-access ranges (just like STL). For those it's easy to select the first half (e.g. in D by using r[0 .. r.length / 2]). Andrei P.S. Unrelated: I found this interview with Stepanov about his new book at http://www.informit.com/articles/article.aspx?p=1383185

2009/8/5 Andrei Alexandrescu <andrei@metalanguage.com>:
Scott McMurray wrote:
On a related note, how would I, using ranges, sort only the first half of a partition? It appears I can only get a range for the second half[2]. In code, I'm looking for the equivalent of sort(b, partition(b, e, pred));
I made partial sorting available for random-access ranges (just like STL). For those it's easy to select the first half (e.g. in D by using r[0 .. r.length / 2]).
Sorry, that was unclear of me. When I said "half" I meant "the contiguous subsequence at the from of the original range in which the predicate is true". It looks like I have to do this (guessing at D syntax): falsepart = partition(r, pred); sort(r[0 .. r.size()-falsepart.size()]); Is there a more elegant way to do that? I don't have confidence that I haven't made an off-by-one error. Also, I presume partition works on bidirectional ranges for which I cannot use [] to get the range where the predicate is true. What if I wanted to run a bottom_up_merge_sort function on the "predicate is true" subrange of my list, for example? In short, it feels like partition should be returning a pair of ranges. (Though I haven't thought all the way through sentinel-terminated ranges and such. But I suppose they aren't bidirectional in the first place.)

Scott McMurray wrote:
2009/8/5 Andrei Alexandrescu <andrei@metalanguage.com>:
Scott McMurray wrote:
On a related note, how would I, using ranges, sort only the first half of a partition? It appears I can only get a range for the second half[2]. In code, I'm looking for the equivalent of sort(b, partition(b, e, pred)); I made partial sorting available for random-access ranges (just like STL). For those it's easy to select the first half (e.g. in D by using r[0 .. r.length / 2]).
Sorry, that was unclear of me. When I said "half" I meant "the contiguous subsequence at the from of the original range in which the predicate is true". It looks like I have to do this (guessing at D syntax):
falsepart = partition(r, pred); sort(r[0 .. r.size()-falsepart.size()]);
Is there a more elegant way to do that? I don't have confidence that I haven't made an off-by-one error.
Not that I know of.
Also, I presume partition works on bidirectional ranges for which I cannot use [] to get the range where the predicate is true.
partition requires bidirectional ranges but if you pass a random-access range it gives you back a portion of it, which is also a random-access range.
What if I wanted to run a bottom_up_merge_sort function on the "predicate is true" subrange of my list, for example?
In short, it feels like partition should be returning a pair of ranges.
(Though I haven't thought all the way through sentinel-terminated ranges and such. But I suppose they aren't bidirectional in the first place.)
Interesting point. Partition actually works on forward ranges if no stability is required. There are different algorithms that work best depending on stability requirements. If you look at the implementation of std.partition, you'll see that I use different algorithms for stable vs. semistable vs. unstable partition. In particular, I don't know of efficient stable partition for bidirectional ranges. I could return two ranges *if* the range is bidirectional and the partition is not stable; that would complicate the interface a bit by spilling subtle realities about the algorithm into the client. Note that if you know the length of the range, you can always use take(n, range) to restrict a range to its first n elements. This, I think, takes care of all situations of interest (if you have a range that you don't know the length of *and* want to do bottom_up_merge_sort on it, I guess that won't work). Andrei

Andrei Alexandrescu wrote:
partition requires bidirectional ranges but if you pass a random-access range it gives you back a portion of it, which is also a random-access range.
s/bidirectional/forward/ Andrei
participants (2)
-
Andrei Alexandrescu
-
Scott McMurray