[string_algo] find_head/find_tail additions
Hi.
I have a feature I would like to see in the string algo library. The present find_head function takes an unsigned int N and returns the head of the input string, whose size is at most N. I would like another option that returns the head of the input with size at most original_size minus N; in other words, the whole string without its last N characters. If original_size < N, the result would be an empty string.
The interface for such functionality might introduce a new function (find_head_but()?), or make the present find_head function accept a signed int N instead of an unsigned int, with a negative number selecting the new behavior.
And of course the same for find_tail.
Thanks, Yuval
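The requested behavior can be sketched with the standard library alone. This is only an illustration of the semantics described above, not the library's implementation: the name head_but is hypothetical (loosely following the find_head_but() suggestion), and std::string_view stands in for the library's range types.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of Boost): returns the input without its
// last n characters, or an empty view when the input is shorter than n.
std::string_view head_but(std::string_view input, std::size_t n) {
    if (input.size() < n) {
        return std::string_view{};
    }
    return input.substr(0, input.size() - n);
}
```

For example, head_but("archive.tar", 4) would yield "archive", and any n larger than the input length yields an empty view.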
Hi,
This seems like a good idea. I like the interface with an int parameter. I'll add it to my todo list.
Thanks, Pavol
On Thu, Dec 01, 2005 at 11:54:38AM +0200, Yuval Ronen wrote:
Hi.
I have a feature I would like to see in the string algo library. The present find_head function takes an unsigned int N and returns the head of the input string, whose size is at most N. I would like another option that returns the head of the input with size at most original_size minus N; in other words, the whole string without its last N characters. If original_size < N, the result would be an empty string.
The interface for such functionality might introduce a new function (find_head_but()?), or make the present find_head function accept a signed int N instead of an unsigned int, with a negative number selecting the new behavior.
And of course the same for find_tail.
Thanks, Yuval
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
Pavol Droba wrote:
This seems like a good idea. I like the interface with an int parameter. I'll add it to my todo list.
First of all, thanks.
One problem with the negative numbers I just thought of: zero. A positive zero means return an empty string. A negative zero means return the whole input string. But since there's no difference between positive zero and negative zero...
On Thu, Dec 01, 2005 at 08:06:14PM +0200, Yuval Ronen wrote:
Pavol Droba wrote:
This seems like a good idea. I like the interface with an int parameter. I'll add it to my todo list.
First of all, thanks.
One problem with the negative numbers I just thought of: zero. A positive zero means return an empty string. A negative zero means return the whole input string. But since there's no difference between positive zero and negative zero...
Well, we just have to decide and document it. I think the safer approach would be to treat 0 as a positive number. If nothing else, it would not break the current definition.
Regards, Pavol
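A minimal sketch of the signed-N convention discussed here, with 0 treated as a positive number so that it keeps returning an empty result. The name head_signed is hypothetical and std::string_view is an assumption made for the sake of a self-contained example; this is not the library's code.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical signed-N variant (illustrative only, not the Boost API).
// n >= 0: at most the first n characters (0 yields an empty result,
//         matching the existing unsigned behaviour).
// n <  0: everything except the last |n| characters, or an empty view
//         when the input is shorter than |n|.
std::string_view head_signed(std::string_view input, int n) {
    if (n >= 0) {
        // substr clamps the count to the available length.
        return input.substr(0, static_cast<std::size_t>(n));
    }
    // Widen before negating so that the most negative int does not overflow.
    const auto drop = static_cast<std::size_t>(-static_cast<long long>(n));
    if (input.size() < drop) {
        return std::string_view{};
    }
    return input.substr(0, input.size() - drop);
}
```

Under this convention there is no way to ask for "the whole string" through a negative zero, which is exactly the ambiguity raised above; a caller who wants the whole input simply does not call the function.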
Pavol Droba writes:
Hi,
This seems like a good idea. I like the interface with an int parameter. I'll add it to my todo list.
Thanks, Pavol
What's wrong with erase_tail_copy?
I don't like the idea with negative numbers because I can't see how code can use the functionality. Either you want to use the positive version or the negative version, and then they can be different functions. Combining them into the same function doubles the size and might affect inlining, optimization, etc.
What about find_nth?
On Fri, Dec 02, 2005 at 07:47:16AM +0000, Martin wrote:
Pavol Droba writes:
Hi,
This seems like a good idea. I like the interface with an int parameter. I'll add it to my todo list.
Thanks, Pavol
What's wrong with erase_tail_copy?
Well, you got me. It's a shame, but I had forgotten that there is already a function for this. My response was made without considering the existence of erase_tail/head. Given this, I see no further reason to modify the behaviour of find_head.
I don't like the idea with negative numbers because I can't see how code can use the functionality. Either you want to use the positive version or the negative version, and then they can be different functions. Combining them into the same function doubles the size and might affect inlining, optimization, etc.
Actually, the use of negative numbers for indexing string ranges is quite common in the scripting world. For example, Ruby and Perl have something like this.
What about find_nth?
It would make sense to do the same here, but only if other algorithms like find_head were modified as well. As I see no reason for the former, I will not modify the latter.
Best regards, Pavol
Pavol Droba wrote:
On Fri, Dec 02, 2005 at 07:47:16AM +0000, Martin wrote:
What's wrong with erase_tail_copy?
Well, you got me. It's a shame, but I had forgotten that there is already a function for this. My response was made without considering the existence of erase_tail/head.
Given this, I see no further reason to modify the behaviour of find_head.
I took a look at erase_tail_copy and it seems it's not exactly what I want. erase_tail_copy is considered a mutating function in this library and therefore requires a SequenceT as input, rather than a RangeT (no string literals allowed). It also makes a copy of the input string when I don't think there's any need to.
In other words, those erase_xxx_copy functions are, IMO, misplaced. No wonder I couldn't find them in the first place... I think they are essentially find algorithms just like the find_head/find_tail functions, and should:
1. Be named find_something, not erase_something
2. Be placed in the find.hpp header, not erase.hpp
3. Accept RangeT (including string literals)
4. Return an iterator_range just like the find_xxx functions, without making copies (of course there could also be _copy versions, but on the other hand, do the find_xxx functions have _copy versions?).
Yuval
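The difference between a copying interface and a find-style, non-owning one can be illustrated as follows. Both functions are hypothetical stand-ins written against the standard library; they do not reproduce the actual signatures of erase_tail_copy or find_head, and std::string_view is used here only as an analogue of an iterator_range over the input.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Illustrative stand-in for a copying, sequence-based interface: takes an
// owning string and returns a new string without the last n characters.
std::string drop_tail_copy(const std::string& input, std::size_t n) {
    return n >= input.size() ? std::string{}
                             : input.substr(0, input.size() - n);
}

// Illustrative stand-in for a find-style interface: accepts any contiguous
// character range (including a string literal) and returns a non-owning
// view into the caller's storage, so nothing is copied.
std::string_view drop_tail_view(std::string_view input, std::size_t n) {
    return n >= input.size() ? std::string_view{}
                             : input.substr(0, input.size() - n);
}
```

The view version leaves the decision to copy with the caller: constructing a std::string from the returned view gives the same result as the copying version, while code that only needs to inspect the head avoids the allocation. The usual caveat applies that the view must not outlive the string it refers to.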
On Sat, Dec 03, 2005 at 01:21:43AM +0200, Yuval Ronen wrote:
Pavol Droba wrote:
On Fri, Dec 02, 2005 at 07:47:16AM +0000, Martin wrote:
What's wrong with erase_tail_copy?
Well, you got me. It's a shame, but I had forgotten that there is already a function for this. My response was made without considering the existence of erase_tail/head.
Given this, I see no further reason to modify the behaviour of find_head.
I took a look at erase_tail_copy and it seems it's not exactly what I want. erase_tail_copy is considered a mutating function in this library and therefore requires a SequenceT as input, rather than a RangeT (no string literals allowed). It also makes a copy of the input string when I don't think there's any need to.
In other words, those erase_xxx_copy functions are, IMO, misplaced. No wonder I couldn't find them in the first place... I think they are essentially find algorithms just like the find_head/find_tail functions, and should:
1. Be named find_something, not erase_something
2. Be placed in the find.hpp header, not erase.hpp
3. Accept RangeT (including string literals)
4. Return an iterator_range just like the find_xxx functions, without making copies (of course there could also be _copy versions, but on the other hand, do the find_xxx functions have _copy versions?).
erase_tail_copy is not exactly what a find_head would be, but it is quite close. The only serious difference is that it does not return an iterator_range referring to the input string, but rather a copy of it. This might be a serious problem; however, I have not yet come across a situation where it really mattered. On the other hand, my personal experience does not prove anything in general.
The definition and placement of erase_tail_copy is consistent with the other erase_xxx algorithms, so it should not be changed.
Therefore, the only reasonable way to provide the exact functionality you requested is to modify find_head (or add find_head_but). If we go this way, I'm inclined to provide the negative-index variant, due to the similarities with scripting languages.
In the end, I feel that it would do no harm to extend find_head/tail, and then probably also find_nth, with support for negative indexes. So, when I get a chance, I'll implement it.
Regards, Pavol
BTW: find_xxx functions do not need copy variants, since they are not mutating algorithms. Find simply returns a reference to the input string. It is up to you what to do with it. One possibility is to make a copy. In mutating algorithms like replace and erase, there is a fundamental difference in how the algorithm behaves in the mutable and copy versions.
I took a look at erase_tail_copy and it seems it's not exactly what I want. erase_tail_copy is considered a mutating function in this library and therefore requires a SequenceT as input, rather than a RangeT (no string literals allowed). It also makes a copy of the input string when I don't think there's any need to.
In other words, those erase_xxx_copy functions are, IMO, misplaced. No wonder I couldn't find them in the first place... I think they are essentially find algorithms just like the find_head/find_tail functions, and should:
1. Be named find_something, not erase_something
2. Be placed in the find.hpp header, not erase.hpp
3. Accept RangeT (including string literals)
4. Return an iterator_range just like the find_xxx functions, without making copies (of course there could also be _copy versions, but on the other hand, do the find_xxx functions have _copy versions?).
The definition and placement of erase_tail_copy is consistent with the other erase_xxx algorithms, so it should not be changed.
BTW: find_xxx functions do not need copy variants, since they are not mutating algorithms. Find simply returns a reference to the input string. It is up to you what to do with it. One possibility is to make a copy. In mutating algorithms like replace and erase, there is a fundamental difference in how the algorithm behaves in the mutable and copy versions.
My point was that erase_tail_copy /can/, and therefore /should/, be defined in terms of 'find' rather than in terms of 'mutate'. The 'find' notion is preferred since it's not intrusive and leaves more control in the hands of the user. For this reason, anything that can be defined in 'find' terms should be. On the other hand, if you really want to keep the mutating version of erase_tail_copy, then so be it. You won't hear any more arguments from me... ;-)
erase_tail_copy is not exactly what a find_head would be, but it is quite close. The only serious difference is that it does not return an iterator_range referring to the input string, but rather a copy of it. This might be a serious problem; however, I have not yet come across a situation where it really mattered. On the other hand, my personal experience does not prove anything in general.
Therefore, the only reasonable way to provide the exact functionality you requested is to modify find_head (or add find_head_but). If we go this way, I'm inclined to provide the negative-index variant, due to the similarities with scripting languages.
In the end, I feel that it would do no harm to extend find_head/tail, and then probably also find_nth, with support for negative indexes. So, when I get a chance, I'll implement it.
I have to say that I'm quite convinced that adding a new function is somewhat better than the negative-numbers approach. It's true that the negative technique has some history in scripting languages, but I think we can do better in C++. But on this matter, as on the previous one, if you decide on the negative-number approach, I'll argue no more.
And btw, whichever method is chosen for find_head/find_tail, it should probably also be applied to erase_head/erase_tail[_copy]. These functions should also deal with 'backward' (negative) indices.
Thanks again, Yuval
participants (3): Martin, Pavol Droba, Yuval Ronen