[string algo] again strange split behaviour

hi,
I assume the behaviour of boost::split not to return the last token has been changed (in CVS). However, a side effect seems to be that there is also one token returned for _empty_ strings, which is very questionable IMO!
summary (if '/' is the separator):
boost 1.32:
  "" -> 0 tokens
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 2 tokens
CVS:
  "" -> 1 token (!)
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 3 tokens
should be IMO:
  "" -> 0 tokens
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 3 tokens
Thoughts?
Stefan
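The three conventions in the summary can be compared side by side with a small sketch. It is written in Python rather than C++, the function names are hypothetical, and it only reproduces the token counts listed above; it makes no claim about how boost::split is implemented in either version.

```python
def split_keep_all(s, sep):
    """Keep every token, including empty ones (the counts listed for CVS)."""
    return s.split(sep)


def split_drop_trailing_empty(s, sep):
    """Drop one trailing empty token (the counts listed for boost 1.32)."""
    tokens = s.split(sep)
    if tokens and tokens[-1] == "":
        tokens.pop()
    return tokens


def split_empty_input_gives_nothing(s, sep):
    """Keep trailing empty tokens, but return no tokens for an empty input
    (the counts proposed under "should be IMO")."""
    return s.split(sep) if s else []


inputs = ("", "abc/abc", "abc/abc/")
for fn in (split_drop_trailing_empty, split_keep_all, split_empty_input_gives_nothing):
    # Print the number of tokens each convention yields for the three inputs.
    print(fn.__name__, [len(fn(s, "/")) for s in inputs])
```

Only the empty-string case separates the second and third conventions, which is the point under discussion in the thread.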

On Wed, 13 Jul 2005 15:12:48 +0200, Stefan Slapeta <stefan@slapeta.com> wrote:
hi,
I assume the behaviour of boost::split not to return the last token has been changed (in CVS). However, a side effect seems to be that there is also one token returned for _empty_ strings, which is very questionable IMO!
summary (if '/' is the separator):
boost 1.32:
  "" -> 0 tokens
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 2 tokens
CVS:
  "" -> 1 token (!)
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 3 tokens
should be IMO:
  "" -> 0 tokens
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 3 tokens
Thoughts?
In some ways I agree with you, but I think it may be more consistent to output an empty token for an empty string as long as other empty tokens are kept. However, I can see arguments for and against both ways.
Does anyone know what other libraries/languages do in this situation? Scripting languages typically have a split() function, but I haven't tested this particular case.
--
Be seeing you.
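One way to read the consistency argument: if empty tokens are kept everywhere, the number of tokens is always the number of separators plus one, and joining the tokens restores the input. The sketch below checks both properties using Python's str.split as a stand-in; the helper name is hypothetical and nothing here is taken from the string_algo sources.

```python
def check_split_invariants(s, sep):
    """Split s on sep and verify the two properties that hold when
    every empty token is kept, including the one for an empty input."""
    tokens = s.split(sep)
    # One more token than there are separators, even for "".
    assert len(tokens) == s.count(sep) + 1
    # Joining the tokens with the separator gives back the input.
    assert sep.join(tokens) == s
    return tokens


for s in ("", "abc/abc", "abc/abc/", "/", "//"):
    print(repr(s), check_split_invariants(s, "/"))
```

Returning zero tokens for an empty string would break the first property for that one input, which is the trade-off behind the "arguments for and against both ways".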

Thore Karlsen <sid@6581.com> writes:
should be IMO: "" -> 0 tokens "abc/abc" -> 2 tokens "abc/abc/" -> 3 tokens
Thoughts?
In some ways I agree with you, but I think it may be more consistent to output an empty token for an empty string as long as other empty tokens are kept. However, I can see arguments for and against both ways.
Does anyone know what other libraries/languages do in this situation? Scripting languages typically have a split() function, but I haven't tested this particular case.
Like John said:
python -c "print ''.split('/'), 'abc/abc'.split('/'), 'abc/abc/'.split('/')"
[''] ['abc', 'abc'] ['abc', 'abc', '']
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com
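The one-liner quoted above uses the Python 2 print statement. A rough Python 3 equivalent is sketched below; the variable names are only illustrative. It also shows the one case where str.split returns no tokens for an empty string, namely when it is called without a separator and splits on runs of whitespace instead.

```python
cases = ["", "abc/abc", "abc/abc/"]

# Split each case on an explicit separator, as in the quoted one-liner.
results = [s.split("/") for s in cases]
print(*results)

# Without a separator argument, str.split splits on runs of whitespace
# and yields no tokens at all for an empty string.
no_sep = "".split()
print(no_sep)
```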

On Wed, 13 Jul 2005 10:42:31 -0400, David Abrahams <dave@boost-consulting.com> wrote:
should be IMO: "" -> 0 tokens "abc/abc" -> 2 tokens "abc/abc/" -> 3 tokens
Thoughts?
In some ways I agree with you, but I think it may be more consistent to output an empty token for an empty string as long as other empty tokens are kept. However, I can see arguments for and against both ways.
Does anyone know what other libraries/languages do in this situation? Scripting languages typically have a split() function, but I haven't tested this particular case.
Like John said:
python -c "print ''.split('/'), 'abc/abc'.split('/'), 'abc/abc/'.split('/')"
[''] ['abc', 'abc'] ['abc', 'abc', '']
I didn't get John's message yet (I'm reading through the gmane newsgroup interface, and it hasn't appeared there yet), but Python does exactly what I would expect string_algo to do.
--
Be seeing you.

On Wed, Jul 13, 2005 at 10:12:21PM +0200, Stefan Slapeta wrote:
David Abrahams wrote:
Like John said:
python -c "print ''.split('/'), 'abc/abc'.split('/'), 'abc/abc/'.split('/')"
[''] ['abc', 'abc'] ['abc', 'abc', '']
ok, forget about it then. I've just found out that other libraries do the same thing...
Thanks, one less problem to solve. Regards, Pavol

Hi,
On Wed, Jul 13, 2005 at 03:12:48PM +0200, Stefan Slapeta wrote:
hi,
I assume the behaviour of boost::split not to return the last token has been changed (in CVS). However, a side effect seems to be that there is also one token returned for _empty_ strings, which is very questionable IMO!
summary (if '/' is the separator):
boost 1.32:
  "" -> 0 tokens
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 2 tokens
CVS:
  "" -> 1 token (!)
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 3 tokens
should be IMO:
  "" -> 0 tokens
  "abc/abc" -> 2 tokens
  "abc/abc/" -> 3 tokens
Hmm, looking through the code, I see you are right. I have not considered this case. I will see what can be done to fix it. Thanks, Pavol.
participants (4)
- David Abrahams
- Pavol Droba
- Stefan Slapeta
- Thore Karlsen