
Michael Goldshteyn wrote:
OK, so I want to use the sregex_token_iterator functionality to split a data string. The data string contains:
/a/b//c/
The delimiter is the forward slash and I do want empty strings. I expect to get:
{}{a}{b}{}{c}{}
What I actually get is:
{}{a}{b}{}{c}
The empty string after {c}, which I expect because the data string ended in a forward slash, is missing. What do I have to do to get the empty string after {c} if the data string ends in a forward slash?
<snip>

This is by design. It behaves the same as Boost.Regex and perl's split() function. Try running this perl code:

    $str = '/a/b//c/';
    @rg = split(/\//, $str);
    foreach(@rg) { printf("{%s}", $_); }

It prints:

    {}{a}{b}{}{c}

I'm not 100% sure I understand this behavior myself, but the C++0x standard is very clear about this case. 28.12.2.4/5-6 about regex_token_iterator::operator++ says:
Otherwise, if any of the values stored in subs is equal to -1 and prev->suffix().length() is not 0 the operator sets *this to a suffix iterator that points to the range [prev->suffix().first, prev->suffix().second). Otherwise, sets *this to an end-of-sequence iterator.
In your case, subs[0] is -1 and prev->suffix().length() is 0 after matching the trailing '/', so *this becomes the end-of-sequence iterator and we're done. I don't myself remember the rationale for requiring the suffix to be non-empty. Perhaps it is for parity with perl.

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com
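
For anyone who needs the trailing empty field, here is a rough sketch of one possible workaround. It is written against std::regex rather than xpressive, on the assumption that the standard library's token iterator follows the wording quoted above. The helper names (split_with_token_iterator, split_keep_trailing, format_tokens) are made up for illustration and are not part of any library. The second function walks the delimiter matches by hand and always emits whatever follows the last match, even when that remainder is empty. It assumes the delimiter regex never matches an empty string.

```cpp
#include <regex>
#include <string>
#include <vector>

// Plain token-iterator split (submatch -1 selects the text between matches).
// Per the quoted wording, an empty suffix after the last match is not emitted.
std::vector<std::string> split_with_token_iterator(const std::string& s,
                                                   const std::regex& delim) {
    std::sregex_token_iterator first(s.begin(), s.end(), delim, -1), last;
    return std::vector<std::string>(first, last);
}

// Hypothetical workaround: iterate over the delimiter matches and always
// append the remainder after the final match, even if it is empty.
// Assumes delim never matches an empty string.
std::vector<std::string> split_keep_trailing(const std::string& s,
                                             const std::regex& delim) {
    std::vector<std::string> out;
    auto pos = s.cbegin();
    for (std::sregex_iterator it(s.cbegin(), s.cend(), delim), end;
         it != end; ++it) {
        out.emplace_back(pos, (*it)[0].first);  // text before this delimiter
        pos = (*it)[0].second;                  // resume after the delimiter
    }
    out.emplace_back(pos, s.cend());            // trailing field, possibly empty
    return out;
}

// Renders tokens in the {a}{b} notation used in this thread.
std::string format_tokens(const std::vector<std::string>& tokens) {
    std::string result;
    for (const auto& t : tokens) {
        result += "{" + t + "}";
    }
    return result;
}
```

Called with the string /a/b//c/ and a regex matching a single forward slash, the first function should give the five fields reported in the question, while the second also yields the empty field after {c}. Note that the second version returns one empty field for an empty input string, which may or may not be what you want.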