Tokenizer: feature request

Hi!

I need to get non-null tokens from a std::basic_string delimited by null characters. For example, I would like the following code

    string str=string("X;Y\0\0Z\0",7);
    typedef tokenizer<boost::escaped_list_separator<char> > Tok;
    escaped_list_separator<char> sep(string(),string("\0;",2),string());
    Tok tokens(str, sep);
    for(Tok::iterator i = tokens.begin();i != tokens.end();++i)
        cout << "<" << *i << ">";

to yield

    <X><Y><Z>

instead of

    <X><Y><><Z><>

An expert told me that changing the tokenizer constructors in Boost from char* to std::basic_string should do the job.

//Quote
The escaped_list_separator is designed to separate fields for CSV-like inputs, which allow empty fields. You cannot use this type of separator here. The default char_separator would do the trick, but the problem is that it does not allow the null character ('\0') as a separator because of the way the string of separators is passed to its constructor (that is, as a const Char*, a null-terminated string). But since the internal representation is a ::std::basic_string<>, I wonder why the parameter type for the constructor is not the same. That would solve the problem.
//End Quote

Thank you!

CN

On Mon, 07 Nov 2005 21:47:43 +0800, "CN" <cnliou9@fastmail.fm> wrote:
> Hi!
> I need to get non-null tokens from a std::basic_string delimited by null characters. For example, I would like the following code
>
>     string str=string("X;Y\0\0Z\0",7);
>     typedef tokenizer<boost::escaped_list_separator<char> > Tok;
>     escaped_list_separator<char> sep(string(),string("\0;",2),string());
>     Tok tokens(str, sep);
>     for(Tok::iterator i = tokens.begin();i != tokens.end();++i)
>         cout << "<" << *i << ">";
>
> to yield
>     <X><Y><Z>
> instead of
>     <X><Y><><Z><>

How about using the string_algo library instead? It handles NUL characters just fine.

-- 
Be seeing you.

Hi Thore,

> How about using the string_algo library instead? It handles NUL characters just fine.

split() is indeed neat and handy for me. Thank you very much!

CN
participants (2)
- CN
- Thore Karlsen