Re: [Boost-users] Tokenizer Question

12 Jun 2005

On Sat, 11 Jun 2005 12:52:56 -0500, "Tom Browder" <tbrowder@cox.net> wrote:
...
> I have used my own C++ tokenizer in the past, but I would like to use
> Boost's instead.
>
> The predominant use of tokenizing for me is to split on white space, but
> Boost's default is to use white space AND punctuation.  Is there any
> possibility to have either the default changed, or another TokenizerFunction
> added such as ws_separator, or something similar?
>
> I know I can use
>
>   boost::char_separator<char> sep(" \n\t");
>
> (but do I need to add "\v" to the char set?)
>
> but I would rather have something like
>
>   boost::ws_separator sep;
>
> and, better, make the ws_separator be the default TokenizerFunction for
> tokenizer.

Have you looked at the string_algo library? I much prefer its split
functionality to the tokenizer library, and what you want here is very
easy to accomplish with it.

Example:

  vector<string> v;
  split(v, "split me into tokens", is_space(), token_compress_on);

You should really check this library out. It's got a ton of useful
stuff.

-- 
Be seeing you.

Thore Karlsen