Tokenizer design question

14 Jul 2009

The Tokenizer library's char_separator has options to keep or drop delimiters and to keep or drop empty tokens. With escaped_list_separator, however, the only behavior is to keep empty tokens. While that is the obvious behavior for parsing CSV and similar files, it would be nice to be able to drop empty tokens when constructing an escaped_list_separator as well.

I have a command line parser that reads its arguments either from the command line itself or from a text file named on the command line. In the file I pass format strings for the Date Time library's I/O routines. The formats contain spaces, which I escape so that each format comes through as a single token, and Tokenizer handles that correctly. But I sometimes separate my fields with multiple tabs so the file looks tidy in a text editor, and escaped_list_separator returns an empty token for each extra tab. My solution for now is a switch in my command line parser that selects which separator to use.
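To make the request concrete, here is a rough sketch of the kind of post-filtering that could stand in for a drop-empty-tokens option. The helper name without_empty_tokens is hypothetical and is not part of Boost. It only assumes a token range that can be iterated and whose tokens have empty(), which should include boost::tokenizer with its default std::string token type.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not a Boost API): copy the non-empty tokens out of
// any iterable range of std::string-like tokens.
//
// Assumed usage with Boost.Tokenizer, shown here only as a comment:
//   boost::escaped_list_separator<char> sep("\\", " \t", "\"");
//   boost::tokenizer<boost::escaped_list_separator<char> > tok(line, sep);
//   std::vector<std::string> args = without_empty_tokens(tok);
template <typename TokenRange>
std::vector<std::string> without_empty_tokens(const TokenRange& tokens)
{
    std::vector<std::string> kept;
    for (const auto& token : tokens) {
        if (!token.empty()) {
            kept.push_back(token);  // keep only tokens with content
        }
    }
    return kept;
}
```

This is only a workaround, since the empty tokens are still produced and then thrown away. An empty-token policy on escaped_list_separator itself, like the one char_separator already has, would avoid the extra pass.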

thanks,
matthew

Polder, Matthew J

Zachary Turner
