
Thank you, Yechezkel.

I have indeed tried Boost.Regex, though on a slightly different path: I was using boost::regex_search instead. One drawback of the regex approach, IMO, is that the code feels a little rigid and inflexible to me. It is not quite what I feel it should be, even though I cannot say exactly in what respect.

Thanks for suggesting yet another regex approach. I'm trying to rewrite your regex

"([^"]*)"|(?:^|[[:space:],])+([^[:space:],]+)(?:$|[[:space:],])+

in a form that I'm more familiar with:

"([^"]*)"|(?:^|[\s,])+([^\s,]+)(?:$|[\s,])+

But I still cannot understand it, even after reading through
http://www.boost.org/doc/libs/1_45_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html

The parts I could not interpret are

^|[\s,]

and

$|[\s,]

:-( (The smiley is not part of the regex; it is my own expression.)

Thanks.
Max
-----Original Message-----
From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Yechezkel Mett
Sent: Wednesday, February 09, 2011 7:42 PM
To: boost@lists.boost.org
Subject: Re: [boost] [Tokenizer]Usage and documentation
On Tue, Feb 8, 2011 at 3:13 PM, Max <more4less@sina.com> wrote:
I'm using boost::tokenizer to do some simple parsing of a data file whose format is specified by the following rules:
- One record of several fields per line
- Adjacent fields in a record are separated by whitespace characters (space or tab), with or without a ","
- A string without spaces may appear with or without quotation marks
- A string with spaces must appear in quotation marks
One example of a 4-field-per-record file is like:
"string 2" 3 4 5 4.3
"String", 2, 3.04 4 3
AnyOtherText, 2, 3.04 4 3
I normally use boost.regex's regex_token_iterator for this sort of task. Try the following regex:
"([^"]*)"|(?:^|[[:space:],])+([^[:space:],]+)(?:$|[[:space:],])+
and tell regex_token_iterator to extract matches 1 and 2.
The above regex has a couple of quirks: "a""b" will be taken as two fields, "a" and "b". a,,b will be taken as two fields, not three.
To read the file line by line, simply use std::getline.
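A minimal sketch of the regex_token_iterator approach combined with std::getline. Several things here are assumptions rather than part of the original suggestion. It is written against std::regex, whose token iterator mirrors the Boost one, so that it compiles without Boost; the boost:: equivalents should drop in. It also uses a simplified pattern instead of the one above, because the trailing separator group there appears to consume the delimiter that the next unquoted field needs as its leading separator, which can cause every other unquoted field to be skipped. The names split_record and read_records are hypothetical.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split one record (one line) into its fields.
std::vector<std::string> split_record(const std::string& line) {
    // Sub-match 1: contents of a quoted field (quotes stripped).
    // Sub-match 2: an unquoted run without whitespace, commas, or quotes.
    static const std::regex field_re(R"re("([^"]*)"|([^[:space:],"]+))re");

    std::vector<std::string> fields;
    // Ask the token iterator for sub-matches 1 and 2 of every match.
    std::sregex_token_iterator it(line.begin(), line.end(), field_re, {1, 2});
    std::sregex_token_iterator end;
    for (; it != end; ++it) {
        // Each match yields two tokens; only one alternative participated.
        if (it->matched) {
            fields.push_back(it->str());
        }
    }
    return fields;
}

// Hypothetical helper: read all records from a stream, one per line.
std::vector<std::vector<std::string>> read_records(std::istream& in) {
    std::vector<std::vector<std::string>> records;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            records.push_back(split_record(line));
        }
    }
    return records;
}
```

Because two sub-matches are requested, the iterator produces two tokens per match, one of which did not participate; checking the matched flag drops it while still keeping a genuinely empty quoted field such as "". The quirks noted above carry over: "a""b" gives two fields, and a,,b gives two fields, not three.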
Yechezkel Mett