Can I use Boost.Regex with multiline text to be recognized? - Boost-users - lists.preview.boost.org


Ramon F Herrera

15 Sep 2009

12:12 a.m.

More accurately, my question should be:

How hard (or easy) is it to deal with multiple lines of parsed text in Regex?

Is there anything special to do? (such as defining an end of line character(s))

Perhaps the best advice is to stay away from Regex and use something more powerful (such as Xpressive)?

TIA,

-Ramon


OvermindDL1

15 Sep

12:21 a.m.

On Mon, Sep 14, 2009 at 6:12 PM, Ramon F Herrera <ramon@patriot.net> wrote:

More accurately, my question should be:

How hard (or easy) is it to deal with multiple lines of parsed text in Regex?

Is there anything special to do? (such as defining an end of line character(s))

Perhaps the best advice is to stay away from Regex and use something more powerful (such as Xpressive)?

Regex can handle multiple lines easily; it is all one text blob as far as it is concerned. What you should use depends on what you are doing, so what are you trying to parse out?
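
To illustrate the "text blob" point, here is a minimal sketch, assuming the whole file has been read into a single std::string. It uses std::regex as a stand-in so that it compiles without Boost; boost::regex offers regex_search and smatch under the same names, although its default syntax options may differ. The helper names read_all and first_bracketed_span are hypothetical, made up for this example.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical helper: read an entire stream into one string ("text blob").
std::string read_all(std::istream& in) {
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}

// Hypothetical helper: return the text inside the first [...] pair, or an
// empty string if there is none. The class [^\]] also accepts '\n', so the
// match is free to run across line boundaries inside the blob.
std::string first_bracketed_span(const std::string& blob) {
    static const std::regex re("\\[([^\\]]*)\\]");
    std::smatch m;
    if (std::regex_search(blob, m, re)) {
        return m[1].str();
    }
    return std::string();
}
```

Nothing here tells the engine where lines end; a newline is just another character in the input, which is why no end-of-line setup is needed for this kind of search.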


Ramon F Herrera

12:35 a.m.


Hi Overmind,

I am trying to parse multiple files with the structure indicated below. I sort of got started, but I hate it if I am going to hit a wall.

I guess I could start by defining a line like this:

    string variable = "([A-Za-z0-9][\\w\\h\\(\\)\\-\\.,/&]*)";
    char equal_sign = '=';
    string value = "(.+)";
    string assignment = variable + equal_sign + value;

    string line = assignment + eol;

Any tips and hints are most appreciated and welcome...

-Ramon

---------------------

[Unique ID 1]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

[Unique ID 2]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

[Unique ID 3]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value
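
The pattern-building idea above can be sketched as follows. This is a hedged variant, not Ramon's exact expression: it uses std::regex (ECMAScript syntax) so it builds without Boost, which means `\h` is replaced by an explicit space and tab, and `(.+)` is replaced by `([^\r\n]+)` so the result does not depend on whether `.` matches a newline in the engine in use. The function name find_assignments is hypothetical.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collect every "name = value" pair found in a
// multi-line blob, ignoring lines that contain no '='.
std::vector<std::pair<std::string, std::string>>
find_assignments(const std::string& blob) {
    // Name: starts with an alphanumeric, then a lazy run of allowed characters,
    // so trailing blanks before '=' are left out of the capture.
    const std::string variable = "([A-Za-z0-9][\\w \\t\\(\\)\\-\\.,/&]*?)";
    const std::string equal_sign = "[ \\t]*=[ \\t]*";
    // Value: everything up to, but not including, the end of the line.
    const std::string value = "([^\\r\\n]+)";
    // No ^ anchor is used: whether ^ matches after an embedded newline
    // depends on the engine and its flags.
    const std::regex assignment(variable + equal_sign + value);

    std::vector<std::pair<std::string, std::string>> result;
    std::sregex_iterator it(blob.begin(), blob.end(), assignment);
    const std::sregex_iterator end;
    for (; it != end; ++it) {
        result.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return result;
}
```

Because neither capture group can cross a line break, each match stays within one line even though the search runs over the whole blob.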


OvermindDL1

1:02 a.m.

On Mon, Sep 14, 2009 at 6:35 PM, Ramon F Herrera <ramon@patriot.net> wrote:

I am trying to parse multiple files with the structure indicated below. I sort of got started, but I hate it if I am going to hit a wall.

I guess I could start by defining a line like this:

    string variable = "([A-Za-z0-9][\\w\\h\\(\\)\\-\\.,/&]*)";
    char equal_sign = '=';
    string value = "(.+)";
    string assignment = variable + equal_sign + value;

    string line = assignment + eol;

Any tips and hints are most appreciated and welcome...

-Ramon

---------------------

[Unique ID 1]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

[Unique ID 2]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

[Unique ID 3]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value

You could do that using regex, but you will have to parse it into a structure yourself. You did not say, so I will assume that the things in [] are section headings (a la an ini file) and that they are required first, and that the Variable Names can be duplicated (a la an ini file) and are ordered. If so, it might be easier to use boost::spirit 2.1, as it can do the parsing and fill in your data structure all in one step, and it will run a great deal faster than regex. Something like this code would probably work:

    // Have not tested this code, writing it inside the email client itself...
    #include <boost/spirit/include/qi.hpp>
    #include <boost/fusion/include/std_pair.hpp>  // lets Spirit fill std::pair

    std::map< std::string, std::vector< std::pair<std::string,std::string> > > dataStuff;

    using namespace boost::spirit;
    using namespace boost::spirit::qi;
    using namespace boost::spirit::standard;

    // parse() takes its first iterator by reference, so it needs named variables
    // (inputstream is assumed here to be a std::string holding the file contents)
    std::string::const_iterator first = inputstream.begin();
    std::string::const_iterator last = inputstream.end();

    bool successful = parse(first, last,
        *(  '[' >> *(print-']') >> ']' >> eol
         // zero or more "name = value" lines per section
         >> *( +(print-(*space>>'=')) >> *space >> '=' >> *space >> +print >> eol )
         )
        , dataStuff);

As always, I make no guarantees about the quality of the above code when I am running on 6 hours past when I should be sleeping. You can also add a _pass semantic action to the first string match so you can absolutely ensure that each section ([]) name will be unique.
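
For comparison, the "parse it into a structure yourself" route with a regex could look roughly like the sketch below. It is only an illustration under stated assumptions: std::regex stands in for Boost.Regex so the code builds without Boost, the names Entries, Sections and parse_sections are hypothetical, and lines that are neither a header nor an assignment are silently skipped. Section-name uniqueness is checked through the return value of std::map::emplace rather than through a Spirit semantic action.

```cpp
#include <map>
#include <regex>
#include <string>
#include <utility>
#include <vector>

using Entries  = std::vector<std::pair<std::string, std::string>>;
using Sections = std::map<std::string, Entries>;

// Hypothetical helper: fill `out` from an ini-like blob. Returns false if an
// assignment appears before any [section] header or a section name repeats.
bool parse_sections(const std::string& blob, Sections& out) {
    static const std::regex token(
        "\\[([^\\]\\r\\n]*)\\]"                      // group 1: section name
        "|([A-Za-z0-9][^=\\r\\n]*?)[ \\t]*=[ \\t]*"  // group 2: variable name
        "([^\\r\\n]+)");                             // group 3: variable value

    Entries* current = nullptr;  // entries of the section being filled
    std::sregex_iterator it(blob.begin(), blob.end(), token);
    const std::sregex_iterator end;
    for (; it != end; ++it) {
        const std::smatch& m = *it;
        if (m[1].matched) {
            // emplace reports whether the key was newly inserted
            auto inserted = out.emplace(m[1].str(), Entries());
            if (!inserted.second) return false;  // duplicate section name
            current = &inserted.first->second;
        } else {
            if (current == nullptr) return false;  // assignment before header
            current->emplace_back(m[2].str(), m[3].str());
        }
    }
    return true;
}
```

Keeping the entries in a vector preserves their order and allows repeated variable names within a section, matching the assumptions stated above.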


John Maddock

18 Sep

10:30 a.m.

New subject: Can I use Boost.Regex with multiline text to be recognized?

How hard (or easy) is it to deal with multiple lines of parsed text in Regex?

Multiline support is the default behaviour.

Is there anything special to do? (such as defining an end of line character(s))

No. HTH, John.

