Can Regex callback function arguments be modified?

I am having some trouble with the Regex callback architecture. First of all, for a variable number of matches, the only way to code is with a callback function and a for_each iterator, correct?
To make it more interesting, I have a callback calling another callback. This could obviously go deeper and deeper. Due to the inflexibility (perhaps only perceived by me?) of the callback architecture, I am forced to use a lot of global variables. To be fair, this is highly manageable within my current application, but I am concerned that this "implicit number and types of parameters" restriction may become a stumbling block in future code.
At this stage, I am trying to determine which library I should base my program(s) on (from simplest to most complex):
- Regex
- eXpressive
- Spirit
So far, I prefer Regex because it is much more stable than Spirit. The latter seems to have great potential, but when I find that I am unable to even build the test examples properly, and the specs vary from one version to the next, I conclude that it is best to allow the good people of Spirit ("spirited people"? :-) time to deliver a more polished product.
Don't know anything about eXpressive yet.
Regards,
-Ramon

I am having some trouble with the Regex callback architecture. First of all, for a variable number of matches, the only way to code is with a callback function and a for_each iterator, correct?
What callback architecture? Regex is iterator based: see regex_iterator and regex_token_iterator. Of course you could pass these iterators to std::for_each along with a callback function, but there's really no need to do that. What is it you're trying to do? John.
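As a rough illustration of the iterator-based style John describes, the sketch below walks a token iterator in an ordinary loop, with no callback involved. It is only a sketch under stated assumptions: it uses C++11 `std::regex` (whose `std::sregex_token_iterator` mirrors the Boost.Regex `regex_token_iterator` interface) so that it has no external dependency, and the function name `section_ids` and the header pattern are hypothetical, not taken from the thread.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the text between every pair of square
// brackets, e.g. the "Unique ID" of each section header.
// Assumption: std::regex stands in for boost::regex here.
std::vector<std::string> section_ids(const std::string& text)
{
    // Capture group 1 is the (non-empty) text between '[' and ']'.
    static const std::regex header("\\[(.+?)\\]");

    std::vector<std::string> ids;
    // The trailing 1 asks the token iterator to yield capture group 1 only.
    std::sregex_token_iterator it(text.begin(), text.end(), header, 1);
    const std::sregex_token_iterator end;  // default-constructed = end of sequence
    for (; it != end; ++it)
        ids.push_back(it->str());
    return ids;
}
```

Because the loop body is ordinary code, any local variable in the enclosing function is directly available to it, which is one way to avoid globals.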

Hi John,
Thanks for your kind reply. My code is based on this program of yours:
http://patriot.net/~ramon/regex_iterator_example.cpp.txt
My data is structured like this:
[Unique ID1]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value
[Unique ID2]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value
All I need to do is to extract the different parts, to be saved in a map. In my first callback, I divide every section into (a) Unique ID, and (b) rest of the (multiline) block. Once I have the latter, I make a call to a second callback to break the LHS and the RHS of each assignment expression.
So you say that the callback is really not required?
Thanks!
-Ramon

So you say that the callback is really not required?
Hi Ramon,
The STL for_each [1] accepts a "unary function object" [2], which offers more flexibility than a simple non-member function. For example (off the top of my head) [3]:
    Class myClass;
    for_each( begin, end, boost::bind( &Class::function, &myClass, _1 ) );
I would also point out that, if you wanted to, you could replace for_each with your own for loop. (Not that you need to.)
1. http://www.sgi.com/tech/stl/for_each.html
2. http://www.sgi.com/tech/stl/functors.html
3. http://www.boost.org/doc/libs/1_40_0/libs/bind/bind.html#with_algorithms
HTH,
Kevin
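To make the function-object idea concrete, here is a minimal sketch in which the state that would otherwise live in global variables is held by an object, and a bound member function is passed to for_each. The assumptions are loud ones: it uses C++11 `std::bind` and `std::regex` as stand-ins for `boost::bind` and Boost.Regex, and the names `Collector`, `on_match`, and `collect_params`, as well as the assignment pattern, are hypothetical.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <regex>
#include <string>

typedef std::map<std::string, std::string> ParamMap;

// Hypothetical collector: the results live in the object, not in globals.
class Collector {
public:
    // Called once per match; capture 1 is the name, capture 2 the value.
    void on_match(const std::smatch& m) { params_[m.str(1)] = m.str(2); }
    const ParamMap& params() const { return params_; }
private:
    ParamMap params_;
};

// Assumption: std::regex and std::bind stand in for their Boost counterparts.
ParamMap collect_params(const std::string& block)
{
    // Simple "name = value" pattern where both sides are single words.
    static const std::regex assignment("(\\w+)\\s*=\\s*(\\w+)");

    Collector collector;
    // Passing &collector means for_each updates this object, not a copy of it.
    std::for_each(std::sregex_iterator(block.begin(), block.end(), assignment),
                  std::sregex_iterator(),
                  std::bind(&Collector::on_match, &collector,
                            std::placeholders::_1));
    return collector.params();
}
```

Further arguments can be bound in the same way as `&collector`, so the "callback" is not limited to a fixed number or type of parameters.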

Thanks so much, Kevin and John, for your patience with this self-proclaimed member of the "newbie ignoramus" species :^D
I am very glad I decided to immerse myself in this daunting language, C++, which is most likely the most complex programming language ever devised. I had been avoiding it for years. The availability of Boost has been a main factor in my decision to program in C++ from now on.
Best regards,
-Ramon

Yes I agree on both counts: C++ is complex, and also that Boost is a great way to exploit its best features! Much of C++'s complexity comes from its legacy. There's a lot to learn but eventually you learn to choose among the many ways to accomplish a task. Cheers, Kevin

Ramon F Herrera <ramon <at> patriot.net> writes:
Variable Name = Variable Value
All I need to do is to extract the different parts, to be saved in a map. In my first callback, I divide every section into (a) Unique ID, and (b) rest of the (multiline) block. Once I have the latter, I make a call to a second callback to break the LHS and the RHS of each assignment expression.
Definitely take a look at Boost.Xpressive and its semantic actions. Its introductory example for semantic actions seems very similar to half your problem and the rest you can probably figure out quite easily: http://tinyurl.com/kl6dq9 HTH, -Ryan

My data is structured like this:
[Unique ID1]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value
[Unique ID2]
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value
Variable Name = Variable Value
All I need to do is to extract the different parts, to be saved in a map. In my first callback, I divide every section into (a) Unique ID, and (b) rest of the (multiline) block. Once I have the latter, I make a call to a second callback to break the LHS and the RHS of each assignment expression.
So you say that the callback is really not required?
How about nested for-loops (caution untested code!!):
    std::string text = get_text_from_file();
    regex block_extractor("\\[(\\w+)\\]([^\\[]+)");
    regex variable_extractor("(\\w+)\\s*=\\s*(\\w+)");
    for(regex_iterator<std::string::const_iterator> i = make_regex_iterator(text, block_extractor);
        i != regex_iterator<std::string::const_iterator>(); ++i)
    {
        std::string block_name = i->str(1);
        std::string block_contents = i->str(2);
        std::map<std::string, std::string> param_list;
        for(regex_iterator<std::string::const_iterator> j = make_regex_iterator(block_contents, variable_extractor);
            j != regex_iterator<std::string::const_iterator>(); ++j)
        {
            param_list[j->str(1)] = j->str(2);
        }
    }
Obviously the regexes used may vary depending upon the exact data format, but hopefully this should give you the general idea?
John.
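The nested-loop idea above can be sketched as a complete function that keeps the result of each block instead of discarding it. This is an illustration under assumptions, not John's code: it swaps in C++11 `std::regex` for Boost.Regex so it has no external dependency, the names `parse_sections`, `ParamMap`, and `SectionMap` are hypothetical, and the patterns are adjusted guesses that allow spaces inside names and values but assume no value contains a '[' character.

```cpp
#include <map>
#include <regex>
#include <string>

typedef std::map<std::string, std::string> ParamMap;
typedef std::map<std::string, ParamMap> SectionMap;

// Hypothetical parser: maps each "[Unique ID]" to its name/value pairs.
// Assumption: std::regex stands in for boost::regex here.
SectionMap parse_sections(const std::string& text)
{
    // Group 1: the ID between brackets. Group 2: everything up to the next '['.
    static const std::regex block_re("\\[([^\\]]+)\\]([^\\[]*)");
    // Group 1: the name (may contain spaces). Group 2: the rest of the line.
    static const std::regex var_re(
        "([^=\\s][^=\\n]*?)[ \\t]*=[ \\t]*([^\\n]*)");

    SectionMap sections;
    const std::sregex_iterator end;  // default-constructed = end of sequence
    for (std::sregex_iterator i(text.begin(), text.end(), block_re);
         i != end; ++i)
    {
        const std::string id = i->str(1);
        // Copy the body so the inner iterator refers to a string that
        // stays alive for the whole inner loop.
        const std::string body = i->str(2);
        ParamMap& params = sections[id];
        for (std::sregex_iterator j(body.begin(), body.end(), var_re);
             j != end; ++j)
        {
            params[j->str(1)] = j->str(2);
        }
    }
    return sections;
}
```

A call such as `parse_sections(text)` returns the whole structure by value, so neither loop needs a global variable; trailing whitespace on a value is kept as-is in this sketch.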

Great! This certainly helps.
Thanks a lot, John.
-Ramon

On Sun, Sep 20, 2009 at 8:53 AM, Ramon F Herrera <ramon@patriot.net> wrote:
Great! This certainly helps.
Thanks a lot, John.
-Ramon
John Maddock wrote:
My data is structured like this:
[Unique ID1] Variable Name = Variable Value Variable Name = Variable Value Variable Name = Variable Value
[Unique ID2] Variable Name = Variable Value Variable Name = Variable Value Variable Name = Variable Value Variable Name = Variable Value
All I need to do is to extract the different parts, to be saved in a map. In my first callback, I divide every section into (a) Unique ID, and (b) rest of the (multiline) block. Once I have the latter, I make a call to a second callback to break the LHS and the RHS of each assignment expression.
As I stated elsewhere, Boost.Spirit2.1 could still do it all in one quick step, quite literally one line of code (although I might spread the grammar over two lines to make it more readable), with no for loops or anything else needed, and it would do it a great deal faster.
participants (5)
- John Maddock
- Kevin Kassil
- OvermindDL1
- Ramon F Herrera
- Ryan Gallagher