
On Tue, 02 Feb 2010 17:12:35 +0800, Joel de Guzman <joel@boost-consulting.com> wrote:
> I'm not sure what you mean by "the straightforward way implies creating a string". String for what? The attribute? The input? Spirit does not allocate any memory. Also, AFAIK, you can avoid using strings. Perhaps be more specific?
Let's say I have an input of const char *. My input consists of a command with up to two parameters. I used Spirit to parse this command and decompose it into chunks (which means that the result of the parsing will be one enum and two "strings"). Currently it's a simple loop which returns the pointers to the chunks. Very fast (obviously).
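To make the shape of that concrete, here is a rough sketch of this kind of pointer-returning parser. It is not my actual code: the names Command, Chunk, Parsed, starts_with, find_last and parse_command are made up for illustration, and I am assuming a format of the form command1<first>separator<second> or command2<first>, where the last occurrence of the separator delimits the second parameter. The chunks are just pairs of pointers into the input, so nothing is allocated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical command set; the names are made up for illustration.
enum class Command { Unknown, Command1, Command2 };

// A chunk is a half-open range [begin, end) into the caller's buffer.
struct Chunk {
    const char* begin;
    const char* end;
    std::size_t size() const { return static_cast<std::size_t>(end - begin); }
};

struct Parsed {
    Command command = Command::Unknown;
    Chunk first = {nullptr, nullptr};
    Chunk second = {nullptr, nullptr};
};

// True if [p, end) starts with the NUL-terminated literal lit.
inline bool starts_with(const char* p, const char* end, const char* lit) {
    const std::size_t n = std::strlen(lit);
    return static_cast<std::size_t>(end - p) >= n && std::memcmp(p, lit, n) == 0;
}

// Last occurrence of sep in [begin, end), scanning from the right.
// Returns end when sep does not occur.
inline const char* find_last(const char* begin, const char* end, const char* sep) {
    const std::size_t n = std::strlen(sep);
    if (n == 0 || static_cast<std::size_t>(end - begin) < n) return end;
    for (const char* p = end - n;; --p) {
        if (std::memcmp(p, sep, n) == 0) return p;
        if (p == begin) return end;
    }
}

// Splits the input into a command and up to two chunks without copying.
// Anything that does not match the assumed format yields Command::Unknown.
inline Parsed parse_command(const char* input) {
    assert(input != nullptr);
    const char* const end = input + std::strlen(input);
    Parsed out;
    if (starts_with(input, end, "command1")) {
        const char* const body = input + std::strlen("command1");
        const char* const sep = find_last(body, end, "separator");
        if (sep != end) {
            out.command = Command::Command1;
            out.first = Chunk{body, sep};
            out.second = Chunk{sep + std::strlen("separator"), end};
        }
    } else if (starts_with(input, end, "command2")) {
        out.command = Command::Command2;
        out.first = Chunk{input + std::strlen("command2"), end};
    }
    return out;
}
```

With this sketch, parse_command("command1aseparatorbseparatorc") gives a first chunk covering "aseparatorb" and a second chunk covering "c", because the split happens at the rightmost separator.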
From what I understood of Spirit, I wrote a parser which created a string when it found my chunk. Off the top of my head, it might have looked like this:

    lit("command1") >> +char_[ref(my_str) = _1] >> lit("separator") >> +char_[ref(my_str2) = _1]
  | lit("command2") >> +alnum[ref(my_str) = _1]

This is really cool and much easier to understand than the current loop. Currently the memory allocation occurs when putting the input into the string. I now realize I can replace ref(my_str) = _1 with something that builds a pair of pointers into the input, which should reduce the gap between Spirit and the custom parser.

But then I encountered a different problem: the 'grammar' cannot be read from left to right. The first parameter may itself contain the separator, so what I currently do is read my command, then scan from the right; when I reach the separator I have my second token, and the rest is the first token. That can be worked around as well by handling this part outside Spirit.

Then again, I want to stress that I'm new to Spirit and just wanted to give it a shot and see if I could quickly leverage it. My current conclusion is that for my specific problem MSM might be better suited, but I had only allocated one day for this mini-benchmark, and all I can say is that it will take more than one day to tell, a day I don't have.

-Edouard