[spirit] scan_keyword: scan an InputIterator for a keyword.

Hi, for chrono input I'm using a scan_keyword function that adapted from libc++ library (you can find the code in[1]). Next follows the interface: // scan_keyword // Scans [b, e) until a match is found in the basic_strings range // [kb, ke) or until it can be shown that there is no match in [kb, ke). // b will be incremented (visibly), consuming CharT until a match is found // or proved to not exist. A keyword may be "", in which will match anything. // If one keyword is a prefix of another, and the next CharT in the input // might match another keyword, the algorithm will attempt to find the longest // matching keyword. If the longer matching keyword ends up not matching, then // no keyword match is found. If no keyword match is found, ke is returned // and failbit is set in err. // Else an iterator pointing to the matching keyword is found. If more than // one keyword matches, an iterator to the first matching keyword is returned. // If on exit b == e, eofbit is set in err. // Examples: // Keywords: "a", "abb" // If the input is "a", the first keyword matches and eofbit is set. // If the input is "abc", no match is found and "ab" are consumed. template <class InputIterator, class ForwardIterator> ForwardIterator scan_keyword(InputIterator& b, InputIterator e, ForwardIterator kb, ForwardIterator ke, std::ios_base::iostate& err ); What is the better way to do that (or something similar) with Spirit/Lexer? Can we hope that the Spirit solution could perform much better? What about if the strings to parse are know at compile-time? [1] https://svn.boost.org/svn/boost/trunk/boost/chrono/detail/scan_keyword.hpp Best, Vicente

On 11/29/2011 5:45 AM, Vicente J. Botet Escriba wrote:
Hi,
for chrono input I'm using a scan_keyword function that adapted from libc++ library (you can find the code in[1]). Next follows the interface:
[snip]
What is the better way to do that (or something similar) with Spirit/Lexer? Can we hope that the Spirit solution could perform much better? What about if the strings to parse are know at compile-time?
[1] https://svn.boost.org/svn/boost/trunk/boost/chrono/detail/scan_keyword.hpp
The symbol parser seems to be your friend here. http://tinyurl.com/7vue9nu The search is O(log n+k) using a TST. Spirit parses at runtime, not compile time. I've seen a compile time parser for compile-time strings posted somewhere in this list, but I don't recall exactly. Regards, -- Joel de Guzman http://www.boostpro.com http://boost-spirit.com

Le 29/11/11 00:47, Joel de Guzman a écrit :
On 11/29/2011 5:45 AM, Vicente J. Botet Escriba wrote:
Hi,
for chrono input I'm using a scan_keyword function that adapted from libc++ library (you can find the code in[1]). Next follows the interface:
[snip]
What is the better way to do that (or something similar) with Spirit/Lexer? Can we hope that the Spirit solution could perform much better? What about if the strings to parse are know at compile-time?
[1] https://svn.boost.org/svn/boost/trunk/boost/chrono/detail/scan_keyword.hpp The symbol parser seems to be your friend here. http://tinyurl.com/7vue9nu The search is O(log n+k) using a TST. Hi,
thanks for the pointer. This seems quite simple, I will try it and compare the performances. Does the parser try to scan the largest symbol? Shouldn't Lexer be more adequate/efficient to this use case? Sorry, what is a TST?
Spirit parses at runtime, not compile time. I've seen a compile time parser for compile-time strings posted somewhere in this list, but I don't recall exactly. Thanks anyway, Vicente

On 11/30/2011 6:23 AM, Vicente J. Botet Escriba wrote:
Le 29/11/11 00:47, Joel de Guzman a écrit :
On 11/29/2011 5:45 AM, Vicente J. Botet Escriba wrote:
Hi,
for chrono input I'm using a scan_keyword function that adapted from libc++ library (you can find the code in[1]). Next follows the interface:
[snip]
What is the better way to do that (or something similar) with Spirit/Lexer? Can we hope that the Spirit solution could perform much better? What about if the strings to parse are know at compile-time?
[1] https://svn.boost.org/svn/boost/trunk/boost/chrono/detail/scan_keyword.hpp The symbol parser seems to be your friend here. http://tinyurl.com/7vue9nu The search is O(log n+k) using a TST. Hi,
thanks for the pointer. This seems quite simple, I will try it and compare the performances. Does the parser try to scan the largest symbol?
Yes, it parses as much as it can.
Shouldn't Lexer be more adequate/efficient to this use case?
Yes, you can also use Spirit Lex.
Sorry, what is a TST?
Ternary Search Trees are faster than hashing for many typical search problems especially when the search interface is iterator based. Searching for a string of length k in a ternary search tree with n strings will require at most O(log n+k) character comparisons. TSTs are many times faster than hash tables for unsuccessful searches since mismatches are discovered earlier after examining only a few characters. Hash tables always examine an entire key when searching. For details see http://www.cs.princeton.edu/~rs/strings/. Regards, -- Joel de Guzman http://www.boostpro.com http://boost-spirit.com
participants (2)
-
Joel de Guzman
-
Vicente J. Botet Escriba