
Can anyone tell me why Boost.Spirit (which reuses Boost.Regex for regular
expressions) works while the code below where Boost.Regex is used directly
cannot be compiled?
std::string in = "test";
boost::wregex expr(L"test");
boost::match_results<std::string::const_iterator> what;
boost::regex_search(in, what, expr); // COMPILER ERROR
std::string out;
boost::spirit::rxstrlit

Boris wrote:
> Can anyone tell me why Boost.Spirit (which reuses Boost.Regex for regular expressions) works while the code below, where Boost.Regex is used directly, cannot be compiled?
>
>     std::string in = "test";
>     boost::wregex expr(L"test");
>     boost::match_results<std::string::const_iterator> what;
>     boost::regex_search(in, what, expr); // COMPILER ERROR
>
>     std::string out;
>     boost::spirit::rxstrlit<wchar_t> expr2(L"test");
>     boost::spirit::parse(in.c_str(), expr2[boost::spirit::assign_a(out)]);
>
> The basic problem is of course that the regular expressions are based on wchar_t while the input string is based on char. So I'm not very surprised that Boost.Regex does not work.

Right, by design that doesn't compile.

> I'm much more surprised, though, that my compiler (VC++ 2008) doesn't report an error for the code above that uses Boost.Spirit.
> I don't know why the code using Boost.Spirit compiles. Is this implementation-dependent or guaranteed behavior? It would make using regular expressions in templates much easier, though, as regular expressions don't seem to depend on the Char type of strings (assuming this is a feature I can rely on).

Well, that depends on the regular expression: something like "[\x{0370}-\x{03ff}]+" most certainly does depend upon the character type! ;-)

John.
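
The mismatch discussed above can be sketched as follows. This is only an illustration under stated assumptions: it uses std::wregex from the C++11 standard library as a stand-in for boost::wregex (the two share the same character-type and iterator design, but std::regex is not the library used in this thread), and the helper widen_ascii is hypothetical and only meaningful for plain ASCII input. The search compiles once the input's character type matches the expression's.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: widen a narrow string element by element.
// Assumption: the input contains only ASCII characters.
std::wstring widen_ascii(const std::string& s) {
    return std::wstring(s.begin(), s.end());
}

// Searches a narrow input with a wide regular expression by first
// converting the input to the expression's character type.
bool contains_test(const std::string& in) {
    std::wregex expr(L"test");

    // Passing `in` directly is expected not to compile, because the
    // iterator's character type (char) differs from the regex's (wchar_t):
    // std::smatch what; std::regex_search(in, what, expr);

    std::wstring wide = widen_ascii(in);
    std::wsmatch what;
    return std::regex_search(wide, what, expr);
}
```

The conversion step is where the encoding question becomes explicit: for anything beyond ASCII, a real conversion from the narrow encoding would be needed instead of the element-wise copy.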

On Wed, 02 Apr 2008 17:32:56 +0200, John Maddock wrote:

> [...]
>> I don't know why the code using Boost.Spirit compiles. Is this implementation-dependent or guaranteed behavior? It would make using regular expressions in templates much easier, though, as regular expressions don't seem to depend on the Char type of strings (assuming this is a feature I can rely on).
> Well, that depends on the regular expression: something like "[\x{0370}-\x{03ff}]+" most certainly does depend upon the character type! ;-)

I recently converted a JSON parser called TinyJSON (see http://blog.beef.de/projects/tinyjson/) to a template so it can be used with char- and wchar_t-based strings (among others). Since it's a JSON parser, text has to be parsed, which typically means tokens have to be compared. A JSON object, for example, starts with {, which is '{' for char-based parsers and L'{' for wchar_t-based parsers. You can't write either '{' or L'{' in the template, though, because the type of the literal does not depend on the template parameter.

I was quite surprised when I found out that Boost.Spirit still works for various string types (std::string, std::wstring, std::basic_string<int>) even though char-based literals are used to define the rules (like regular expressions). I created a couple of test cases to verify that the parser works, but I'm not sure whether I tricked myself.

The idea of making the JSON parser a template is of course to make it work with any string type, no matter what charT type it is based on. As the JSON parser is a text-processing class, though, I'm not sure whether there is a hidden additional requirement related to encoding. The template would then depend not only on a type but also on something else that would not be obvious to the user (which would make me think that making the JSON parser a template is actually not a good idea).

I searched Usenet to see if anyone else had run into a similar problem and found this thread: http://groups.google.com/group/comp.lang.c++/browse_thread/thread/23dc9ccc60...

With Boost.Spirit the problem can be solved:

    template< typename Char >
    bool isPassword( std::basic_string<Char> const& s )
    {
        // return s == "password";  <-- This works only for std::string
        // return s == L"password"; <-- This works only for std::wstring
        // This works for any string type:
        boost::spirit::rxstrlit<> expr("password");
        return boost::spirit::parse(s.c_str(), expr).full;
    }

I wonder if I missed anything? Is this the cheap, simple and efficient solution for literal strings in templates that Alf was looking for in his thread on Usenet? Or is there an implicit requirement about encoding which I currently don't see and which makes code like the above a bad idea?

Boris
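
The hidden encoding requirement asked about here can be made visible without Boost.Spirit. The following is a minimal sketch, not a description of how Boost.Spirit works internally: widen_literal is a hypothetical helper that converts a char literal to the target character type element by element. It gives the right answer only under the assumption that the literal is plain ASCII and that the target string's encoding uses the same values for those characters.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: build a basic_string<Char> from a narrow literal
// by converting each char to Char.
// Assumption: the literal is ASCII and the Char encoding is ASCII-compatible.
template <typename Char>
std::basic_string<Char> widen_literal(const char* ascii) {
    return std::basic_string<Char>(ascii, ascii + std::strlen(ascii));
}

// Character-type-independent comparison against a literal.
template <typename Char>
bool isPassword(const std::basic_string<Char>& s) {
    return s == widen_literal<Char>("password");
}
```

For example, isPassword(std::string("password")) and isPassword(std::wstring(L"password")) both compare equal, but nothing in the template's signature tells the user that a string in a non-ASCII-compatible encoding would silently compare wrongly.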

On Wed, 02 Apr 2008 20:06:57 +0200, Boris wrote:

> [...] I wonder if I missed anything? Is this the cheap, simple and efficient solution for literal strings in templates that Alf was looking for in his thread on Usenet? Or is there an implicit requirement about encoding which I currently don't see and which makes code like the above a bad idea?

I've been looking through Boost for libraries whose designers faced similar problems and found Boost.Program_options. There is a web page about its Unicode support at http://www.boost.org/doc/libs/1_35_0/doc/html/program_options/design.html. Having read the discussion there, it seems that it might be better (in a JSON parser) to convert strings to an internal string type instead of relying on Boost.Spirit somehow magically working with any string type provided by the user.

Boris
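
The idea of converting at the boundary to a single internal string type might look like the sketch below. Everything here is an assumption made for illustration: the internal type (UTF-32 in a std::u32string), the names to_internal, starts_object and json_starts_object, and the rule that char input is well-formed UTF-8. The decoder does no validation. wchar_t input is left out because its encoding differs between platforms.

```cpp
#include <cstddef>
#include <string>

// Assumed internal representation of the parser core: UTF-32 code points.
using internal_string = std::u32string;

// Boundary conversion for char input.
// Assumption: the input is well-formed UTF-8; nothing is validated.
internal_string to_internal(const std::string& utf8) {
    internal_string out;
    for (std::size_t i = 0; i < utf8.size();) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp;
        std::size_t len;
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Fold in the continuation bytes, six payload bits each.
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        out.push_back(cp);
        i += len;
    }
    return out;
}

// Input that is already UTF-32 needs no conversion.
internal_string to_internal(const std::u32string& s) { return s; }

// Non-template core: literals are written once, in the internal type.
bool starts_object(const internal_string& text) {
    return !text.empty() && text.front() == U'{';
}

// Thin template front end: the encoding decision lives in to_internal.
template <typename String>
bool json_starts_object(const String& input) {
    return starts_object(to_internal(input));
}
```

With this layout the encoding requirement is stated in one place (the to_internal overloads) rather than being implied by the character type, at the cost of a copy of the input.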
Participants (2): Boris, John Maddock