[spirit2] How to get encoding-specific parsers by character type

I was wondering if it was possible to get encoding-specific parsers by character type? Something like that:
template< typename CharT > struct encoding_specific { ... };
typedef encoding_specific< char > narrow;
narrow::char_; // equivalent to spirit::standard::char_
typedef encoding_specific< wchar_t > wide;
wide::char_; // equivalent to spirit::standard_wide::char_
This would help a lot in generic programming, when the character type is not known. Is there a tool like that already? If not, could it be added?

There isn't anything like that. The reason is that usually you want to specify a character set in addition to the character type to use. So the solution above is nothing I would like to see in Spirit, even if it might be sufficient for your particular case. OTOH, if it is sufficient for you why don't you just add it inside your namespace and be done?
Regards Hartmut
---------------
Meet me at BoostCon www.boostcon.com

On 06/11/2010 12:35 PM, Hartmut Kaiser wrote:
There isn't anything like that. The reason is that usually you want to specify a character set in addition to the character type to use. So the solution above is nothing I would like to see in Spirit, even if it might be sufficient for your particular case. OTOH, if it is sufficient for you why don't you just add it inside your namespace and be done?
Well, I did so. It just so happens that I find myself replicating that code in different places.
I'm not very familiar with the standard specs on the new character types but isn't there a strong relationship between the character type and encoding? Can't we be sure that e.g. char16_t is UTF-16, char is the standard narrow and wchar_t is the standard wide encoding?

On 6/12/10 5:13 AM, Andrey Semashev wrote:
Well, I did so. It just so happens that I find myself replicating that code in different places.
I'm not very familiar with the standard specs on the new character types but isn't there a strong relationship between the character type and encoding? Can't we be sure that e.g. char16_t is UTF-16, char is the standard narrow and wchar_t is the standard wide encoding?
I don't think so. I use char/UTF-8 a lot now; I don't want char to be tied to standard narrow.
Regards,
--
Joel de Guzman
http://www.boostpro.com
http://spirit.sf.net
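To illustrate why the character type alone does not determine the encoding: the same const char* can carry UTF-8, in which case one character may span several char units. The following is only a minimal sketch, assuming well-formed UTF-8 input; count_code_points is a hypothetical helper and not part of Spirit.

```cpp
#include <cstddef>

// Hypothetical helper (not a Spirit API): counts code points in a
// NUL-terminated, well-formed UTF-8 string by skipping continuation
// bytes, which always have the bit pattern 10xxxxxx.
std::size_t count_code_points(const char* s)
{
    std::size_t n = 0;
    for (; *s != '\0'; ++s)
    {
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80)
            ++n; // lead byte or plain ASCII byte starts a new code point
    }
    return n;
}

// Hypothetical helper: counts raw char units, as a "standard narrow"
// view of the same buffer would.
std::size_t count_char_units(const char* s)
{
    std::size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}
```

For the byte sequence "caf\xC3\xA9" (the word "café" in UTF-8), count_char_units reports 5 units while count_code_points reports 4 characters, so code that assumes one char per character would misread it.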

On 06/12/2010 04:09 AM, Joel de Guzman wrote:
I don't think so. I use char/UTF-8 a lot now; I don't want char to be tied to standard narrow.
Ok, what about specifying encoding tag then?
template< typename EncodingTagT > struct encoding_specific { ... };
I could create my traits to deduce the default encoding by the character type for my case.

On 6/12/10 2:42 PM, Andrey Semashev wrote:
Ok, what about specifying encoding tag then?
template< typename EncodingTagT > struct encoding_specific { ... };
I could create my traits to deduce the default encoding by the character type for my case.
I'm sorry, you lost me. Could you please elaborate?
Cheers,
--
Joel de Guzman

On 06/12/2010 02:00 PM, Joel de Guzman wrote:
I'm sorry, you lost me. Could you please elaborate?
What I want to be able to do is something like that:
template< typename CharT >
void parse(const CharT* str)
{
    typedef typename my_traits< CharT >::encoding encoding;
    typedef spirit::encoding_specific< encoding > parsers;
    qi::parse(str, *parsers::char_);
}
The encoding type shall be one of the tags defined by spirit (for instance, spirit::char_encoding::standard). What I need is the encoding_specific traits in spirit.
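The two-step lookup described here (character type to encoding tag, then encoding tag to encoding-specific operations) can be sketched without Spirit. Everything below is an assumption made for illustration: the char_encoding tags are local stand-ins for the Spirit tags, my_traits is the user-side trait from the message, and encoding_specific exposes a simple is_digit classifier instead of real parsers such as char_.

```cpp
#include <cctype>
#include <cstddef>
#include <cwctype>

// Local stand-ins for Spirit's encoding tags (assumption: the real tags
// would come from spirit::char_encoding).
namespace char_encoding
{
    struct standard {};
    struct standard_wide {};
}

// Hypothetical user-side trait: picks a default encoding tag per character type.
template <typename CharT> struct my_traits;
template <> struct my_traits<char>    { typedef char_encoding::standard encoding; };
template <> struct my_traits<wchar_t> { typedef char_encoding::standard_wide encoding; };

// Hypothetical encoding_specific, keyed by encoding tag rather than by
// character type. A classifier stands in for the parsers it would expose.
template <typename EncodingTagT> struct encoding_specific;

template <> struct encoding_specific<char_encoding::standard>
{
    static bool is_digit(char c)
    {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    }
};

template <> struct encoding_specific<char_encoding::standard_wide>
{
    static bool is_digit(wchar_t c)
    {
        return std::iswdigit(static_cast<std::wint_t>(c)) != 0;
    }
};

// Generic code that never names a concrete encoding: it goes through
// my_traits first and encoding_specific second.
template <typename CharT>
std::size_t count_leading_digits(const CharT* str)
{
    typedef typename my_traits<CharT>::encoding encoding;
    typedef encoding_specific<encoding> parsers;

    std::size_t n = 0;
    while (str[n] != CharT() && parsers::is_digit(str[n]))
        ++n;
    return n;
}
```

Keying encoding_specific on the tag keeps the choice of encoding in user code: someone who stores UTF-8 in char would specialize my_traits<char> differently, and the generic function would not need to change.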

That's making a lot more sense. Would you be willing to prepare a patch?
Regards Hartmut

On 06/12/2010 04:15 PM, Hartmut Kaiser wrote:
That's making a lot more sense. Would you be willing to prepare a patch?
I created the following ticket: <https://svn.boost.org/trac/boost/ticket/4336> The patch is attached to it.

At Sat, 12 Jun 2010 01:13:58 +0400, Andrey Semashev wrote:
Can't we be sure that e.g. char16_t is UTF-16, char is the standard narrow and wchar_t is the standard wide encoding?
Actually, IIUC char16_t is used for UCS-2, which is just like UTF-16 except there are no surrogate pairs.
--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
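One way to check how a given compiler treats char16_t literals is to count the code units of a character outside the Basic Multilingual Plane. The sketch below assumes a C++11 compiler, where u"" literals are specified as UTF-16 and such a character therefore occupies a surrogate pair; code_units and is_high_surrogate are hypothetical helper names.

```cpp
#include <cstddef>

// Hypothetical helper: number of char16_t code units in a literal,
// not counting the terminating null.
template <std::size_t N>
constexpr std::size_t code_units(const char16_t (&)[N])
{
    return N - 1;
}

// Hypothetical helper: true for a high (leading) surrogate, 0xD800..0xDBFF.
constexpr bool is_high_surrogate(char16_t c)
{
    return c >= 0xD800 && c <= 0xDBFF;
}

// U+00E9 lies in the Basic Multilingual Plane: one code unit.
static_assert(code_units(u"\u00E9") == 1, "BMP character: one code unit");

// U+1D11E lies outside it: two code units, the first a high surrogate.
static_assert(code_units(u"\U0001D11E") == 2, "non-BMP character: surrogate pair");
static_assert(is_high_surrogate(u"\U0001D11E"[0]), "first unit is a high surrogate");
```

This only covers literals; the type itself does not prevent a program from storing UCS-2 or any other 16-bit data in char16_t, which is in line with the earlier point that the character type alone does not fix the encoding.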
participants (4)
- Andrey Semashev
- David Abrahams
- Hartmut Kaiser
- Joel de Guzman