Re: [boost] Formal Review Request: Boost.String.Convert


Vladimir.Batov＠wrsa.com.au

17 Feb 2009

2 a.m.

Upgraded to version 0.10. Added a SFINAE-based boost::string::is_string<T> check, which makes it possible to restrict the applicability of the conversion functions. They now require one of the parameters to be a string (in the broad sense: C strings, std::(w)string, char/wchar_t-based containers) and the other *not* to be a string. Have straightened up the main boost::string::convert() interface (it was quite broken) so that all the tests now run as expected. g++ 4.2.4 (Linux) and Visual Studio 2008 are happy.

Thanks,
Vladimir.
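For readers who have not seen this kind of check, here is a minimal sketch of how a SFINAE-based is_string<T> trait can restrict a convert() overload set. It is not the library's code: the namespace sketch, the use of std::enable_if (C++11) rather than Boost's facilities, the stream-based conversion, and the narrow-string-only signatures are all assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <type_traits>

namespace sketch {  // hypothetical namespace, not boost::string

// Primary template: a type is not a string unless a specialization says so.
template <class T> struct is_string : std::false_type {};

// std::basic_string covers std::string and std::wstring.
template <class Ch, class Tr, class Al>
struct is_string<std::basic_string<Ch, Tr, Al> > : std::true_type {};

// C strings.
template <> struct is_string<char*> : std::true_type {};
template <> struct is_string<const char*> : std::true_type {};
template <> struct is_string<wchar_t*> : std::true_type {};
template <> struct is_string<const wchar_t*> : std::true_type {};

// string -> non-string: removed from overload resolution when Out is a string.
template <class Out>
typename std::enable_if<!is_string<Out>::value, Out>::type
convert(const std::string& in, const Out& fallback)
{
    std::istringstream stream(in);
    Out value = Out();
    return (stream >> value) ? value : fallback;
}

// non-string -> string: removed from overload resolution when In is a string.
template <class In>
typename std::enable_if<!is_string<In>::value, std::string>::type
convert(const In& in)
{
    std::ostringstream stream;
    stream << in;
    return stream.str();
}

}  // namespace sketch
```

With these overloads, sketch::convert(std::string("42"), -1) and sketch::convert(42) compile, while a string-to-string call such as sketch::convert(std::string("a"), std::string("b")) finds no viable overload and is rejected at compile time.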


Hartmut Kaiser

17 Feb

2:18 a.m.

New subject: Formal Review Request: Boost.String.Convert

...

Upgraded to version 0.10. Added a SFINAE-based boost::string::is_string<T> check, which makes it possible to restrict the applicability of the conversion functions. They now require one of the parameters to be a string (in the broad sense: C strings, std::(w)string, char/wchar_t-based containers) and the other *not* to be a string. Have straightened up the main boost::string::convert() interface (it was quite broken) so that all the tests now run as expected.

Why not allow conversion between different string types?

Possible applications:

- std::string <--> std::wstring or similar (based on a future Boost.Unicode library)
- conversion between different symbol (character) sets

Regards
Hartmut

Vladimir Batov

4:30 a.m.

Hartmut,

...

Why not allow to convert between different string types?

Yes, I believe that is a very sensible question. In fact, I've tightened up the type checks for the implemented boost::string::convert() to minimize possible signature clashes with further extensions (via specializations/overloads). In its current form boost::string::convert() is essentially a replacement for lexical_cast with added formatting, locale, etc. support. To add, say, u8string<->wstring conversion support we'll need the overloads

std::wstring convert(std::u8string)
std::u8string convert(std::wstring)

added.

...

Possible applications:

- std::string <--> std::wstring or similar (based on a future Boost.Unicode library)

I am not sure we can do std::string <-> std::wstring unless we know what std::string represents (currently it can be UTF8 or MBCS). If, with the introduction of std::u8string, std::string is guaranteed to be MBCS, then I guess we can have std::string <-> std::wstring as well.

...

- conversion between different symbol (character) sets

Currently convert() relies heavily on the supplied types. Are we going to have distinct types for different symbol (character) sets? If not, then we might move forward as I've done for the throwing behavior (i.e. run-time configuration vs. compile-time configuration):

int i = boost::string::convert(str, -1) >> boost::throw_t();

to pass a clue/directive about what to do. Similarly we might do

string new_set_str = boost::string::convert(old_set_str) >> new_set_directive();

Just thinking out loud. Does it look anywhere close to what you had in mind?

V.

P.S. Thank you for your Spirit conversion snippet the other day. Appreciated.
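To make the directive idea concrete, here is a rough sketch of one way the convert(str, -1) >> throw_t() syntax could be wired up: convert() returns a small proxy object, operator>> records the directive, and the conversion itself happens when the proxy is converted to the target type. The namespace sketch, the converter class, and the stream-based parsing are assumptions for illustration only; the actual library may be organized quite differently.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {  // hypothetical namespace, not boost::string

struct throw_t {};  // directive tag: throw on failure instead of returning the fallback

template <class Out>
class converter {
public:
    converter(const std::string& in, const Out& fallback)
        : in_(in), fallback_(fallback), throw_on_failure_(false) {}

    // Applying a directive only records the request; returning *this allows chaining.
    converter& operator>>(throw_t) { throw_on_failure_ = true; return *this; }

    // The conversion runs lazily, once all directives have been applied.
    operator Out() const
    {
        std::istringstream stream(in_);
        Out value = Out();
        if (stream >> value) return value;
        if (throw_on_failure_)
            throw std::invalid_argument("conversion failed: " + in_);
        return fallback_;
    }

private:
    std::string in_;
    Out fallback_;
    bool throw_on_failure_;
};

template <class Out>
converter<Out> convert(const std::string& in, const Out& fallback)
{
    return converter<Out>(in, fallback);
}

}  // namespace sketch
```

Under these assumptions, int i = sketch::convert(str, -1); yields -1 for unparsable input, while int i = sketch::convert(str, -1) >> sketch::throw_t(); throws std::invalid_argument instead. A character-set directive could be recorded by an additional operator>> overload in the same way.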

Hartmut Kaiser

1:34 p.m.

...

...
Why not allow to convert between different string types?

Yes, I believe that is a very sensible question. In fact, I've tightened up the type checks for the implemented boost::string::convert() to minimize possible signature clashes with further extensions (via specializations/overloads).

In its current form boost::string::convert() is essentially a replacement for lexical_cast with added formatting, locale, etc. support. To add, say, u8string<->wstring conversion support we'll need the overloads

std::wstring convert(std::u8string)
std::u8string convert(std::wstring)

added.

Makes sense.

...

...
Possible applications:

- std::string <--> std::wstring or similar (based on a future Boost.Unicode library)

I am not sure we can do std::string <-> std::wstring unless we know what std::string represents (currently it can be UTF8 or MBCS). If, with the introduction of std::u8string, std::string is guaranteed to be MBCS, then I guess we can have std::string <-> std::wstring as well.

That's what I meant. Sorry for being imprecise.

...

...
- conversion between different symbol (character) sets

Currently convert() relies heavily on the supplied types. Are we going to have distinct types for different symbol (character) sets? If not, then we might move forward as I've done for the throwing behavior (i.e. run-time configuration vs. compile-time configuration):

int i = boost::string::convert(str, -1) >> boost::throw_t();

to pass a clue/directive about what to do. Similarly we might do

string new_set_str = boost::string::convert(old_set_str) >> new_set_directive();

Just thinking out loud. Does it look anywhere close to what you had in mind?

In Spirit we use a using directive such as

using namespace boost::spirit::ascii;

(or similar) to tie in a specific character set. I'm not sure if this is a viable solution for you.

Regards
Hartmut

Andrey Semashev

3:57 p.m.

Vladimir Batov wrote:

...

...
- std::string <--> std::wstring or similar (based on a future Boost.Unicode library)

I am not sure we can do std::string <-> std::wstring unless we know what std::string represents (currently it can be UTF8 or MBCS). If, with the introduction of std::u8string, std::string is guaranteed to be MBCS, then I guess we can have std::string <-> std::wstring as well.

I think it would be sufficient to rely on the locale to make decisions about the char nature. I have solved this particular task in Boost.Log. You may find it in boost/log/detail/code_conversion.hpp and libs/log/src/code_conversion.cpp, if you're interested.

Vladimir Batov

18 Feb

5:51 a.m.

...

I think it would be sufficient to rely on the locale to make decisions about the char nature. I have solved this particular task in Boost.Log. You may find it in boost/log/detail/code_conversion.hpp and libs/log/src/code_conversion.cpp, if you're interested.

Andrey, thank you for the pointer. I'll definitely have a look. I am somewhat cautious, though, due to past experience. Back then we had OpenLDAP on Windows. That is, for all internal purposes we used the platform's encoding -- MBCS. However, for anything OpenLDAP-related we had to handle UTF8 just as heavily. That is, we had to have explicit UTF8<->MBCS<->WIDE conversions and could not rely on any support from the locale.

V.

Andrey Semashev

5:21 p.m.

Vladimir Batov wrote:

...

...
I think it would be sufficient to rely on the locale to make decisions about the char nature. I have solved this particular task in Boost.Log. You may find it in boost/log/detail/code_conversion.hpp and libs/log/src/code_conversion.cpp, if you're interested.

Andrey, thank you for the pointer. I'll definitely have a look. I am somewhat cautious, though, due to past experience. Back then we had OpenLDAP on Windows. That is, for all internal purposes we used the platform's encoding -- MBCS. However, for anything OpenLDAP-related we had to handle UTF8 just as heavily. That is, we had to have explicit UTF8<->MBCS<->WIDE conversions and could not rely on any support from the locale.

The locale can be adjusted, if needed. One only has to substitute the codecvt facet in the locale to use different encoding rules. I think that your particular UTF8<->MBCS<->WIDE case could have been solved this way, too.
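As an illustration of the locale-driven approach, here is a minimal sketch of narrow-to-wide conversion that defers every encoding decision to the codecvt facet carried by a std::locale. It is not the Boost.Log code mentioned above; the namespace sketch and the helper name widen() are hypothetical, and error handling is reduced to a single exception.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace

// Widen `in` using whatever codecvt facet the supplied locale carries.
inline std::wstring widen(const std::string& in,
                          const std::locale& loc = std::locale())
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& facet = std::use_facet<facet_type>(loc);

    if (in.empty()) return std::wstring();

    // Every wide character produced consumes at least one input byte,
    // so in.size() output slots are always enough.
    std::vector<wchar_t> out(in.size());
    std::mbstate_t state = std::mbstate_t();
    const char* from_next = 0;
    wchar_t* to_next = 0;

    std::codecvt_base::result r = facet.in(
        state,
        in.data(), in.data() + in.size(), from_next,
        out.data(), out.data() + out.size(), to_next);

    if (r != std::codecvt_base::ok)
        throw std::runtime_error("code conversion failed");

    return std::wstring(out.data(), to_next);
}

}  // namespace sketch
```

Substituting the encoding then amounts to building a locale with a different facet through the standard constructor std::locale(base, new_facet) and passing it to widen(). The function itself does not change, which is the point being made about replacing the codecvt facet.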
