RE: [Boost-Users] boost::lexical_cast string to string problem - New proposition available
--- In Boost-Users@y..., Tom Matelich wrote:

> I and many others have requested fixes for this for a very long time. Basically, it involves creating class templates to do the actual conversion. It's not the most complicated system; just nobody has done it yet. If you search the Boost list archives, you'll see a lot of mail about this. There is no quick solution to your problem, though.
I've searched the archive now, and see that there have been quite a few propositions for changes, but apparently none of them have ended up in the lexical_cast implementation. There are also a couple of propositions here (http://groups.yahoo.com/group/boost/files/lexical_cast_propositions/). Some of the proposed changes are:

- wlexical_cast - a version that can use wide characters
- basic_lexical_cast - a version that takes the interpreter (stringstream) as a parameter, to support things like changing the locale (see the sketch after this message)
- Being able to set the number base (dec, oct or hex). This would be implicitly possible using basic_lexical_cast
- Being able to set the precision for numerical conversions. This may also be done with basic_lexical_cast
- Checking for change of sign, when using unsigned numbers. This is addressed using an integer adapter, in the above lexical_cast_propositions

One problem with making these changes, after having talked with Kevlin Henney a while ago (without knowing it had already been proposed, I proposed an extra parameter for setting the number base), is that he prefers not to have extra parameters in the interface, as it would then no longer look like a cast, and it may be less generic, depending on the way it's done. For example, a base-setting argument wouldn't make much sense if you converted from floating point to string.

In any case, I have just now made a version of lexical_cast that fixes the problem pointed out by the OP here, and I've made it available in the Boost files section (http://groups.yahoo.com/group/boost/files/lexical_cast_proposition/). I've also put a unit test for it there. Feedback is welcome. It doesn't change the interface in any way, so it should be ok.

The new version supports conversion between string and char, including char containing whitespace, and when the string is empty or contains only whitespace. If the source or destination (or both) is basic_string or char/wchar_t, it will set the interpreter to use that character type, so it implicitly supports wide characters. Wide character support requires partial specialisation if one of the operands is not basic_string or char/wchar_t. In the case where the source is basic_string, it also uses pass by const reference, for efficiency.

It's tested on Intel C++ 6.0, MSVC 6.0 and BCC 5.6.

Regards,

Terje
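For illustration, here is a minimal sketch of what the proposed basic_lexical_cast could look like. The name comes from the proposal above; the exact signature shown here is only an assumption, not an actual Boost interface:

#include <ios>       // std::hex
#include <sstream>
#include <string>
#include <boost/lexical_cast.hpp> // boost::bad_lexical_cast

// Hypothetical sketch: the caller supplies the interpreter stream and can
// configure locale, number base, precision, etc. on it beforehand.
template<typename Target, typename Source>
Target basic_lexical_cast(const Source& arg, std::stringstream& interpreter)
{
    Target result;

    // Insert the source and extract the target using whatever formatting
    // state the caller has set on the interpreter.
    if(!(interpreter << arg) || !(interpreter >> result))
        throw boost::bad_lexical_cast();

    return result;
}

// Usage: read "ff" as a hexadecimal int.
//
//   std::stringstream interpreter;
//   interpreter << std::hex;
//   int i = basic_lexical_cast<int>(std::string("ff"), interpreter); // i == 255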
Would it be possible to use an existing conversion operator? Example:

class Output { /* ... */ };

class Input
{
public:
    operator Output() { return /* ... */; }
};

Then:

lexical_cast<Output>(input) // where input is an Input

... would use this conversion operator.

This article (http://www.cuj.com/experts/1810/alexandr.htm?topic=experts) describes how it is possible at compile time to detect whether a class A is convertible to a class B, i.e. whether such an operator already exists. If so, we would not need to use the serialization operation in lexical_cast<>.

What would also be great is to specialise lexical_cast<> for conversions where a very fast builtin function or operator already exists.
From: ravioli@softhome.net
> Would it be possible to use an existing conversion operator? Example:
>
> class Output { /* ... */ };
>
> class Input
> {
> public:
>     operator Output() { return /* ... */; }
> };
>
> Then:
>
> lexical_cast<Output>(input)
>
> ... would use this conversion operator.
It would be possible, yes. That would mean that it uses the fastest and most accurate conversion available, and only resorts to using stringstream when it has to. That seems like a good idea. I've incorporated this change, and updated the version in the Boost files section (http://groups.yahoo.com/group/boost/files/lexical_cast_proposition/). It uses boost::is_convertible for this. Even if that may not work on all compilers, where it doesn't work it will simply resort to using stringstream. Thanks for the suggestion. :)

Actually, come to think of it, this also makes some of the special-case handling in the previous version obsolete, and that's a good thing. This includes things like converting from char to char, or wchar_t to wchar_t. Therefore, I've removed those special cases where they are handled by implicit conversion. Conversion between char and string still needs special handling.

However, this does mean that the semantics are changed slightly. In the cases where an implicit conversion exists, it may give a different result than if not, in the case of char/wchar_t. Without implicit conversion, you get 1 -> '1'. With implicit conversion, you get 1 -> char(1) (ASCII value 1). As far as I can tell, this only affects conversions between char/wchar_t and other types, though. If this is a problem, please let me know, and I can change it to make an exception for char/wchar_t. (A small example of the difference follows below.)

Another thing I've been thinking of is some way to configure the interpreter stream, to be able to change the formatting, such as number base, precision and field width. In fact, I've just now made a version where that is possible, _without_ changing the interface of lexical_cast. I've updated the version in the Boost files section with this. I'll post the details of how to use the stream configuration in another posting.
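To make the semantic difference concrete, a small illustration; the implicit-conversion behaviour in the comment is the proposal's, as described above, while boost::lexical_cast itself gives the stream-based result:

#include <boost/lexical_cast.hpp>

int main()
{
    // Stream-based conversion formats the int as text, so you get the
    // character '1'.
    char a = boost::lexical_cast<char>(1);

    // With the implicit-conversion path described above, the same call
    // would instead yield char(1), i.e. the character with ASCII value 1.
    (void)a;
    return 0;
}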
> This article (http://www.cuj.com/experts/1810/alexandr.htm?topic=experts) describes how it is possible at compile time to detect whether a class A is convertible to a class B, i.e. whether such an operator already exists. If so, we would not need to use the serialization operation in lexical_cast<>.
Exactly. In fact, I'm using both the conversion check (boost::is_convertible) and the Int2Type technique described in the article (and in "Modern C++ Design") to select between the implicit conversion and using stringstream. Without the Int2Type technique, it won't compile. I'm using a different name in the source, but the technique is the same:

template<bool flag> struct int_to_type { };
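To illustrate the technique, here is a minimal sketch of how is_convertible plus int_to_type can select between the two paths at compile time. The name lexical_cast_sketch is hypothetical (chosen to avoid clashing with the real boost::lexical_cast), and this is only an outline of the idea, not the proposal's actual source:

#include <sstream>
#include <boost/lexical_cast.hpp> // boost::bad_lexical_cast
#include <boost/type_traits.hpp>  // boost::is_convertible

template<bool flag> struct int_to_type { };

namespace detail
{
    // An implicit conversion exists: just use it.
    template<typename Target, typename Source>
    Target convert(const Source& arg, int_to_type<true>)
    {
        return arg;
    }

    // No implicit conversion: interpret through a stringstream.
    template<typename Target, typename Source>
    Target convert(const Source& arg, int_to_type<false>)
    {
        std::stringstream interpreter;
        Target result;

        if(!(interpreter << arg) || !(interpreter >> result))
            throw boost::bad_lexical_cast();

        return result;
    }
}

template<typename Target, typename Source>
Target lexical_cast_sketch(const Source& arg)
{
    // The bool selects an overload at compile time; only the chosen branch
    // is instantiated, which is why the int_to_type dispatch is needed at
    // all (a plain if would try to compile both branches).
    return detail::convert<Target>(
        arg, int_to_type<boost::is_convertible<Source, Target>::value>());
}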
> What would also be great is to specialise lexical_cast<> for conversions where a very fast builtin function or operator already exists.
That's also an idea. Again, it means lexical_cast would use the most efficient way possible. However, it may not work that well with things like locales, or when configuration of the interpreter is possible. Maybe the specialisations could be made an option.

Do you know about any such fast functions or operators (besides the conversion operators mentioned earlier) for some types?

Thanks for the feedback. :)

Regards,

Terje
> However, this does mean that the semantics are changed slightly. In the cases where an implicit conversion exists, it may give a different result than if not, in the case of char/wchar_t. Without implicit conversion, you get 1 -> '1'. With implicit conversion, you get 1 -> char(1) (ASCII value 1). As far as I can tell, this only affects conversions between char/wchar_t and other types, though. If this is a problem, please let me know, and I can change it to make an exception for char/wchar_t.
Is this behaviour overridable, for example by adding a specialization transforming 1 => '1'?
> Do you know about any such fast functions or operators (besides the conversion operators mentioned earlier) for some types?

It seems these good old C library functions are pretty fast for conversions. Sorry if they look sadly old-fashioned: atoi(), atol(), strtol(), strtod(), sprintf(), sscanf()... You may laugh at me, but they are, AFAIK, really the fastest ones :) depending on the platform's library, of course.
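As a concrete (purely hypothetical) sketch of such a specialisation, here is a strtol-based fast path for the lexical_cast_sketch template shown earlier; note that it bypasses the stream completely, which is exactly the drawback discussed further down the thread:

#include <cstdlib> // std::strtol
#include <cerrno>  // errno, ERANGE
#include <string>
#include <boost/lexical_cast.hpp> // boost::bad_lexical_cast

// Primary template, as in the earlier sketch.
template<typename Target, typename Source>
Target lexical_cast_sketch(const Source& arg);

// Hypothetical full specialisation: std::string -> long via strtol.
template<>
long lexical_cast_sketch<long, std::string>(const std::string& arg)
{
    char* end = 0;
    errno = 0;

    const long result = std::strtol(arg.c_str(), &end, 10);

    // Reject empty input, trailing garbage and out-of-range values,
    // mirroring what the stream-based conversion would have done.
    if(end == arg.c_str() || *end != '\0' || errno == ERANGE)
        throw boost::bad_lexical_cast();

    return result;
}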
Maybe, for time types on Unix, asctime(const struct tm *) and strftime(), if properly wrapped; and all the built-in functions involving complex<> types and their conversions to and from doubles, ints, and so on.

PS: Have you considered conversion to and from Roman numerals? ;) ;)
http://www.damtp.cam.ac.uk/user/bp10004/calc_roman.html
http://www.maclean-nj.com/romancode.htm
From: ravioli@softhome.net
There was feedback about this on the Boost list as well, so I've posted a reply about it there, too.
> Is this behaviour overridable, for example by adding a specialization transforming 1 => '1'?
Absolutely. That's what I meant by making an exception. This version of lexical_cast relies heavily on specialisation, and on partial specialisation where available; where it isn't available, specialisations for the common cases such as char/wchar_t and std::string/std::wstring are included. I also intend to make a version where partial specialisation isn't required.

Considering this, it does indeed seem like a reasonable conversion for something called lexical_cast. After all, this is how numbers are converted to strings, so it makes sense that the same happens for characters. Therefore, I've changed this so that it performs the above conversion from/to char/wchar_t, to make it consistent with the conversion from/to std::basic_string. Feedback on the Boost list also suggested what you suggest here.
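For illustration, such an exception can be expressed as an ordinary full specialisation. Again using the hypothetical lexical_cast_sketch name from earlier, a version that forces the lexical behaviour for int -> char might look like this:

#include <sstream>
#include <boost/lexical_cast.hpp> // boost::bad_lexical_cast

// Primary template, as in the earlier sketch.
template<typename Target, typename Source>
Target lexical_cast_sketch(const Source& arg);

// Hypothetical specialisation: int -> char always goes through the stream,
// so lexical_cast_sketch<char>(1) yields '1' rather than char(1).
template<>
char lexical_cast_sketch<char, int>(const int& arg)
{
    std::stringstream interpreter;
    char result;

    if(!(interpreter << arg) || !(interpreter >> result))
        throw boost::bad_lexical_cast();

    return result;
}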
> Do you know about any such fast functions or operators (besides the conversion operators mentioned earlier) for some types?
> It seems these good old C library functions are pretty fast for conversions. Sorry if they look sadly old-fashioned: atoi(), atol(), strtol(), strtod(), sprintf(), sscanf()...
That's no problem in general. Hidden in a library implementation, they can be as cryptic as they want. :)
> You may laugh at me, but they are, AFAIK, really the fastest ones :) depending on the platform's library, of course.
:) One reason that I hesitate with this, however, is what I've mentioned earlier, regarding being able to customise the formatting by configuring the stringstream object. This won't be possible if you use such C functions, because they won't follow the stream state, including the locale settings. Especially as it's now possible to configure the stringstream interpreter, and I'm also working on a new version where you can supply the stringstream object as an optional argument to lexical_cast, I don't think it's a good idea to use the C functions above: they will not follow the stream formatting. Does this sound reasonable to you?
> Maybe, for time types on Unix, asctime(const struct tm *) and strftime(), if properly wrapped; and all the built-in functions involving complex<> types and their conversions to and from doubles, ints, and so on.
These types, like the Roman numerals class you mention below, can be made to work without any extra support from lexical_cast. You just provide the required stream operators, and any constructors or conversion operators. Therefore, I don't think these are the responsibility of lexical_cast. In the grand C++ tradition, lexical_cast is designed to be extensible, so that it can handle any such new types.
> PS: Have you considered conversion to and from Roman numerals? ;) ;)
Well, lexical_cast is about conversions between _types_. So if you want to convert to and from Roman numerals, make a Roman numerals class. :)

If you design it the right way, i.e. including stream operators for reading/writing Roman numerals, and any implicit conversions you'd like to have (for example from/to int), then it should work with lexical_cast. :) As you know, lexical_cast now supports implicit conversion, where available.
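To show that no extra support from lexical_cast is needed, here is a minimal, purely illustrative Roman numeral class; only the stream operators and the int conversions below are required for it to work with the stream-based lexical_cast:

#include <iostream>
#include <string>
#include <boost/lexical_cast.hpp>

// Illustrative sketch: holds an int, converts implicitly from/to int,
// and streams as a Roman numeral.
class roman
{
public:
    roman(int value = 0) : value_(value) {}
    operator int() const { return value_; }

    friend std::ostream& operator<<(std::ostream& os, const roman& r)
    {
        static const int         values[]  = { 1000, 900, 500, 400, 100, 90,
                                                50, 40, 10, 9, 5, 4, 1 };
        static const char* const symbols[] = { "M", "CM", "D", "CD", "C", "XC",
                                                "L", "XL", "X", "IX", "V", "IV", "I" };
        int n = r.value_;
        for(int i = 0; i != 13; ++i)
            while(n >= values[i]) { os << symbols[i]; n -= values[i]; }
        return os;
    }

    friend std::istream& operator>>(std::istream& is, roman& r)
    {
        std::string s;
        if(!(is >> s)) return is;

        int result = 0, prev = 0;
        for(std::string::size_type i = s.size(); i-- != 0; )
        {
            int digit = 0;
            switch(s[i])
            {
                case 'I': digit = 1;    break;
                case 'V': digit = 5;    break;
                case 'X': digit = 10;   break;
                case 'L': digit = 50;   break;
                case 'C': digit = 100;  break;
                case 'D': digit = 500;  break;
                case 'M': digit = 1000; break;
                default: is.setstate(std::ios::failbit); return is;
            }
            result += (digit < prev) ? -digit : digit; // subtractive notation
            prev = digit;
        }
        r = roman(result);
        return is;
    }

private:
    int value_;
};

// Usage with the stream-based lexical_cast:
//
//   std::string s = boost::lexical_cast<std::string>(roman(1998)); // "MCMXCVIII"
//   int i = boost::lexical_cast<roman>(std::string("XIV"));        // 14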
Thanks for the feedback. :)
Another thing: since lexical_cast now uses implicit conversion where available (except for the special cases, like conversion from/to char/wchar_t, as mentioned), the following works:
int i = lexical_cast<int>(1.23); // double to int
However, using the original version of lexical_cast, the above would throw a bad_lexical_cast. That's because the function is defined like this:

template<typename Target, typename Source>
Target lexical_cast(Source arg)
{
    std::stringstream interpreter;
    Target result;

    if(!(interpreter << arg) || !(interpreter >> result) ||
       !(interpreter >> std::ws).eof())
        throw bad_lexical_cast();

    return result;
}

Reading an int from "1.23" stops at the decimal point, so the ".23" left in the stream makes the final eof() check fail, and bad_lexical_cast is thrown.