Re: [boost] boost::lexical_cast string to string problem - New proposition available - Boost-users

23 May 2002

      (This posting is also sent to the Boost list)

Following a suggestion on the Boost-Users list, I've updated the
lexical_cast version in the Boost files section
(http://groups.yahoo.com/group/boost/files/lexical_cast_proposition/) to use
implicit conversion where possible. This means it uses the fastest and most
accurate conversion available, and only resorts to stringstream when
it has to.

It uses boost::is_convertible for this. That may not work on all
compilers, but where it doesn't, lexical_cast simply falls back to stringstream.
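
As a rough sketch of the dispatch described above (this is not the code in
the files section): the names sketch_cast and detail::convert are made up for
illustration, and std::is_convertible from C++11 stands in for
boost::is_convertible.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <type_traits>

namespace detail {

// Chosen when Source converts implicitly to Target: no stream is involved.
template <typename Target, typename Source>
Target convert(const Source& value, std::true_type)
{
    return static_cast<Target>(value);
}

// Chosen otherwise: write the value to a stringstream and read it back.
template <typename Target, typename Source>
Target convert(const Source& value, std::false_type)
{
    std::stringstream interpreter;
    Target result = Target();
    if (!(interpreter << value) || !(interpreter >> result))
        throw std::runtime_error("sketch_cast: conversion failed");
    return result;
}

} // namespace detail

// Hypothetical name; the path is selected at compile time by tag dispatch.
template <typename Target, typename Source>
Target sketch_cast(const Source& value)
{
    return detail::convert<Target>(value, std::is_convertible<Source, Target>());
}

// Usage examples.
inline void sketch_cast_examples()
{
    assert(sketch_cast<int>(1.23) == 1);               // implicit double -> int
    assert(sketch_cast<std::string>(42) == "42");      // via stringstream
    assert(sketch_cast<int>(std::string("17")) == 17); // via stringstream
}
```

Since the choice is made through overload resolution, the stream code is
never instantiated for a pair of types that converts implicitly.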

However, this does mean that the semantics have changed slightly. Where an
implicit conversion exists, it may give a different result than the
stringstream conversion does, in the case of char/wchar_t. Without implicit
conversion, you get 1 -> '1'. With implicit conversion, you get 1 -> the
character with code 1 (ASCII 1). As far as I can tell, this only affects
conversions between char/wchar_t and other types, though. If this is a
problem, please let me know, and I can change it to make an exception for
char/wchar_t.
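
To make the char difference concrete, here is a small sketch. The helper
via_stream is a made-up name standing in for the stream-only path; it is not
part of the proposed lexical_cast.

```cpp
#include <cassert>
#include <sstream>

// Hypothetical helper: always converts by writing to and reading from a stream.
template <typename Target, typename Source>
Target via_stream(const Source& value)
{
    std::stringstream interpreter;
    Target result = Target();
    interpreter << value;
    interpreter >> result;
    return result;
}

inline void char_semantics_examples()
{
    char through_stream = via_stream<char>(1); // writes "1", reads back the character '1'
    char implicit = 1;                         // integer value 1, the control character 0x01

    assert(through_stream == '1');
    assert(implicit == '\x01');
    assert(through_stream != implicit);
}
```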

Another thing is that there's now an option to configure the interpreter
stream, as well.

This is done _without_ changing the interface of lexical_cast. Instead, it's
done in a trait-like fashion, by making the interpreter object available
through an associated component, which is used by lexical_cast.

The reason this is optional is that, when enabled, it relies on having a
static stringstream object. This has two effects (unless implicit conversion
is used):

- It may well make lexical_cast faster, as it avoids the creation and
destruction of a stringstream object, each time it's called.
- Unless stringstream is reentrant, which is probably not the case, it means
that when the option is enabled, lexical_cast is not thread safe.

Therefore, this option is off by default, which gives the usual behaviour -
stringstream objects are created at the function call.
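
A minimal sketch of how such an associated component with a static stream
might look. The names sketch_stream and sketch_stream_cast are invented for
illustration, and the real lexical_cast_stream may well be implemented
differently. Like the option described above, this sketch is not thread safe,
since every call for a given pair of types shares one stream.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical stand-in for the associated component.
template <typename Target, typename Source>
struct sketch_stream
{
    static std::stringstream& stream()
    {
        static std::stringstream instance; // one shared object per <Target, Source> pair
        return instance;
    }
};

// Hypothetical cast that reuses the shared stream instead of creating one.
template <typename Target, typename Source>
Target sketch_stream_cast(const Source& value)
{
    std::stringstream& s = sketch_stream<Target, Source>::stream();
    s.clear(); // reset eof/fail bits left by the previous call
    s.str(""); // drop buffered characters; flags and precision are kept
    Target result = Target();
    s << value;
    s >> result;
    return result;
}

// Usage examples: configuration set once persists across calls.
inline void sketch_stream_examples()
{
    sketch_stream<std::string, double>::stream() << std::setprecision(10);
    assert((sketch_stream_cast<std::string>(1.0 / 3.0) == "0.3333333333"));

    sketch_stream<std::string, int>::stream() << std::setbase(16);
    assert((sketch_stream_cast<std::string>(0x1234) == "1234"));
}
```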

If the option is used, it lets you configure the stream in any way you like.
This way, one can implement all the features of the "wish list" in the
previous posting in this thread, such as:

- wlexical_cast - A version that can use wide characters
  - This is already handled by the previous version.

- basic_lexical_cast - A version that takes the interpreter (stringstream)
as a parameter, to support things like changing locale

- Being able to set number base (dec, oct or hex). This would be implicitly
possible using basic_lexical_cast

- Being able to set precision for numerical conversions. This may also be
done with basic_lexical_cast

  - These are all handled now.

- Checking for change of sign, when using unsigned numbers. This is
addressed using an integer adapter, in the other lexical_cast_propositions

The lack of formatting control is largely the reason I haven't used
lexical_cast much. With control over the formatting, it may be more widely
applicable.

Since the stream may be different for different conversions (char, wchar_t
or other character type), one needs to supply the source and destination
types when configuring the stream.

Here's an example of its use.

--- Start ---

#define BOOST_LEXICAL_CAST_STREAM // Enables stream configuration

#include <iostream>
#include <iomanip>
#include <string>
#include "lexical_cast.hpp"

int main()
{
    boost::lexical_cast_stream<std::string,double>::stream() <<
        std::setprecision(0);

    std::cout << boost::lexical_cast<std::string>(1.0/3.0) << '\n';

    boost::lexical_cast_stream<std::string,double>::stream() <<
        std::setprecision(10);

    std::cout << boost::lexical_cast<std::string>(1.0/3.0) << '\n';

    boost::lexical_cast_stream<std::string,int>::stream() << std::setbase(16);

    std::cout << boost::lexical_cast<std::string>(0x1234) << '\n';
}

--- End ---

This prints:

0.333333
0.3333333333
1234

Any suggestions for changes or additions, including interface changes, are
welcome.

I realise that the way to change the stream formatting is rather
long-winded, but I haven't found a shorter way to do it (the name has to be
different from lexical_cast, too). Suggestions for alternative ways are
welcome.

I've not found a way to make the stream configurable and yet avoid a static
stringstream. If the stringstream is created when lexical_cast is called,
any configuration information is lost. Even if it copied the state
from another stream, that other stream would have to exist somewhere, for
example as a static object, so that would only move the problem.
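
For illustration, the "copy the state from another stream" variant could be
sketched roughly as follows, using the standard copyfmt() member. The names
sketch_format and sketch_copyfmt_cast are hypothetical. Note that the
prototype stream is still a static object, which is exactly the limitation
described above.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical holder for the formatting "prototype"; still a static object.
template <typename Target, typename Source>
struct sketch_format
{
    static std::stringstream& prototype()
    {
        static std::stringstream instance;
        return instance;
    }
};

// Hypothetical cast: a fresh stream per call, formatted like the prototype.
template <typename Target, typename Source>
Target sketch_copyfmt_cast(const Source& value)
{
    std::stringstream local;
    // copyfmt() copies flags, precision, fill and locale, but not the buffer.
    local.copyfmt(sketch_format<Target, Source>::prototype());
    Target result = Target();
    local << value;
    local >> result;
    return result;
}

// Usage example.
inline void sketch_copyfmt_examples()
{
    sketch_format<std::string, double>::prototype() << std::setprecision(3);
    assert((sketch_copyfmt_cast<std::string>(1.0 / 3.0) == "0.333"));
}
```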

Despite the extra code to handle implicit conversions where available,
examining the resulting assembly output (from Intel C++) shows that the
compiler is in fact able to optimise _all_ of it away, producing identical
code to what you get when the built-in conversions are used directly.

Here's an example.

int n;
double d=1.23;

int main()
{
n=boost::lexical_cast<int>(d);
}

00401084   fld         qword ptr [d (00427db0)]
0040108A   fstp        qword ptr [esp+8]
0040108E   mov         esi,dword ptr [esp+8]
00401092   mov         edx,dword ptr [esp+0Ch]
00401096   mov         ecx,edx
00401098   shr         ecx,14h
0040109B   and         ecx,7FFh
004010A1   cmp         ecx,3FEh
004010A7   jle         main+52h (004010c6)
004010A9   mov         eax,edx
004010AB   neg         ecx
004010AD   add         ecx,41Eh
004010B3   shld        eax,esi,0Bh
004010B7   or          eax,80000000h
004010BC   shr         eax,cl
004010BE   test        edx,edx
004010C0   jge         main+54h (004010c8)
004010C2   neg         eax
004010C4   jmp         main+54h (004010c8)
004010C6   xor         eax,eax
004010C8   mov         [n (0042bef8)],eax

int n;
double d=1.23;

int main()
{
n=d;
}

0041AFF0   fld         qword ptr [d (00427db0)]
0041AFF6   fstp        qword ptr [esp+8]
0041AFFA   mov         esi,dword ptr [esp+8]
0041AFFE   mov         edx,dword ptr [esp+0Ch]
0041B002   mov         ecx,edx
0041B004   shr         ecx,14h
0041B007   and         ecx,7FFh
0041B00D   cmp         ecx,3FEh
0041B013   jle         main+1A022h (0041b032)
0041B015   mov         eax,edx
0041B017   neg         ecx
0041B019   add         ecx,41Eh
0041B01F   shld        eax,esi,0Bh
0041B023   or          eax,80000000h
0041B028   shr         eax,cl
0041B02A   test        edx,edx
0041B02C   jge         main+1A024h (0041b034)
0041B02E   neg         eax
0041B030   jmp         main+1A024h (0041b034)
0041B032   xor         eax,eax
0041B034   mov         [n (0042b84c)],eax

The code is identical. :)

Thus, you get a generic conversion, with the speed and accuracy of built-in
conversions, where that is possible.

It's a good thing I have a unit test, so I could do regression testing, to
make sure I haven't broken anything.

Regards,

Terje


Terje Slettebø
