[Review] Boost.Convert library, last day

newer
[Spirit Karma] Mutable references...

older
Re: [boost] Boost.Locale and the...

Edward Diener

2 May 2011 2 May '11

3:05 p.m.

This is a reminder that formal review of the Boost.Convert library by Vladimir Batov ends today, May 2. There have only been a few reviews of the library so the library welcomes reviews. If you are one of the people who have been dissatisfied by the perceived limitations of Boost lexical_cast, or have discussed such limitations on Boost mailing lists in the past, please look at the library and give a review of it, as its intent is to offer a greatly enhanced version of lexical_cast as a new library. *************** * Its Purpose * *************** The library builds on the past boost::lexical_cast experience, then takes those conversion-related ideas further to suit better today's applications and programing needs. It still offers simple, minimal interface, familiar conversion behavior and additionally provides: * throwing and non-throwing conversion-failure behavior; * support for the default/fallback value to be returned when conversion fails; * two types of the conversion-failure check - basic/simple and better/safe; * formatting support based on the standard std::streams and std::stream-based manipulators (like std::hex, std::scientific, etc.); * support for different locales; * support for boost::range-compliant char and wchar_t-based containers; * no DefaultConstructibility requirement for the Target/Destination type; * consistent framework to uniformly incorporate any type-to-type conversions, extensibility and additional room to grow. With its present support for string-to-type and type-to-string conversions it is an essential tool for applications making extensive use of configuration or MS-Windows-Registry-style files or having to process/prepare considerable amounts of data in, say, XML, etc. More so, it is easily extendable to accommodate, specialize and uniformly deploy new user-defined type-to-type conversions. ******************* * Where to get it * ******************* The Boost Vault is accesible again after being down for most of the week, so you can get the library from the Boost Vault at http://www.boostpro.com/vault/index.php?action=downloadfile&filename=boost-string-convert.zip You can also get the library, using Subversion, from https://svn.boost.org/svn/boost/sandbox/convert. The HTML documentation is part of the distribution and can be found at libs/convert/index.html in the distribution above. ******************** * Writing a review * ******************** The reviews and all comments should be submitted to the developers list, and the email should have "[convert] Review" at the beginning of the subject line to make sure it's not missed. Please explicitly state in your review whether the library should be accepted. The general review checklist: - What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? With what compiler? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain? And finally, every review should answer this question: - Do you think the library should be accepted as a Boost library? Be sure to say this explicitly so that your other comments don't obscure your overall opinion. Edward Diener, Review Manager

Show replies by date

Hartmut Kaiser

2 May 2 May

3:34 p.m.

...

This is a reminder that formal review of the Boost.Convert library by Vladimir Batov ends today, May 2.

There have only been a few reviews of the library so the library welcomes reviews. If you are one of the people who have been dissatisfied by the perceived limitations of Boost lexical_cast, or have discussed such limitations on Boost mailing lists in the past, please look at the library and give a review of it, as its intent is to offer a greatly enhanced version of lexical_cast as a new library.

Sorry for not being able to do a full review here, even if I looked at through the whole of the documentation I don't feel like having a definite answer whether it should be accepted. But I tend to vote NO.

...

From reading the docs I came away puzzled. It was highly confusing to me to have the convert::from<> return different things in different contexts. I'd suggest rethinking the API, making the different behaviors explicit.

As Thomas Heller pointed out, the existence of several default conversion paths for the result type will (with high probability) create havoc for users who really just want to do the conversion without spending much time reading the docs. The current implementation of lexical_cast (even if is not always the fastest), appeals to many people because 'it just works'. Any conversion library needs to be similarly resilient to be useful. To summarize: I feel uneasy to suggest accepting this library as it covers functionality which in my opinion belongs into lexical_cast<> in the first place. It is a natural extension of lexical_cast's functionality and for this reason should be merged with lexical_cast instead of being accepted as a separate library. Having two stream based conversion libraries in Boost (which moreover while complementing each other, sometimes expose different behavior) does not make any sense to me. OTOH, the functionality exposed by convert is much needed and has to make it into Boost somehow. Regards Hartmut --------------- http://boost-spirit.com

Vladimir Batov

3 May 3 May

10:41 a.m.

...

Hartmut Kaiser <hartmut.kaiser <at> gmail.com> writes: ...

Hartmut, Thank you for your review. Much appreciated. Please let me address a few points that you mention that I personally find incorrect or have to disagree with.

...

From reading the docs I came away puzzled. It was highly confusing to me to have the convert::from<> return different things in different contexts.

I was under impression that from the user perspective convert::from returns either the converted value or convert::result in all contexts. If users "really just want to do the conversion without spending much time reading the docs", then they'll use int i = convert<int>::from(string) as described in "Getting Started" without even knowing there is convert::result. Hardly puzzling and confusing.

...

As Thomas Heller pointed out, the existence of several default conversion paths for the result type will (with high probability) create havoc for users who really just want to do the conversion without spending much time reading the docs.

I am surprised you are using users not "reading the docs" as an argument. I find such argument quite unfortunate especially from the developer of such a complex library as Spirit. Secondly, I am not sure about "existence of several default conversion paths for the result type". Again, the only types convert::from result is converted to are the Target type and the convert::result type. Both types are for different uses and are described in the docs. As for Thomas Heller's doomsday predictions, then it must be so if he says so. Or may be not. I do not see it that way anyway.

...

The current implementation of lexical_cast (even if is not always the fastest), appeals to many people because 'it just works'. Any conversion library needs to be similarly resilient to be useful.

I am not sure if you can count my (and my colleagues) experience who've been using convert for about 3-4 years. It just works. Admittedly, our usage patterns are largely close to lexical_cast. Like int i = convert<int>::from(str, INT_MAX); if (i == INT_MAX) message("ignored"); proceed using 'i'. Indeed, there were some use-cases mentioned which I've never come across. I am not convinced that is because 'convert' is flawed. I suspect that is because those use-cases are of more of academic rather than practical value and so far I do not remember seeing anything I would not be able to address.

...

It is a natural extension of lexical_cast's functionality and for this reason should be merged with lexical_cast instead of being accepted as a separate library. Having two stream based conversion libraries in Boost (which moreover while complementing each other, sometimes expose different behavior) does not make any sense to me.

I am not sure if I ever mentioned why convert is not a "natural extension of lexical_cast". :-) More so, it might have started as such but at this point I do not see 'convert' as an extension of lexical_cast. If it is so, then similarly a car would be a natural extension of a wooden cart. Should we abandon car development and try fitting an engine on to a cart instead? As for stream-based conversions, then I chose those because it was easier to get proof-of-concept working. Then, I was satisfied with its features and performance so I never needed to replace that. I believe long time ago we agreed that providing Spirit-based string-to-int converter though convert API would be sensible for those who needed it.

...

OTOH, the functionality exposed by convert is much needed and has to make it into Boost somehow.

Yes, that seems to have been the common approach of 'no' votes. Like, yes, we need the functionality. No, the proposal is bad or not good enough (for one reason or another). We need to do it differently. I'd be happy if someone else could step forward and propose something wonderful, commit to it and deliver. Any serious and committed/responsible takers? Any time frames? V.

Thomas Heller

12:19 p.m.

On Tue, May 3, 2011 at 12:41 PM, Vladimir Batov <vbatov@people.net.au> wrote:

...

...
Hartmut Kaiser <hartmut.kaiser <at> gmail.com> writes: From reading the docs I came away puzzled. It was highly confusing to me to have the convert::from<> return different things in different contexts.

I was under impression that from the user perspective convert::from returns either the converted value or convert::result in all contexts. If users "really just want to do the conversion without spending much time reading the docs", then they'll use

int i = convert<int>::from(string)

as described in "Getting Started" without even knowing there is convert::result. Hardly puzzling and confusing.

...
As Thomas Heller pointed out, the existence of several default conversion paths for the result type will (with high probability) create havoc for users who really just want to do the conversion without spending much time reading the docs. <snip> Secondly, I am not sure about "existence of several default conversion paths for the result type". Again, the only types convert::from result is converted to are the Target type and the convert::result type. Both types are for different uses and are described in the docs. As for Thomas Heller's doomsday predictions, then it must be so if he says so. Or may be not. I do not see it that way anyway.

Sorry, this statement is wrong. At the very least convert<T>::from returns exactly one thing which is this converter thingy, which is implicitly converted to either convert<T>::result or T. And yes, this means that the result is different based on the LHS of from, which is out of your control and doomed to fail. I wonder how much codes breaks by having this. Consider the following example: #include <boost/convert.hpp> #include <string> template <typename Char, typename Allocator> void f(std::basic_string<Char, Allocator> string) {} int main() { f(boost::convert<std::string>::from(123)); } Which is not working because of this design flaw I constantly keep mocking about. Sorry to be the party pooper here. But your design is just confusing and does not really work like you are advertising it.

Christopher Jefferson

3:22 p.m.

On 3 May 2011, at 13:19, Thomas Heller wrote:

...

On Tue, May 3, 2011 at 12:41 PM, Vladimir Batov <vbatov@people.net.au> wrote:

...
...
Hartmut Kaiser <hartmut.kaiser <at> gmail.com> writes: From reading the docs I came away puzzled. It was highly confusing to me to have the convert::from<> return different things in different contexts.

I was under impression that from the user perspective convert::from returns either the converted value or convert::result in all contexts. If users "really just want to do the conversion without spending much time reading the docs", then they'll use

int i = convert<int>::from(string)

as described in "Getting Started" without even knowing there is convert::result. Hardly puzzling and confusing.

...
As Thomas Heller pointed out, the existence of several default conversion paths for the result type will (with high probability) create havoc for users who really just want to do the conversion without spending much time reading the docs. <snip> Secondly, I am not sure about "existence of several default conversion paths for the result type". Again, the only types convert::from result is converted to are the Target type and the convert::result type. Both types are for different uses and are described in the docs. As for Thomas Heller's doomsday predictions, then it must be so if he says so. Or may be not. I do not see it that way anyway.

Sorry, this statement is wrong. At the very least convert<T>::from returns exactly one thing which is this converter thingy, which is implicitly converted to either convert<T>::result or T. And yes, this means that the result is different based on the LHS of from, which is out of your control and doomed to fail. I wonder how much codes breaks by having this. Consider the following example:

#include <boost/convert.hpp> #include <string>

template <typename Char, typename Allocator> void f(std::basic_string<Char, Allocator> string) {}

int main() { f(boost::convert<std::string>::from(123)); }

Which is not working because of this design flaw I constantly keep mocking about.

Out of interest, what would: auto str = boost::convert<std::string>::from(123); Do? If boost::convert stores by pointer/reference, it would be quite dangerous. I think making libraries 'auto safe' is a good idea, I expect to see auto usage shoot up very quickly, and libraries should do their best to work well with auto. Chris

Stewart, Robert

3:36 p.m.

Christopher Jefferson wrote: [snip massive overquotation]

...

Out of interest, what would:

auto str = boost::convert<std::string>::from(123);

Do? If boost::convert stores by pointer/reference, it would be quite dangerous.

str would be a boost::convert<std::string>::converter, not a std::string as one would probably expect.

...

I think making libraries 'auto safe' is a good idea, I expect to see auto usage shoot up very quickly, and libraries should do their best to work well with auto.

This may be one of the most compelling arguments against the current interface. _____ Rob Stewart robert.stewart@sig.com Software Engineer using std::disclaimer; Dev Tools & Components Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

Matthew Chambers

4:06 p.m.

On 5/3/2011 10:36 AM, Stewart, Robert wrote:

...

Christopher Jefferson wrote:

...
Out of interest, what would:

auto str = boost::convert<std::string>::from(123);

Do? If boost::convert stores by pointer/reference, it would be quite dangerous.

str would be a boost::convert<std::string>::converter, not a std::string as one would probably expect.

...
I think making libraries 'auto safe' is a good idea, I expect to see auto usage shoot up very quickly, and libraries should do their best to work well with auto.

This may be one of the most compelling arguments against the current interface.

Agreed. I know I'll be jumping on the bandwagon as soon as auto becomes mainstream in my two main compilers, MSVC and GCC. So this is a compelling argument. But would it even be possible to keep the "smart result" (either convert<>::result or optional<>) and auto compatibility? If (as I suspect) not, this would seem to necessitate different syntax for value/throw and functor. -Matt

Stewart, Robert

4:50 p.m.

Matthew Chambers wrote:

...

On 5/3/2011 10:36 AM, Stewart, Robert wrote:

...
Christopher Jefferson wrote:

...
...
Out of interest, what would:

auto str = boost::convert<std::string>::from(123);

Do? If boost::convert stores by pointer/reference, it would be quite dangerous.

str would be a boost::convert<std::string>::converter, not a std::string as one would probably expect.

...
I think making libraries 'auto safe' is a good idea, I expect to see auto usage shoot up very quickly, and libraries should do their best to work well with auto.

This may be one of the most compelling arguments against the current interface.

Agreed. I know I'll be jumping on the bandwagon as soon as auto becomes mainstream in my two main compilers, MSVC and GCC. So this is a compelling argument. But would it even be possible to keep the "smart result" (either convert<>::result or optional<>) and auto compatibility? If (as I suspect) not, this would seem to necessitate different syntax for value/throw and functor.

The above does provide the "smart result," but I think is surprising in the context. Arguably, there would be times a user would want the "smart result," so it isn't certain that this example obviates that part of the interface. Rather, this points to the need for distinct calling conventions -- whether by varying the argument list or using differently named functions (function templates) -- in order that what auto infers is what the user expects. auto i(convert_to<int>("123")); // 1 auto j(convert_to<int>("123", -1)); // 2 auto t(try_converting_to<int>("123")); // 3 converter<int> f; // 4 In // 1, i is int and the call can throw an exception on conversion failure. In // 2, j is int and the call will return -1 on conversion failure. In // 3, t is boost::optional<int>. Note that using optional rather than Boost.Convert's result type means that accessing the value will not throw an exception; it's undefined behavior, IIRC. "try_converting_to" is rather verbose, so some other name is probably wise. In // 4, auto isn't helpful, so I didn't show it. I'm thinking f is a function object with a member function template function call operator that forwards to some customization interface that permits target/source-specific customizations. I think that covers all of the use cases and plays well with auto. In the first three cases, we could use "xxx_cast" instead of "convert_to" and "try_converting_to," as I'd previously suggested, except that "123" cannot be used to distinguish between the first two and the third cases. Now, given "auto," we could make it work as follows: auto i(xxx_cast<int>("123")); // 1 auto j(xxx_cast<int>("123", -1)); // 2 auto t(xxx_cast<optional<int>>("123")); // 3 Because of "auto," there's no repetition of "optional<int>" in // 3. Of course there may be those offended by the second argument in a new-style cast as in // 2. More food for thought. _____ Rob Stewart robert.stewart@sig.com Software Engineer using std::disclaimer; Dev Tools & Components Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

Barend Gehrels

5:19 p.m.

On 3-5-2011 17:36, Stewart, Robert wrote:

...

Christopher Jefferson wrote:

[snip massive overquotation]

...
Out of interest, what would:

auto str = boost::convert<std::string>::from(123);

Do? If boost::convert stores by pointer/reference, it would be quite dangerous. str would be a boost::convert<std::string>::converter, not a std::string as one would probably expect.

Confirmed. This code: BOOST_AUTO(ia, convert<int>::from("123")); std::cout << typeid(ia).name() << std::endl; writes on MSVC: struct boost::convert<int,void>::converter<char const [4],void>

...

...
I think making libraries 'auto safe' is a good idea, I expect to see auto usage shoot up very quickly, and libraries should do their best to work well with auto. This may be one of the most compelling arguments against the current interface. Agreed.

This code: template <typename T> void dummy(T const& input) { std::cout << boost::is_arithmetic<T>::value << std::endl; } (....) dummy(convert<int>::from("123")); writes 0, so will be problematic in many generic functions, giving unexpected compilation errors where values are expected. We (in Boost.Geometry) did have in the past a distance result which was "convertable to a value". The reason was to save the sqrt, if possible (so comparing two distances without sqrt). However, this turned out to be quite problematic. As Eric Niebler said during BoostCon, something convertible to something is difficult to get right. We now return values everywhere (and there is a separate function comparable_distance) Finally: dummy(lexical_cast<int>("123")); writes 1 Regards, Barend

Vladimir Batov

9:51 p.m.

...

Christopher Jefferson <chris <at> bubblescope.net> writes: ... Out of interest, what would:

auto str = boost::convert<std::string>::from(123);

Do? If boost::convert stores by pointer/reference, it would be quite dangerous.

I think making libraries 'auto safe' is a good idea, I expect to see auto usage shoot up very quickly, and libraries should do their best to work well with auto.

boost::convert<std::string>::from(123) return a converter object. Some might consider that design decision wacky/bad/etc. but it there for a reason to satisfy other use-cases. For 'auto' to work one will need to explicitly call value() as follows: auto str = boost::convert<std::string>::from(123).value(); V.

Vicente BOTET

3:56 p.m.

...

Message du 03/05/11 14:20 De : "Thomas Heller" A : boost@lists.boost.org Copie à : Objet : Re: [boost] [Review] Boost.Convert library, last day Sorry, this statement is wrong. At the very least convert::from returns exactly one thing which is this converter thingy, which is implicitly converted to either convert::result or T. And yes, this means that the result is different based on the LHS of from, which is out of your control and doomed to fail. I wonder how much codes breaks by having this. Consider the following example:

#include #include

template void f(std::basic_string string) {}

int main() { f(boost::convert::from(123)); }

Which is not working because of this design flaw I constantly keep mocking about.

Sorry to be the party pooper here. But your design is just confusing and does not really work like you are advertising it.

I've the impression that if what you are saying is right, this is a show stopper for the adoption of this interface. Could you show the compile error? Best, Vicente

Vladimir Batov

10:04 p.m.

...

Thomas Heller <thom.heller <at> googlemail.com> writes: ... Sorry, this statement is wrong. At the very least convert<T>::from returns exactly one thing which is this converter thingy, which is implicitly converted to either convert<T>::result or T. And yes, this means that the result is different based on the LHS of from, which is out of your control and doomed to fail. I wonder how much codes breaks by having this. Consider the following example:

#include <boost/convert.hpp> #include <string>

template <typename Char, typename Allocator> void f(std::basic_string<Char, Allocator> string) {}

int main() { f(boost::convert<std::string>::from(123)); }

Which is not working because of this design flaw I constantly keep mocking about.

It'd be considerably more constructive if you stopped "mocking", showed some manners and started respecting other-peoples effort. As for your not-working example

...

f(boost::convert<std::string>::from(123));

then 'yes' it does not work the way *you* intentionally deploy 'convert'. For it to work it needs to be f(boost::convert<std::string>::from(123).value()); But from your posts I conclude you do not want to know that. If my post makes you fee I am somewhat irritated by your insistently rude tone, then you are right, I am. V.

Matthew Chambers

10:26 p.m.

On 5/3/2011 5:04 PM, Vladimir Batov wrote:

...

As for your not-working example

...
f(boost::convert<std::string>::from(123));

then 'yes' it does not work the way *you* intentionally deploy 'convert'. For it to work it needs to be

f(boost::convert<std::string>::from(123).value());

This issue itself could decide the library's acceptance IMO. I.e. is it acceptable design that some (possibly many) contexts will require the extra call to value? I personally don't like sometimes requiring it and other times not. Especially when I suspect it will lead to plenty of confusing template-heavy compile-time errors and in some cases possibly some very hard to pinpoint runtime errors! These errors are especially significant IMO when the most common use of the library is for seemingly simple conversions. I am beginning to realize the danger of implicit conversions (something I hadn't thought much about before)... It's not right to say "it does not work the way *you* intentionally deploy" because it's a perfectly reasonable usage. At the very least you would need to warn about it in the documentation. -Matt

Thomas Heller

4 May 4 May

5:36 a.m.

On Wed, May 4, 2011 at 12:04 AM, Vladimir Batov <vbatov@people.net.au> wrote:

...

...
Thomas Heller <thom.heller <at> googlemail.com> writes: <snip> Which is not working because of this design flaw I constantly keep mocking about.

It'd be considerably more constructive if you stopped "mocking", showed some manners and started respecting other-peoples effort. <snip> But from your posts I conclude you do not want to know that. If my post makes you fee I am somewhat irritated by your insistently rude tone, then you are right, I am.

Vladimir, I did not intend to offend you personally. I am deeply sorry if I did. I certainly do respect the effort you put into your library and still think that it has potential, others gave you nice input, I just complained. Truly yours, Thomas

John Bytheway

3 May 3 May

8:05 p.m.

This is not a review. I may write a review later, but don't wait for it. I do however, have serious concerns about some aspects of the convert library which I would like to raise. On the subject of the "multi-purpose" function: Personally, I feel (as some others have suggested) that returning an object convertible to the "real" result type is dangerous. I am a heavy user of "auto" these days, and I would like to be able to write auto i = convert<long_type_name>::from(s); or similar things without worrying about the "surprising" result type. However, it would be easy to write a wrapper function to facilitate this use case if necessary, so I don't consider it a critical issue. I have a much bigger concern with the detailed semantics of the conversion function. One of the big issues (IMHO) with lexical_cast is the potentially surprising behaviour that can occur when the global locale is set to something other than the classic locale. It is good that convert offers the locale_ option to override the locale used for conversion, but I didn't see anywhere in the docs a clear explanation of what locale is used when none is specified. Is it the global locale or the classic locale? This needs to be mentioned clearly and conspicuously. My preference would be that the classic locale is always used unless another is specified. That allows convert to be safely used in library code that cannot know the global locale without surprises. It also allows "optimized" implementations such as that you are suggesting for string-to-bool to be inserted without changing the existing interface. The suggestion in another post that the optimized string-to-bool implementation should try "true", etc. and then fall back on a localized conversion (or, worse, that it should not support the localized conversion at all) concerns me greatly. This optimization is then changing the behaviour in non-English locales, since they will now accept "true" where before they did not. If you intend to perform this kind of breaking change in the development of Boost.Convert, then that fact also needs to be made clear and conspicuous in the documentation. Finally, I am concerned by the prospect of ODR violations as a result of specialised conversions. Taking again the string-to-bool conversion as an example: if part of a program is compiled with this optimized implementation present, and part is compiled with it absent, then I suspect the program will be in violation of the ODR. This seems really quite dangerous. In particular, I see no safe way for me to write an optimized or otherwise customized implementation of any conversion unless either (a) it is distributed as part of Boost.Convert, and included unconditionally whenever the definition of convert is, or (b) it converts to or from a type A, and is included unconditionally whenever the definition of A is. Is the extension facility intended to be used in these limited circumstances? If so, then that fact must also be clearly and conspicuously documented. If not, then I would like to know how ODR violations are to be avoided. John Bytheway

Matthew Chambers

8:28 p.m.

On 5/3/2011 3:05 PM, John Bytheway wrote:

...

Finally, I am concerned by the prospect of ODR violations as a result of specialised conversions. Taking again the string-to-bool conversion as an example: if part of a program is compiled with this optimized implementation present, and part is compiled with it absent, then I suspect the program will be in violation of the ODR. This seems really quite dangerous. In particular, I see no safe way for me to write an optimized or otherwise customized implementation of any conversion unless either

(a) it is distributed as part of Boost.Convert, and included unconditionally whenever the definition of convert is, or

(b) it converts to or from a type A, and is included unconditionally whenever the definition of A is.

Is the extension facility intended to be used in these limited circumstances? If so, then that fact must also be clearly and conspicuously documented. If not, then I would like to know how ODR violations are to be avoided.

I use an optional header that defines lexical_cast string->numeric specializations using strto[ld]. Is this at risk for ODR violations? I use it widely but I'm not at all careful about making sure it's included everywhere. Yet I haven't had any ODR violations on GCC or MSVC. Unless that's pure luck, I'd like to know how the optional specializations are ODR violations. -Matt

John Bytheway

4 May 4 May

9:23 a.m.

On 03/05/11 21:28, Matthew Chambers wrote: <snip>

...

I use an optional header that defines lexical_cast string->numeric specializations using strto[ld]. Is this at risk for ODR violations? I use it widely but I'm not at all careful about making sure it's included everywhere. Yet I haven't had any ODR violations on GCC or MSVC. Unless that's pure luck, I'd like to know how the optional specializations are ODR violations.

The thing about ODR violations is that most of the time your program will work fine despite containing them, and it's practically impossible for a compiler to detect them (I think the very latest versions of gcc make some attempt, but not much). So, you might easily have them and not know. On the one hand, the fact that they rarely break the program means that we might be somewhat blasé and not worry about them, but on the other hand I certainly think Boost should not promote deliberate violations of the standard, and I suspect these things might lead to more problems with whole-program optimization. If you would like a concrete example, here's one. Suppose we're implementing a C++ version of the Python repr functionality (called, perhaps foolishly, "print", here). We have a default implementation which simply prints out the typeid and address of the object, and it can be specialised for specific types where appropriate. I have written two functions, f() and g(), both of which print out the integer 0, but g does it in the presence of a specialization, and f does not. (I've had to use "__attribute__((noinline))" in this code to make the badness obvious, but of course the compiler is always at liberty to not inline the function, so the badness was always there) ===== print.hpp #include <cstdio> #include <typeinfo> template<typename T> struct print_helper { void operator()(T const& t) { std::printf("<%s at %p>\n", typeid(T).name(), &t); } }; template<typename T> __attribute__((noinline)) void print(T const& t) { print_helper<T>()(t); } void f(); void g(); ===== f.cpp #include "print.hpp" void f() { print(0); } ===== g.cpp #include "print.hpp" template<> struct print_helper<int> { void operator()(int const& t) { std::printf("int(%d)\n", t); } }; void g() { print(0); } ===== main.cpp #include "print.hpp" int main() { f(); g(); return 0; } There is an ODR violation in this program because there are two definitions of print<int>, and they violate N3092 [basic.def.odr] p7 point 2, because the name "print_helper<T>" refers to different entities in the two definitions. This ODR violation does indeed lead to undefined behaviour; observe what happens when I compile and run this program: $ g++ -o odr main.cpp f.cpp g.cpp $ ./odr <i at 0x7fff0d49ef8c> <i at 0x7fff0d49ef8c> $ g++ -o odr main.cpp g.cpp f.cpp $ ./odr int(0) int(0) It does different things according the the order in which I pass the source files on the command line. This is because the two (different) definitions of print<int> are being merged into one. There may or may not be a second ODR violation because of the two different definitions of print_helper<int>. My standard-ese is insufficient to figure out whether this is a violation or not. But even this first problem could clearly arise if e.g. library A which uses Boost.Convert in a template function to perform a conversion which is specialised by library B which depends on library A. This is very difficult to detect and avoid. John Bytheway

Matthew Chambers

5:11 p.m.

On 5/4/2011 4:23 AM, John Bytheway wrote:

...

On 03/05/11 21:28, Matthew Chambers wrote: <snip>

...
I use an optional header that defines lexical_cast string->numeric specializations using strto[ld]. Is this at risk for ODR violations? I use it widely but I'm not at all careful about making sure it's included everywhere. Yet I haven't had any ODR violations on GCC or MSVC. Unless that's pure luck, I'd like to know how the optional specializations are ODR violations.

[...snip simple and effective explanation...]

...

$ g++ -o odr main.cpp f.cpp g.cpp $ ./odr <i at 0x7fff0d49ef8c> <i at 0x7fff0d49ef8c> $ g++ -o odr main.cpp g.cpp f.cpp $ ./odr int(0) int(0)

It does different things according the the order in which I pass the source files on the command line. This is because the two (different) definitions of print<int> are being merged into one.

There may or may not be a second ODR violation because of the two different definitions of print_helper<int>. My standard-ese is insufficient to figure out whether this is a violation or not.

But even this first problem could clearly arise if e.g. library A which uses Boost.Convert in a template function to perform a conversion which is specialised by library B which depends on library A. This is very difficult to detect and avoid.

You make me cry sir. I'll file this under "ways that C++ assists your process to commit suicide". Cross-referenced with "ways that C++ causes programmer neuroses." I agree that boost shouldn't provide optional template specializations. So in a single binary, only one definition for lexical_cast<int, string> is allowed? Through "luck" it seems my optimized specializations have been overriding the stringstream ones, but it makes me cringe now. Maybe it even causes some mysterious crashes I've had in my C++/CLI bindings. Thanks, -Matt

5198

Age (days ago)

5200

Last active (days ago)

List overview

Download

17 comments

10 participants

participants (10)

Barend Gehrels
Christopher Jefferson
Edward Diener
Hartmut Kaiser
John Bytheway
Matthew Chambers
Stewart, Robert
Thomas Heller
Vicente BOTET
Vladimir Batov