typeid/type_info proposal (long, go get another jolt cola)

Warning: This is a fairly lengthy post, as I have had a difficult time putting my thoughts into words (also, I have had very little free time lately -- though it is hard to say that with a straight face since I just spent the holiday weekend skiing). If you want to skip my monotonous musings, read the summary only, and possibly skip down a few paragraphs until you see a significant amount of vertical white space, then read some more technical points. I would really appreciate any thoughts or insights, as I am quite frankly not very satisfied with my own.
SUMMARY: I would like to delay proposing a typeid() replacement. Instead, I'd like to submit my test suite, and ask people to run it on various platforms and report the results. This will help us decide how best to approach this seemingly simple, but very difficult, subject.
I would like to propose a boost::type_info class now (or very soon). Meaningful instances of boost::type_info will be constructed from a std::type_info resulting from a native typeid() call. Once we figure out what route to take wrt typeid(), then I think it can be used in a very straightforward manner.
Now for some rumblings, where I try to sort out my experiences with typeid (I apologize in advance for the lack of polish, but I want to get this out for comment because I am going out of town Friday afternoon, and will not be back until Sunday night).
A while back, I had some of my own code to use typeid/type_info. The code was inspired by the TypeInfo class in "Modern C++ Design" by Andrei Alexandrescu. However, I ran into some issues (namely the infamous g++ shared library problem). A search of boost found an implementation in Boost.Python. As has been mentioned in the past, this seems like a useful utility to have around, so I made an attempt to bring it out of Boost.Python, and give it a life of its own.
David suggested that such a small exposure is a good candidate for a fast track review, so I naively thought that meant it would be a small amount of work. Along the way, I ran into several portability issues, and decided to make sure the code had the best typeid/type_info support I could imagine. Unfortunately, this took me down a very twisted highway.
I could not find an existing test suite, so I wrote my own, based on the standard (5.2.8 and 18.5.1). Well, this turned into a nightmare, because I have had a hard time finding any compiler that will pass all my tests. I have had to move to gcc 3.4 to get most of them to pass, and still some fail (though it may be due to my tests... I am still looking into that). I will say that I now know more about typeid() than I ever thought I'd want to know (I could quite possibly quote section 5.2.8 verbatim, and a fair bit of 18.5.1 to boot).
Anyway, getting a portable typeid/type_info library working is not exactly as easy as it appears. A previous post of mine attributes most typeid problems to older msvc/intel compilers. This is only mostly true. They are the worst, because they do not even try to remove the top-level cv-qualifiers. However, even relatively recent versions of gcc have had problems with this in esoteric cases (multidimensional arrays with cv-qualifiers mixed with typedefs).
All this to say that I have my doubts that many compilers fully support typeid() correctly. To top it all off, many implementations have trouble honoring the comparison mechanism across separate shared libraries (as noted in the Boost.Python implementation), so there are some workarounds needed for type_info as well.
Since I have seen so many problems, I am no longer certain that the review of typeid can be done quickly. While the implementation in Boost.Python works well, it also has several holes, so I do not think it wise to simply "bring it out" into boost. Thus, I have made an attempt to "industrialize" it.
With this in mind, I thought I'd give a rundown of the major issues. I recently posted some of my concerns, but due to the length of that post, and possibly its ramblings, I have not seen any replies. Yes, I know, this one is longer, and possibly more "rambling" but I am hopeful yet ;-) I will leave the more minor issues for later, but would really like some input on the bigger ones.
The library, as used in Boost.Python, relies on calling the no-parameter template function, type_id, to get the type info object. Use of typeid() by itself is absent, because of the problem with compilers getting the typeid() implementation correct. For example:
    struct foo { /* ... */ };
    boost::python::type_info ti = boost::python::type_id< foo >();
Unfortunately, boost::python::type_id<>() has a couple of major drawbacks (and a number of "minor" ones).
1. It only supports type-id calls, with a specific type passed as a template parameter. The ability to compute type_info based on the result of an expression is scant (a bit more functionality can possibly be obtained with more tricks like the ingenious typeof() stuff recently posted).
2. To add support for expressions and lvalues, overloads of type_id() can be provided. This works reasonably well (with enough machinery to get around compiler problems), but the spec for typeid() says that the expression is not supposed to be evaluated (unless the expression is an lvalue of a polymorphic type, or dereferences a pointer to a polymorphic type). I have yet to see a way to call a function and prevent the expression from being evaluated (maybe with some modern tricks and is_polymorphic -- but we are then probably leaving several compilers behind as well). Also, I have been unable to get the "workaround" to throw std::bad_typeid when dereferencing a null pointer, as specified in 5.2.8.2.
3. For compilers with bad cv-qualifier support wrt typeid(), it requires a fair amount of machinery to get rid of the cv-qualifiers.
Unfortunately, the same machinery makes it difficult to preserve the polymorphic requirements of 5.2.8.2, and I have yet to find the right code to get this to work (though I think it may be possible).
Thus, we face a dilemma. Either always use the native typeid(), knowing that your compiler may not fully support it "by the book", or provide some workaround that will not provide everything either. By fixing some things, others are broken. We are left with some options:
1. Always use the native typeid() in code, and live with what the compiler does.
2. Always use a "workaround" and live with whatever level of "brokenness" that provides. I am not convinced that we can get a consistent level of brokenness for all compilers, though it is more likely to be close for all compilers than just using typeid().
3. Provide a macro BOOST_TYPEID(), which is set depending on which compiler is being used, to get the "best" effect.
Depending on the day, I feel differently about which option I like. Thus, I have written a fairly extensive set of tests for native typeid(). I feel fairly certain that they can tell us what to expect from any compiler relative to typeid conformance. We could use those test results, from "all" the supported platforms, to see where we are (my access to compilers/platforms is quite limited). Maybe if we had a better idea about what each of the "major" platforms does and does not support, we could get a better idea as to how we want to proceed. In any case, I do not think this will be super quick.
So, let's move to type_info. I think this one is a bit more clear (though we still need some compiler workarounds). However, the only real workarounds I have found so far involve "broken" comparison operators. I think these are fairly well known, and mostly solved. I also have a fairly extensive set of tests for native std::type_info, as well as the new boost::type_info. However, I am still hung up on the whole typeid thing stated above.
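As a rough illustration of the type_info idea described above (a copyable object built from the result of a native typeid() call, with its own comparison operators), here is a minimal sketch. The class name type_info_sketch is hypothetical and this is not the proposed boost::type_info or the Boost.Python class. It assumes that comparing name() strings is an acceptable substitute for comparing std::type_info objects, which may not hold on every implementation since the contents of name() are implementation-defined.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Hypothetical wrapper, for illustration only.
class type_info_sketch
{
public:
    // A default-constructed instance refers to void, so it is always valid.
    type_info_sketch() : ti_(&typeid(void)) {}

    // Implicit on purpose: built from the result of a native typeid() call.
    type_info_sketch(std::type_info const& ti) : ti_(&ti) {}

    char const* name() const { return ti_->name(); }

    // Assumption: compare the name strings instead of relying on the
    // address-based comparison some implementations use.
    friend bool operator==(type_info_sketch const& a, type_info_sketch const& b)
    {
        return std::strcmp(a.name(), b.name()) == 0;
    }

    friend bool operator!=(type_info_sketch const& a, type_info_sketch const& b)
    {
        return !(a == b);
    }

    // Ordering by name, so instances can be used as keys in ordered containers.
    friend bool operator<(type_info_sketch const& a, type_info_sketch const& b)
    {
        return std::strcmp(a.name(), b.name()) < 0;
    }

private:
    std::type_info const* ti_;  // std::type_info objects have static lifetime
};
```

Unlike std::type_info, such a wrapper can be copied, assigned and stored by value, for example as a std::map key.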
I tend to think that most use of typeid() will work as expected. The "major" problems concern quite old compilers (I think), and some of the problems in more modern compilers are in the "corners" of usage. Thus, it makes sense to get boost::type_info in place, and wait until we have some test results to see what the proper course of action is for typeid.
Thanks, and if you read this far, maybe YOU should go skiing ;-)

The serialization library deals with a lot of these issues. I've managed to make it work with all the compilers that boost supports. I didn't have problems getting it to this point - but I had no choice but to resolve them all somehow. The result of my efforts is extended_type_info and void_cast. Aside from the fact that not all compilers handled the 'const' the same way, I also needed a little more information than the normal typeid() delivers. Specifically I wanted to cast between pointers at runtime even though the classes weren't polymorphic in the usual sense.
I toyed with the idea of documenting extended_type_info and void_cast separately. But I sort of bogged down while trying to imagine how it might be used by other applications. I do think it could be enhanced and documented to handle the applications you have in mind - but I can't say for sure.
Given that you've already invested a lot of effort understanding all the issues related to this, I would much appreciate it if you could take a look at extended_type_info and void_cast in the serialization package and render your assessment of their more general utility.
Robert Ramey

On Thu, 24 Feb 2005 20:51:43 -0800 "Robert Ramey" <ramey@rrsd.com> wrote:
The serialization library deals with a lot of these issues. I've managed to make it work with all the compilers that boost supports. I didn't have problems getting it to this point - but I had no choice but to resolve them all somehow. The result of my efforts is extended_type_info and void_cast. Aside from the fact that not all compilers handled the 'const' the same way, I also needed a little more information than the normal typeid() delivers. Specifically I wanted to cast between pointers at runtime even though the classes weren't polymorphic in the usual sense.
I toyed with the idea of documenting extended_type_info and void_cast separately. But I sort of bogged down while trying to imagine how it might be used by other applications. I do think it could be enhanced and documented to handle the applications you have in mind - but I can't say for sure.
Given that you've already invested a lot of effort understanding all the issues related to this, I would much appreciate it if you could take a look at extended_type_info and void_cast in the serialization package and render your assessment of their more general utility.
OK. I am leaving tonight, for the weekend, and I doubt I will have a chance to look at it before then. However, I will look at it next week and let you know my thoughts. I took a BRIEF glance, and here are my initial comments. Please correct/redirect me where I am in error. Also, note that I have not had time to think about whether these comments mean I agree/disagree with the choices; they are just initial thoughts (and may even be totally inaccurate).
1. extended_type_info is a base class for all other type_info classes.
2. extended_type_info disables copy/ctor and copy/assignment, which is one of the perceived benefits of using a user defined type_info replacement.
3. extended_type_info constructor is only visible to derived classes (actually, extended_type_info_typeid::get_instance() creates all the instances).
4. No conversion from std::type_info to extended_type_info.
5. Must register explicitly to use (via extended_type_info::key_register or defining extended_type_info_no_rtti<T>::type_key, or setting the public "key" member variable).
6. I realize (as of 1996), the default linkage of inline functions is external, but I am still reluctant to put static "singleton" data inside an inline function.
7. I still think you have the same issues with typeid() that I described earlier, since all of your access to std::type_info is through a naked call to typeid(). I could not find any code that tried to get around broken implementations of typeid().
8. Your "solution" is similar to that in Boost.Python in that it uses extended_type_info and extended_type_info_typeid<T>::get_instance(), which has different semantics than typeid(). However, you go a step further to prevent any kind of unexpected results, since you can not create an extended_type_info out of a std::type_info.
9. Your concept may provide benefits outside serialization, but I think it is a separate tool, and I do not think it addresses the desire to have a more usable std::type_info.
Thanks again for working with me on this!

Jody Hagins <jody-boost-011304@atdesk.com> writes:
The library, as used in Boost.Python, relies on calling the no-parameter template function, type_id, to get the type info object. Use of typeid() by itself is absent, because of the problem with compilers getting the typeid() implementation correct. For example:
struct foo { /* ... */ }; boost::python::type_info ti = boost::python::type_id< foo >();
Unfortunately, boost::python::type_id<>() has a couple major drawbacks (and a number of "minor" ones).
1. It only supports type-id calls, with a specific type passed as a template parameter. The ability to compute type_info based on the result of an expression is scant (a bit more functionality can possibly be obtained with more tricks like the ingenious typeof() stuff recently posted).
Yeah, it's hard to suppress actual evaluation of the expression without typeof.
2. To add support for expressions and lvalues
What do you mean by "support for lvalues?" typeid is _supposed_ to strip top-level reference.
overloads of type_id() can be provided. This works reasonably well (with enough machinery to get around compiler problems), but the spec for typeid() says that the expression is not supposed to be evaluated (unless the expression is an lvalue of a polymorphic type, or dereferences a pointer to a polymorphic type). I have yet to see a way to call a function and prevent the expression from being evaluated (maybe with some modern tricks and is_polymorphic -- but we are then probably leaving several compilers behind as well). Also, I have been unable to get the "workaround" to throw std::bad_typeid when dereferencing a null pointer, as specified in 5.2.8.2.
?? I don't know what "workaround" you're referring to, but detecting null pointers is pretty trivial.
3. For compilers with bad cv-qualifier support wrt typeid(), it requires a fair amount of machinery to get rid of the cv-qualifiers.
It shouldn't.
Unfortunately, the same machinery makes it difficult to preserve the polymorphic requirements of 5.2.8.2
Details? Oh, I think I can guess...
, and I have yet to find the right code to get this to work (though I think it may be possible).
It should be pretty trivial if you just add an additional function parameter.
Thus, we face a dilemma. Either always use the native typeid(), knowing that your compiler may not fully support it "by the book" or provide some workaround, that will not provide everything either. By fixing some things, others are broken.
I think you're giving up too easily.
So, let's move to type_info. I think this one is a bit more clear (though we still need some compiler workarounds). However, the only real workarounds I have found so far involve "broken" comparison operators. I think these are fairly well known, and mostly solved. I also have a fairly extensive set of tests for native std::type_info, as well as the new boost::type_info. However, I am still hung up on the whole typeid thing stated above.
I tend to think that most use of typeid() will work as expected. The "major" problems concern quite old compilers (I think),
Yeah. There are actually older EDG compilers (e.g. SGI Irix) for which the problems can be insoluble. Comparison across shared libraries is done based on addresses, and in different translation units, the same type may have different string representations. -- Dave Abrahams Boost Consulting www.boost-consulting.com

On Fri, 25 Feb 2005 11:05:02 -0500 David Abrahams <dave@boost-consulting.com> wrote: Thanks, David!
2. To add support for expressions and lvalues
What do you mean by "support for lvalues?" typeid is _supposed_ to strip top-level reference.
I meant supporting typeid of functions and objects, like... struct foo { }; foo f; typeid(f);
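To make the concern about expression overloads concrete, here is a small sketch (not code from Boost.Python; the names type_id_of_expr, next_id and the two helper functions are hypothetical). A function template taking the expression as an argument has to evaluate it, whereas native typeid() leaves a non-polymorphic operand unevaluated.

```cpp
#include <cassert>
#include <typeinfo>

int calls = 0;

// Has a visible side effect, so evaluation can be detected.
int next_id() { return ++calls; }

// Hypothetical expression overload: the argument is an ordinary function
// argument, so it is evaluated before the call.
template <typename T>
std::type_info const& type_id_of_expr(T const&)
{
    return typeid(T);
}

// Number of times next_id() runs when passed to the function template.
int calls_after_function_overload()
{
    calls = 0;
    (void)type_id_of_expr(next_id());
    return calls;
}

// Number of times next_id() runs as the operand of native typeid();
// int is not a polymorphic class type, so the operand is unevaluated.
int calls_after_native_typeid()
{
    calls = 0;
    (void)typeid(next_id());
    return calls;
}
```

On a conforming compiler the first helper reports one call and the second reports none, which is the semantic gap a function-based replacement has to live with.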
?? I don't know what "workaround" you're referring to, but detecting null pointers is pretty trivial.
Sure, detecting a null pointer before it is dereferenced is trivial, but I am not aware of trivial ways to detect a null pointer being dereferenced. struct base { virtual ~base() { } }; struct derived : public base { }; base * get_base_ptr(); typeid(*get_base_ptr()); If get_base_ptr() returns 0, this is not undefined behavior, nor does it SEGV, but specifically, it should throw std::bad_typeid.
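The behaviour described here can be checked against a native typeid() with a sketch like the following. The function get_null_base_ptr is a stand-in for get_base_ptr() that always returns a null pointer, and both helper functions are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <typeinfo>

struct base { virtual ~base() {} };
struct derived : public base {};

// Stand-in for get_base_ptr(); always returns a null pointer here.
base* get_null_base_ptr() { return 0; }

// True if typeid applied to a dereferenced null pointer to a polymorphic
// type throws std::bad_typeid, as 5.2.8/2 requires.
bool null_deref_throws_bad_typeid()
{
    try {
        (void)typeid(*get_null_base_ptr());
    } catch (std::bad_typeid const&) {
        return true;
    }
    return false;
}

// True if typeid reports the dynamic type through a base pointer.
bool reports_dynamic_type()
{
    derived d;
    base* p = &d;
    return typeid(*p) == typeid(derived);
}
```

A compiler that crashes or returns false from the first helper is one for which a workaround layer cannot simply forward to the native operator.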
3. For compilers with bad cv-qualifier support wrt typeid(), it requires a fair amount of machinery to get rid of the cv-qualifiers.
It shouldn't.
Maybe my definition of "fair amount" is out of the norm. I consider msvc_typeid.hpp to be a fair amount, and to handle other corner cases I have found, it requires a bit more than that.
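For comparison, on a compiler that supports partial specialization the basic stripping step itself is short. The sketch below is not the code in msvc_typeid.hpp; it assumes a standard library that provides <type_traits>, and the name stripped_type_id is hypothetical. It removes a top-level reference and top-level cv-qualifiers before calling the native typeid().

```cpp
#include <cassert>
#include <typeinfo>
#include <type_traits>

struct foo {};

// Hypothetical helper: strip reference and cv-qualifiers explicitly, so the
// result does not depend on the compiler doing it inside typeid().
template <typename T>
std::type_info const& stripped_type_id()
{
    typedef typename std::remove_reference<T>::type no_ref;
    typedef typename std::remove_cv<no_ref>::type bare_type;
    return typeid(bare_type);
}
```

The machinery grows once the same effect has to be achieved without such traits, or for expressions rather than named types.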
, and I have yet to find the right code to get this to work (though I think it may be possible).
It should be pretty trivial if you just add an additional function parameter.
Well, for gcc, I have most of the stuff working, even when I assume that gcc is broken, and force it to use the workaround code. However, I still run into some problems because typeid behaves differently for polymorphic and nonpolymorphic types. From the docs, is_polymorphic requires features that are not present in some compilers. I am a bit concerned about trying workarounds that I can not test because I do not have those compilers available.
I think you're giving up too easily.
Fair enough, but I wouldn't quite say I was giving up. I feel that I do not have enough information at this time to keep going. Specifically, I am concerned that I may not be able to get common functionality across the board because of bad typeid() implementations. I think there are at least two areas that can not be overcome without correct native typeid() support (exception of deref NULL and not evaluating the expression). I think we can come close on the others, but I am not sure how close on some older compilers. So, I am not ready to call it quits, and I am sorry that is how it came across. Instead, I think I need assistance from others who have other platforms available to them to determine how we should proceed with typeid().
Yeah. There are actually older EDG compilers (e.g. SGI Irix) for which the problems can be insoluble. Comparison across shared libraries is done based on addresses, and in different translation units, the same type may have different string representations.
Thank you very much for your insights. I hope I have "continued" the conversation, and not "deadened" it. I certainly do not pretend to have all the answers, so do not take me as being argumentative. However, the questions I have been forced to answer leave me scratching my head as to how I can do so for other compilers...
<lame_excuse> FWIW, my experience with C++ is quite detailed, but my experience with backward portability is very small. I have been (un)fortunate in that I have never had to work on a project that required backward compatibility with older compilers. Support for older compilers stopped at a certain version, and use of more modern versions required upgrading to a compiler that supported the new code. So, getting code to work on a whole range of compilers is a new realm for me. I find it difficult (especially when I am "guessing" at how these compilers will behave), and, frankly, no fun at all. </lame_excuse>

Jody Hagins <jody-boost-011304@atdesk.com> writes:
On Fri, 25 Feb 2005 11:05:02 -0500 David Abrahams <dave@boost-consulting.com> wrote:
Thanks, David!
2. To add support for expressions and lvalues
What do you mean by "support for lvalues?" typeid is _supposed_ to strip top-level reference.
I meant supporting typeid of functions and objects, like...
struct foo { }; foo f; typeid(f);
?? I don't know what "workaround" you're referring to, but detecting null pointers is pretty trivial.
Sure, detecting a null pointer before it is dereferenced is trivial, but I am not aware of trivial ways to detect a null pointer being dereferenced.
struct base { virtual ~base() { } }; struct derived : public base { }; base * get_base_ptr(); typeid(*get_base_ptr());
If get_base_ptr() returns 0, this is not undefined behavior, nor does it SEGV, but specifically, it should throw std::bad_typeid.
Heh, 5.2.8/2. That's _almost_ a very silly exception to the rest of the C++ rules. You can deal with this in either of 2 ways:
1. Declare this library to be a workaround for broken implementations. Inside such a library you can always do something strictly illegal but in practice portable. I don't know of a machine anywhere that will hiccup when fed this:
    int* f(int& p) { return &p; }
    int* p = 0;
    int* q = f(*p);
2. Write a separate pointee_typeid function.
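The second option could look roughly like the sketch below. Only the name pointee_typeid comes from the message; the signature, the example types shape and circle, and the helper throws_on_null are assumptions made for this illustration.

```cpp
#include <cassert>
#include <typeinfo>

struct shape { virtual ~shape() {} };
struct circle : shape {};

// Takes the pointer itself, so the null check happens before any dereference.
template <typename T>
std::type_info const& pointee_typeid(T* p)
{
    if (p == 0)
        throw std::bad_typeid();
    return typeid(*p);  // dynamic type if T is polymorphic, static otherwise
}

// True if a null pointer is reported as std::bad_typeid.
bool throws_on_null()
{
    try {
        (void)pointee_typeid(static_cast<shape*>(0));
    } catch (std::bad_typeid const&) {
        return true;
    }
    return false;
}
```

One difference from the native operator: this version also throws for a null pointer to a non-polymorphic type, where typeid(*p) would not evaluate its operand at all.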
Well, for gcc, I have most of the stuff working, even when I assume that gcc is broken, and force it to use the workaround code. However, I still run into some problems because typeid behaves differently for polymorphic and nonpolymorphic types. From the docs, is_polymorphic requires features that are not present in some compilers. I am a bit concerned about trying workarounds that I can not test because I do not have those compilers available.
If the compilers are broken anyway, I wouldn't worry about it. There's no rule that every Boost library has to work perfectly on every nonconforming compiler. If there are problems, we'll find out, and adjust accordingly.
I think you're giving up too easily.
Fair enough, but I wouldn't quite say I was giving up. I feel that I do not have enough information at this time to keep going. Specifically, I am concerned that I may not be able to get common functionality across the board because of bad typeid() implementations. I think there are at least two areas that can not be overcome without correct native typeid() support (exception of deref NULL and not evaluating the expression). I think we can come close on the others, but I am not sure how close on some older compilers. So, I am not ready to call it quits, and I am sorry that is how it came across. Instead, I think I need assistance from others who have other platforms available to them to determine how we should proceed with typeid().
Yeah. There are actually older EDG compilers (e.g. SGI Irix) for which the problems can be insoluble. Comparison across shared libraries is done based on addresses, and in different translation units, the same type may have different string representations.
Thank you very much for your insights. I hope I have "continued" the conversation, and not "deadened" it. I certainly do not pretend to have all the answers, so do not take me as being argumentative. However, the questions I have been forced to answer leave me scratching my head as to how I can do so for other compilers...
<lame_excuse> FWIW, my experience with C++ is quite detailed, but my experience with backward portability is very small. I have been (un)fortunate in that I have never had to work on a project that required backward compatibility with older compilers.
You're still not... except inasmuch as you want to replace the code in Boost.Python, it needs to work as well as it used to on platforms that library supports. -- Dave Abrahams Boost Consulting www.boost-consulting.com

Jody Hagins wrote:
The library, as used in Boost.Python, relies on calling the no-parameter template function, type_id, to get the type info object. Use of typeid() by itself is absent, because of the problem with compilers getting the typeid() implementation correct. For example:
struct foo { /* ... */ }; boost::python::type_info ti = boost::python::type_id< foo >();
Unfortunately, boost::python::type_id<>() has a couple major drawbacks (and a number of "minor" ones).
1. It only supports type-id calls, with a specific type passed as a template parameter. The ability to compute type_info based on the result of an expression is scant (a bit more functionality can possibly be obtained with more tricks like the ingenious typeof() stuff recently posted).
Please have a look at the article just published at the C++ Source (http://www.artima.com/cppsource/index.jsp) entitled "Conditional Love". If you want a BOOST_TYPEID macro that accepts an expression and returns the type_info without evaluating the expression, it's really quite simple:
    template<typename T>
    T const * encode_type( T const & ) { return 0; }
    template<typename T>
    type_info type_id_helper( T const * )
    {
        // return type_info for type T
    }
    #define BOOST_TYPEID( expr ) \
        type_id_helper( true ? 0 : encode_type( expr ) )
With carefully selected overloads of encode_type and type_id_helper, you can correctly handle cv-qualifiers and rvalues/lvalues.
-- Eric Niebler Boost Consulting www.boost-consulting.com

On Fri, 25 Feb 2005 08:48:09 -0800 "Eric Niebler" <eric@boost-consulting.com> wrote:
Please have a look at the article just published at the C++ Source (http://www.artima.com/cppsource/index.jsp) entitled "Conditional Love". If you want a BOOST_TYPEID macro that accepts an expression and returns the type_info without evaluating the expression, it's really quite simple:
    template<typename T>
    T const * encode_type( T const & ) { return 0; }
    template<typename T>
    type_info type_id_helper( T const * )
    {
        // return type_info for type T
    }
    #define BOOST_TYPEID( expr ) \
        type_id_helper( true ? 0 : encode_type( expr ) )
With carefully selected overloads of encode_type and type_id_helper, you can correctly handle cv-qualifier and rvalues/lvalues.
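A compilable version of the quoted technique might look like this sketch. The macro is renamed SKETCH_TYPEID to make clear it is not an existing Boost macro, the helper returns a reference to std::type_info rather than a library type_info object, and bump() is a hypothetical function used only to show that the expression is never evaluated. It reports the static type only, so the polymorphic case from 5.2.8/2 is not covered.

```cpp
#include <cassert>
#include <typeinfo>

// Never called at run time; it only lets template deduction see the
// expression's type.
template <typename T>
T const* encode_type(T const&) { return 0; }

// Recovers T from the pointer type produced by the conditional expression.
template <typename T>
std::type_info const& type_id_helper(T const*)
{
    return typeid(T);
}

// The condition is always true, so the third operand is type-checked but
// not evaluated; the null pointer constant 0 converts to T const*.
#define SKETCH_TYPEID(expr) type_id_helper(true ? 0 : encode_type(expr))

int evaluations = 0;

// Has a visible side effect, so evaluation can be detected.
int bump() { return ++evaluations; }
```

After SKETCH_TYPEID(bump()) the counter is still zero, while the result compares equal to typeid(int).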
Thanks, Eric. I will print it out and read it over the weekend...

Hi Jody,
First of all: Many thanks for taking care of this!
SUMMARY: I would like to delay proposing a typeid() replacement. Instead, I'd like to submit my testsuite, and ask people to run it on various platforms and report the results. This will help us decide how best to approach this seemingly simple, but very difficult, subject. I would like to propose a boost::type_info class now (or very soon). Meaningful instances of boost::type_info will be constructed from a std::type_info resulting from a native typeid() call. Once we figure out what route to take wrt typeid(), then I think it can be used in a very straight forward manner.
This looks like a good plan. FWIW, I don't currently care whether the boost::type_info is implemented in terms of std::type_info (and therefore necessarily exposes the non-standardness of the underlying platform) or a sophisticated technique to emulate standard behavior. My uses of typeid are so limited that I can easily avoid the bugs of the platforms I care for. In other words, I only need std::type_info with comparison operators. Other proposed libraries seem to have similar needs. Regards, -- Andreas Huber When replying by private email, please remove the words spam and trap from the address shown in the header.
participants (5): Andreas Huber, David Abrahams, Eric Niebler, Jody Hagins, Robert Ramey