[proto] char vs. int behaviour when matching

Proto behaves in a different way when matching terminal<char> and terminal<int>. The only difference I'm aware of is that it is implementation-defined whether 'char' means 'signed char' or 'unsigned char', but it has to mean one of the two. I was surprised to hit the following:

    #include <boost/proto/proto.hpp>
    #include <boost/mpl/assert.hpp>

    namespace proto = boost::proto;

    struct char_grammar : proto::or_<
        proto::terminal<signed char>,
        proto::terminal<unsigned char>
    > {};

    struct extended_char_grammar : proto::or_<
        proto::terminal<char>,
        proto::terminal<signed char>,
        proto::terminal<unsigned char>
    > {};

    struct int_grammar : proto::or_<
        proto::terminal<signed int>,
        proto::terminal<unsigned int>
    > {};

    int main()
    {
        // WHY _NOT?
        BOOST_MPL_ASSERT_NOT((proto::matches<proto::terminal<char>, char_grammar>));
        BOOST_MPL_ASSERT((proto::matches<proto::terminal<char>, extended_char_grammar>));
        // GIVEN THAT:
        BOOST_MPL_ASSERT((proto::matches<proto::terminal<int>, int_grammar>));
        return 0;
    }

Is this behaviour intentional?

Regards,
Maurizio

Maurizio Vitale wrote:
> Proto behaves in a different way when matching terminal<char> and terminal<int>. The only difference I'm aware of is that it is implementation-defined whether 'char' means 'signed char' or 'unsigned char', but it has to mean one of the two.
> <snip>

That's where you're mistaken. char, signed char and unsigned char are 3 distinct types.

    #include <boost/type_traits/is_same.hpp>
    #include <boost/mpl/assert.hpp>

    int main()
    {
        // char is not signed char or unsigned char
        BOOST_MPL_ASSERT_NOT((boost::is_same<char, signed char>));
        BOOST_MPL_ASSERT_NOT((boost::is_same<char, unsigned char>));

        // but int is signed int
        BOOST_MPL_ASSERT((boost::is_same<int, signed int>));
    }

--
Eric Niebler
BoostPro Computing
http://www.boostpro.com
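The reply above explains the grammar result: a terminal matches on the exact type, and char is not the same type as either of its signed or unsigned siblings. The following is only a rough sketch of that idea in standard C++11. The `matches_any` trait is a hypothetical stand-in invented for illustration and is not how Proto implements `proto::matches` or `proto::or_`.

```cpp
#include <type_traits>

// Hypothetical stand-in for a grammar made of alternative terminal types.
// This is NOT Proto's implementation; it only mimics an exact-type
// comparison against each alternative in turn.
template <typename T, typename... Alternatives>
struct matches_any : std::false_type {};

template <typename T, typename First, typename... Rest>
struct matches_any<T, First, Rest...>
    : std::conditional<std::is_same<T, First>::value,
                       std::true_type,
                       matches_any<T, Rest...>>::type {};

// char is a third type, so listing only the signed and unsigned
// variants misses it.
constexpr bool char_in_char_grammar =
    matches_any<char, signed char, unsigned char>::value;

// Listing char explicitly makes the match succeed.
constexpr bool char_in_extended_grammar =
    matches_any<char, char, signed char, unsigned char>::value;

// int and signed int name the same type, so the int case matches.
constexpr bool int_in_int_grammar =
    matches_any<int, signed int, unsigned int>::value;
```

Under these assumptions the three constants mirror the three assertions in the original post: no match, match, match.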

On Wed, Sep 2, 2009 at 3:51 PM, Eric Niebler<eric@boostpro.com> wrote:
> That's where you're mistaken. char, signed char and unsigned char are 3 distinct types.

Hi Eric. After learning about int "abcd" literals from your mpl::string post, I'm now learning about an integral type which is neither signed nor unsigned! (sounds like my assumption that char is an integral type might be wrong in fact...)

Would you mind expanding a little on this, possibly with pointers to further info please?

My naive take on char was it was either equivalent to signed char or unsigned char in a platform-specific manner, but if it's boost::is_same to neither, I'm confused...

Thanks,
--DD

AMDG

Dominique Devienne wrote:
> On Wed, Sep 2, 2009 at 3:51 PM, Eric Niebler<eric@boostpro.com> wrote:
>> That's where you're mistaken. char, signed char and unsigned char are 3 distinct types.
> Hi Eric. After learning about int "abcd" literals from your mpl::string post, I'm now learning about an integral type which is neither signed nor unsigned!
> (sounds like my assumption that char is an integral type might be wrong in fact...)
> Would you mind expanding a little on this, possibly with pointers to further info please?
> My naive take on char was it was either equivalent to signed char or unsigned char in a platform-specific manner, but if it's boost::is_same to neither, I'm confused...

char is always equivalent to either signed char or unsigned char, but it is not the same type. It is always a distinct type.

In Christ,
Steven Watanabe
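The distinction between "equivalent" and "the same type" can be checked directly. This is a minimal sketch in standard C++11, using only `<type_traits>` and `<limits>`; the constant names are chosen here for illustration.

```cpp
#include <limits>
#include <type_traits>

// Distinct at the type level: both are false on every conforming compiler.
constexpr bool char_is_signed_char =
    std::is_same<char, signed char>::value;
constexpr bool char_is_unsigned_char =
    std::is_same<char, unsigned char>::value;

// Equivalent at the value level: plain char shares its range with exactly
// one of the two, and which one is implementation-defined.
constexpr bool same_range_as_signed =
    std::numeric_limits<char>::min() == std::numeric_limits<signed char>::min() &&
    std::numeric_limits<char>::max() == std::numeric_limits<signed char>::max();
constexpr bool same_range_as_unsigned =
    std::numeric_limits<char>::min() == std::numeric_limits<unsigned char>::min() &&
    std::numeric_limits<char>::max() == std::numeric_limits<unsigned char>::max();

// char is still an integral type, and all three occupy one byte.
constexpr bool char_is_integral = std::is_integral<char>::value;
constexpr bool same_size =
    sizeof(char) == sizeof(signed char) && sizeof(char) == sizeof(unsigned char);
```

Exactly one of `same_range_as_signed` and `same_range_as_unsigned` is true on a given platform, and it agrees with `std::is_signed<char>::value`.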

"Dominique" == Dominique Devienne <ddevienne@gmail.com> writes:
    Dominique> On Wed, Sep 2, 2009 at 3:51 PM, Eric Niebler<eric@boostpro.com> wrote:
    >> That's where you're mistaken. char, signed char and unsigned char
    >> are 3 distinct types.

    Dominique> Hi Eric. After learning about int "abcd" literals from
    Dominique> your mpl::string post, I'm now learning about an integral
    Dominique> type which is neither signed nor unsigned!

    Dominique> (sounds like my assumption that char is an integral type
    Dominique> might be wrong in fact...)

    Dominique> Would you mind expanding a little on this, possibly with
    Dominique> pointers to further info please?

They are defined as different types in the standard. What may be more surprising is that given:

    void foo (int);
    void foo (signed char);
    void foo (unsigned char);

    char a;
    foo (a);   // invokes foo (int)

This is because being different types they are not particularly related and each is promoted to int (unless a perfect match is found, of course).

Here's a fragment I've grabbed from the web, don't know if it is for C or C++:

    3.9.1 Fundamental types [basic.fundamental]

    1 Objects declared as characters (char) shall be large enough to store any member of the implementation's basic character set. If a character from this set is stored in a character object, the integral value of that character object is equal to the value of the single character literal form of that character. It is implementation-defined whether a char object can hold negative values. Characters can be explicitly declared unsigned or signed. Plain char, signed char, and unsigned char are three distinct types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (basic.types); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation. For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types. In any particular implementation, a plain char object can take on either the same values as a signed char or an unsigned char; which one is implementation-defined.
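The overload-resolution point above can be made checkable at compile time. This is a sketch in standard C++11: the `Overload` enum and the `pick_for_*` helpers are names invented here so that the chosen overload is observable, and it assumes the usual case where int can represent every char value.

```cpp
// Each overload reports which one was selected.
enum class Overload { Int, SignedChar, UnsignedChar };

constexpr Overload foo(int)           { return Overload::Int; }
constexpr Overload foo(signed char)   { return Overload::SignedChar; }
constexpr Overload foo(unsigned char) { return Overload::UnsignedChar; }

// Plain char has no exact match. char -> int is an integral promotion,
// which ranks above the char -> signed/unsigned char conversions.
constexpr Overload pick_for_char(char a) { return foo(a); }

// The other two character types do find an exact match.
constexpr Overload pick_for_signed_char(signed char a)     { return foo(a); }
constexpr Overload pick_for_unsigned_char(unsigned char a) { return foo(a); }
```

For example, `pick_for_char('a')` yields `Overload::Int`, while `pick_for_signed_char` and `pick_for_unsigned_char` select their exact-match overloads. Adding a `foo(char)` overload would give plain char its own exact match.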

On 9/3/09, Eric Niebler <eric@boostpro.com> wrote:
> Maurizio Vitale wrote:
>> Proto behaves in a different way when matching terminal<char> and terminal<int>. The only difference I'm aware of is that it is implementation-defined whether 'char' means 'signed char' or 'unsigned char', but it has to mean one of the two.
> That's where you're mistaken. char, signed char and unsigned char are 3 distinct types.

Yes. In "The C++ Programming Language, 3rd Ed.", Bjarne Stroustrup says that the signedness of plain "char" is implementation-defined. A "char" must behave identically to either "signed char" or "unsigned char", but the three char types are distinct, and pointers to these distinct char types can't be mixed. He also notes that assigning too large a value to a "signed char" is undefined, and advises preferring plain "char" over signed char and unsigned char to alleviate portability issues (4.3, page 72 & 4.10, page 85).

Best regards,
-Asif
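The remark that pointers to the three char types cannot be mixed can also be checked with type traits. This is a minimal sketch in standard C++11; the constant names are illustrative only.

```cpp
#include <type_traits>

// Values convert implicitly between the character types...
constexpr bool char_value_to_signed =
    std::is_convertible<char, signed char>::value;          // true

// ...but pointers to them do not, in any direction.
constexpr bool char_ptr_to_signed =
    std::is_convertible<char*, signed char*>::value;        // false
constexpr bool char_ptr_to_unsigned =
    std::is_convertible<char*, unsigned char*>::value;      // false
constexpr bool signed_ptr_to_char =
    std::is_convertible<signed char*, char*>::value;        // false

// Contrast with int: int* and signed int* are the very same type.
constexpr bool int_ptr_is_signed_int_ptr =
    std::is_same<int*, signed int*>::value;                 // true
```

Moving between these pointer types therefore needs an explicit `reinterpret_cast`, which is one practical reason to stick to plain char for text.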
participants (5)
- Asif Lodhi
- Dominique Devienne
- Eric Niebler
- Maurizio Vitale
- Steven Watanabe