
Hello Joel, JG> No, that does not happen. The signed(ness) is ignored by the JG> char-set class. The full 8-bits is mapped to a 256-bit bitset. Do JG> you see a need for negative char values? Unfortunately, it does happen. Look at the definition of two functions. The first is boost::spirit::utility::impl::construct_chset (in the file $(BOOST)/boost/spirit/utility/impl/chset.ipp): template <typename CharT, typename CharT2> void construct_chset(boost::shared_ptr<basic_chset<CharT> >& ptr, CharT2 const* definition) { CharT2 ch = *definition++; while (ch) { CharT2 next = *definition++; if (next == '-') { next = *definition++; if (next == 0) { ptr->set(ch); ptr->set('-'); break; } ptr->set(ch, next); } else { ptr->set(ch); } ch = next; } } It is obvious that for our input "\x9\xA\xD\x20-\xFF" the function "ptr->set(ch, next);" will be invoked with the first argument equals to '\x20', i.e. 32, and the second equals to '\xff', i.e. -1. Yes, you right, the implementation then will try to map these arguments into a bitset of unsigned values. But look at the way it is implemented in boost::spirit::basic_chset_8bit<CharT>::set. You can find the implementation of in $(BOOST)/boost/spirit/utility/impl/chset/basic_chset.ipp: basic_chset_8bit<CharT>::set(CharT from, CharT to) { for (int i = from; i <= to; ++i) bset.set((unsigned char)i); } There will be not a single iteration when from == 32 and to == -1. To correct this issue I suggest change Char = chset_t("\x9\xA\xD\x20-\xFF"); to Char = chset_t("\x9\xA\xD\x20-\x7f\x80-\xFF"); The latter one will work independently of sign'ness of the char type. JG> Do JG> you see a need for negative char values? No. But char is signed by default on my compiler. That fact along with improper initialization of Char primitive caused me problems when I tried to implement XML comments skipping in boost::serialization. Regards. -- -- Vyacheslav E. Andrejev -- System Architect, Optech International, Inc. -- E-mail: mortituris@mail.ru JG> Vyacheslav E. Andrejev wrote: JG>
Hello All,
XML grammar parser in boost::serialization has a definition of spirit primitive to deal with XML character set. Initialization of this pimitive looks like following (file $(BOOST)/libs/serialization/src/xml_grammar.cpp):
Char = chset_t("\x9\xA\xD\x20-\xFF");
Obviously, if char type is signed, then \xFF means -1 and the above initialization will be equivalent to
Char = chset_t("\x9\xA\xD");
JG> No, that does not happen. The signed(ness) is ignored by the JG> char-set class. The full 8-bits is mapped to a 256-bit bitset. Do JG> you see a need for negative char values? JG> JG> Regards, JG> JG> _______________________________________________ JG> Unsubscribe & other changes: JG> http://lists.boost.org/mailman/listinfo.cgi/boost