[Convert] std::isspace requires unsigned #28
Hi,
boost\convert\base.hpp(112): warning C6330: 'const char' passed as _Param_(1) when 'unsigned char' is required in call to 'isspace'.
boost\convert\base.hpp(115): warning C6330: 'const char' passed as _Param_(1) when 'unsigned char' is required in call to 'isspace'.
https://github.com/boostorg/convert/issues/28 IMO the code should be fixed and it's not a documentation issue. What do you think? -- Olaf
Olaf, On 2017-03-20 19:10, Olaf van der Spek via Boost wrote:
boost\convert\base.hpp(112): warning C6330: 'const char' passed as _Param_(1) when 'unsigned char' is required in call to 'isspace'.
boost\convert\base.hpp(115): warning C6330: 'const char' passed as _Param_(1) when 'unsigned char' is required in call to 'isspace'.
https://github.com/boostorg/convert/issues/28
IMO the code should be fixed and it's not a documentation issue. What do you think?
We discussed it on https://github.com/boostorg/convert/issues/28. I even thought we've come to the same conclusion that is comes down to documentation. With you saying: "OK, so where are those preconditions of boost::convert documented?" and me agreeing to address that in the docs as you asked. Clearly I misunderstood. :-)
Vladimir Batov wrote:
On 2017-03-20 19:10, Olaf van der Spek via Boost wrote:
boost\convert\base.hpp(112): warning C6330: 'const char' passed as _Param_(1) when 'unsigned char' is required in call to 'isspace'.
boost\convert\base.hpp(115): warning C6330: 'const char' passed as _Param_(1) when 'unsigned char' is required in call to 'isspace'.
https://github.com/boostorg/convert/issues/28
IMO the code should be fixed and it's not a documentation issue. What do you think?
We discussed it on https://github.com/boostorg/convert/issues/28.
The analyzer is correct; passing char to isspace is a well-known bug, should be cast to unsigned char. The accepted range of the argument to isspace is -1..255 (assuming 8 bit character table), with -1 being EOF, and the default promotion from char to int doesn't do the right thing when char is signed.
On 2017-03-20 22:13, Peter Dimov via Boost wrote:
Vladimir Batov wrote:
... We discussed it on https://github.com/boostorg/convert/issues/28.
The analyzer is correct; passing char to isspace is a well-known bug, should be cast to unsigned char. The accepted range of the argument to isspace is -1..255 (assuming 8 bit character table), with -1 being EOF, and the default promotion from char to int doesn't do the right thing when char is signed.
Thanks, Peter, for clarifications. Clearly, the analyzer is correct. My interpretation was that it was the documented isspace() behavior. So, I felt somewhat uncomfortable "correcting"/changing the documented behavior. Now that you mention that it is a well-known bug to be (unfortunately) addressed I did just that. Thanks again. Much appreciated.
On Mon, Mar 20, 2017 at 4:07 PM, Vladimir Batov via Boost
On 2017-03-20 22:13, Peter Dimov via Boost wrote:
Vladimir Batov wrote:
... We discussed it on https://github.com/boostorg/convert/issues/28.
The analyzer is correct; passing char to isspace is a well-known bug, should be cast to unsigned char. The accepted range of the argument to isspace is -1..255 (assuming 8 bit character table), with -1 being EOF, and the default promotion from char to int doesn't do the right thing when char is signed.
Thanks, Peter, for clarifications. Clearly, the analyzer is correct. My interpretation was that it was the documented isspace() behavior. So, I felt somewhat uncomfortable "correcting"/changing the documented behavior. Now that you mention that it is a well-known bug to be (unfortunately) addressed I did just that. Thanks again. Much appreciated.
The call to std::toupper in that file has the same bug.
On Mon, Mar 20, 2017 at 9:07 PM, Vladimir Batov via Boost
On 2017-03-20 22:13, Peter Dimov via Boost wrote:
Vladimir Batov wrote:
... We discussed it on https://github.com/boostorg/convert/issues/28.
The analyzer is correct; passing char to isspace is a well-known bug, should be cast to unsigned char. The accepted range of the argument to isspace is -1..255 (assuming 8 bit character table), with -1 being EOF, and the default promotion from char to int doesn't do the right thing when char is signed.
Thanks, Peter, for clarifications. Clearly, the analyzer is correct. My interpretation was that it was the documented isspace() behavior. So, I felt somewhat uncomfortable "correcting"/changing the documented behavior. Now that you mention that it is a well-known bug to be (unfortunately) addressed I did just that. Thanks again. Much appreciated.
That's quite a lot of changes for a simple 'bug fix'. ;) https://github.com/boostorg/convert/commit/1b66bd64085d66ff4d8e1cba520e025ff... -- Olaf
On Mon, Mar 20, 2017 at 9:07 PM, Vladimir Batov via Boost
wrote: On 2017-03-20 22:13, Peter Dimov via Boost wrote:
... The analyzer is correct; passing char to isspace is a well-known bug, should be cast to unsigned char. The accepted range of the argument to isspace is -1..255 (assuming 8 bit character table), with -1 being EOF, and the default promotion from char to int doesn't do the right thing when char is signed.
Thanks, Peter, for clarifications. Clearly, the analyzer is correct. My interpretation was that it was the documented isspace() behavior. So, I felt somewhat uncomfortable "correcting"/changing the documented behavior. Now that you mention that it is a well-known bug to be (unfortunately) addressed I did just that. Thanks again. Much appreciated.
Damn confused.
WAS: std::isspace(ch);
THEN: std::isspace(static_cast<unsigned char>(ch));
THEN I realized I needed wchar_t support and therefore needed also
std::iswspace(). So, for such a seemingly trivial thing I ended up with
the following ugliness:
namespace cnv
{
template<typename char_type>
struct char_test {};
template<>
struct char_test<char>
{
static bool is_space(char ch)
{
return std::isspace(static_cast<unsigned char>(ch));
}
};
template<>
struct char_test
On 2017-03-21 11:16, Vladimir Batov via Boost wrote:
... Damn confused.
WAS: std::isspace(ch); THEN: std::isspace(static_cast<unsigned char>(ch));
THEN I realized I needed wchar_t support and therefore needed also std::iswspace(). So, for such a seemingly trivial thing I ended up with the following ugliness:
namespace cnv { template<typename char_type> struct char_test {};
template<> struct char_test<char> { static bool is_space(char ch) { return std::isspace(static_cast<unsigned char>(ch)); } }; template<> struct char_test
{ static bool is_space(wchar_t ch) { return std::iswspace(ch); } }; template<typename char_type> bool is_space(char_type ch) { return char_test ::is_space(ch); } } That raises two questions.
1) Is it really the only way to go or I went senile or fell behind language-wise? 2) What about std::iswspace? Does it also need static_cast?
OK, I guess function specifications are much shorter. Phew. Still. What do I do with No.2? template<typename char_type> bool is_space(char_type); template<> bool is_space (unsigned char c)... template<> bool is_space (char c)... template<> bool is_space (unsigned wchar_t c)... template<> bool is_space (wchar_t c)...
On Mon, Mar 20, 2017 at 8:26 PM, Vladimir Batov via Boost wrote:
OK, I guess function specifications are much shorter. Phew. Still. What do I do with No.2?
template<typename char_type> bool is_space(char_type);
template<> bool is_space (unsigned char c)... template<> bool is_space (char c)... template<> bool is_space (unsigned wchar_t c)... template<> bool is_space (wchar_t c)...
Hi Vladimir, Why one function template that you then specialize four times? i.e. Why not just four functions? bool is_space(unsigned char c) bool is_space(char c) bool is_space(unsigned wchar_t c) bool is_space(wchar_t c) Unless I'm missing some context... Glen
Template specializations stop type propagations. OTOH overloads might be funny in that regard... I think. On 2017-03-21 11:34, Glen Fernandes via Boost wrote:
On Mon, Mar 20, 2017 at 8:26 PM, Vladimir Batov via Boost wrote:
OK, I guess function specifications are much shorter. Phew. Still. What do I do with No.2?
template<typename char_type> bool is_space(char_type);
template<> bool is_space (unsigned char c)... template<> bool is_space (char c)... template<> bool is_space (unsigned wchar_t c)... template<> bool is_space (wchar_t c)...
Hi Vladimir,
Why one function template that you then specialize four times? i.e. Why not just four functions?
bool is_space(unsigned char c) bool is_space(char c) bool is_space(unsigned wchar_t c) bool is_space(wchar_t c)
Unless I'm missing some context...
Glen
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On 21/03/2017 13:16, Vladimir Batov via Boost wrote:
Damn confused.
WAS: std::isspace(ch); THEN: std::isspace(static_cast<unsigned char>(ch));
THEN I realized I needed wchar_t support and therefore needed also std::iswspace(). So, for such a seemingly trivial thing I ended up with the following ugliness: [...] That raises two questions.
1) Is it really the only way to go or I went senile or fell behind language-wise? 2) What about std::iswspace? Does it also need static_cast?
Apologies for seemingly naive questions but I really feel startled.
I haven't verified but presumably
std::char_traits
On 2017-03-21 11:43, Gavin Lambert via Boost wrote:
On 21/03/2017 13:16, Vladimir Batov via Boost wrote:
Damn confused.
WAS: std::isspace(ch); THEN: std::isspace(static_cast<unsigned char>(ch));
THEN I realized I needed wchar_t support and therefore needed also std::iswspace(). So, for such a seemingly trivial thing I ended up with the following ugliness: [...] That raises two questions.
1) Is it really the only way to go or I went senile or fell behind language-wise? 2) What about std::iswspace? Does it also need static_cast?
Apologies for seemingly naive questions but I really feel startled.
I haven't verified but presumably std::char_traits
::to_int_type() would convert correctly for input to std::isspace/iswspace.
Thanks, Gavin, indeed I should have a dig in char_traits before re-inventing the wheel. Thanks. What about iswspace()? Does it have the same problem as std::isspace(), i.e. needs an explicit cast? Seems unlikely...
On 2017-03-21 11:46, Vladimir Batov via Boost wrote:
On 2017-03-21 11:43, Gavin Lambert via Boost wrote:
... I haven't verified but presumably std::char_traits
::to_int_type() would convert correctly for input to std::isspace/iswspace. Thanks, Gavin, indeed I should have a dig in char_traits before re-inventing the wheel. Thanks.
Isn't funny? I went to cppreference.com and immediately bumped into the exact problem I am trying to address: struct ci_char_traits : public std::char_traits<char> { static bool eq(char c1, char c2) { return std::toupper(c1) == std::toupper(c2); } The example seems to ignore that char vs. "unsigned char" issue/vulnerability.
On 21/03/2017 13:46, Vladimir Batov via Boost wrote:
Thanks, Gavin, indeed I should have a dig in char_traits before re-inventing the wheel. Thanks.
There's also std::ctype
What about iswspace()? Does it have the same problem as std::isspace(), i.e. needs an explicit cast? Seems unlikely...
I think it's probably good practice to use char_traits<>::to_int_type() for that too, since the parameter type is wint_t rather than wchar_t, and that's the "right" way to do that conversion. As Peter said in practice it probably wouldn't cause any bugs without that since it would be weird for wchar_t to be signed, but as both the sign and size of the type is implementation-defined it's not actually illegal for that to be the case. (And you're even less likely to notice on platforms where wchar_t is 32-bit or larger, but again that's implementation-defined.)
Many thanks, Gavin. Looking into char_traits<>::to_int_type()... Very little info/explanation... So, seems have to look at the code... std::ctype looks very promising... even has toupper(). Gosh. I hate choices. :-) Thank you for the pointers. Very much appreciated. On 2017-03-21 12:49, Gavin Lambert via Boost wrote:
On 21/03/2017 13:46, Vladimir Batov via Boost wrote:
Thanks, Gavin, indeed I should have a dig in char_traits before re-inventing the wheel. Thanks.
There's also std::ctype
::is(), although the call syntax for that seems a bit more cumbersome. Also at least according to cppreference.com it hasn't been extended for the new 16-bit and 32-bit C++11 char types. I don't know if that means anything. What about iswspace()? Does it have the same problem as std::isspace(), i.e. needs an explicit cast? Seems unlikely...
I think it's probably good practice to use char_traits<>::to_int_type() for that too, since the parameter type is wint_t rather than wchar_t, and that's the "right" way to do that conversion.
As Peter said in practice it probably wouldn't cause any bugs without that since it would be weird for wchar_t to be signed, but as both the sign and size of the type is implementation-defined it's not actually illegal for that to be the case. (And you're even less likely to notice on platforms where wchar_t is 32-bit or larger, but again that's implementation-defined.)
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Vladimir Batov wrote:
2) What about std::iswspace? Does it also need static_cast?
No, the wide functions do not require a cast. In practice, wchar_t is an unsigned type anyway. (It's technically required to be the same as signed char, but I doubt one can encounter that case in practice, and even then, I think that passing it to iswspace is required to work.)
On 2017-03-21 11:44, Peter Dimov via Boost wrote:
Vladimir Batov wrote:
2) What about std::iswspace? Does it also need static_cast?
No, the wide functions do not require a cast. In practice, wchar_t is an unsigned type anyway. (It's technically required to be the same as signed char, but I doubt one can encounter that case in practice, and even then, I think that passing it to iswspace is required to work.)
Many thanks, Peter. No wonder I found nothing on the net... But thought I'd better ask. :-) Thanks again.
participants (6)
-
Gavin Lambert
-
Glen Fernandes
-
Olaf van der Spek
-
Peter Dimov
-
Tim Song
-
Vladimir Batov