boost::asio::buffer bug with string literal
When using boost::asio::buffer with a string literal the terminating null character is included in the buffer size, leading to bugs like accidentally sending out a null character. We could fix this by adding something like this to buffer.hpp: /// Create a new non-modifiable buffer that represents the given character array. /** * @returns A const_buffer value equivalent to: * @code const_buffer( * static_cast<const void*>(data), * (N - 1) * sizeof(char)); @endcode */ template <std::size_t N> inline BOOST_ASIO_CONST_BUFFER buffer( const char (&data)[N]) BOOST_ASIO_NOEXCEPT { return BOOST_ASIO_CONST_BUFFER(data, (N - 1) * sizeof(char)); } But this might cause trouble when using a const char array that is not null-terminated, since we then basically drop the last character. Would this be the best approach?
Would this be the best approach?
It gives me the shivers. Binary data could easily have a zero in the last byte. Wouldn't it be more explicit to simply provide an overload for boost::utility::string_view and std::string_view? On 28 November 2017 at 08:43, Martijn Otto via Boost-users < boost-users@lists.boost.org> wrote:
When using boost::asio::buffer with a string literal the terminating null character is included in the buffer size, leading to bugs like accidentally sending out a null character. We could fix this by adding something like this to buffer.hpp:
/// Create a new non-modifiable buffer that represents the given character array. /** * @returns A const_buffer value equivalent to: * @code const_buffer( * static_cast<const void*>(data), * (N - 1) * sizeof(char)); @endcode */ template <std::size_t N> inline BOOST_ASIO_CONST_BUFFER buffer( const char (&data)[N]) BOOST_ASIO_NOEXCEPT { return BOOST_ASIO_CONST_BUFFER(data, (N - 1) * sizeof(char)); }
But this might cause trouble when using a const char array that is not null-terminated, since we then basically drop the last character. Would this be the best approach? _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
Fair point, which is why I asked. The overloads for std::string_view are already there in the 1.66 beta, but they are not used for string literals. I think this is because it would be a conversion, where using the template with an array of PodType is not - so this is the better match. Of course, with the std::string_view overload and using std::literals one can simply add sv after their string literals to make it work, since the overload is then an exact match. Perhaps we could do something with std::char_traits, which has a constexpr length function (since c++17). A quick test seems to indicate it handles both null-terminated and non-null-terminated arrays properly, though how it does so I have no idea. I have tested it across different compilation units to prevent the compiler from inlining the whole thing. What about something like this? /// Create a new non-modifiable buffer that represents the given character array. /** * @returns A const_buffer value equivalent to: * @code const_buffer( * static_cast<const void*>(data), * std::char_traits<char>::length(data)); @endcode */ inline BOOST_ASIO_CONST_BUFFER buffer( const char *data) BOOST_ASIO_NOEXCEPT { return BOOST_ASIO_CONST_BUFFER(data, std::char_traits<char>::length(data)); } Richard Hodges via Boost-users wrote:
Would this be the best approach?
It gives me the shivers. Binary data could easily have a zero in the last byte.
Wouldn't it be more explicit to simply provide an overload for boost::utility::string_view and std::string_view?
On 28 November 2017 at 08:43, Martijn Otto via Boost-users <boost-use rs@lists.boost.org> wrote:
When using boost::asio::buffer with a string literal the terminating null character is included in the buffer size, leading to bugs like accidentally sending out a null character. We could fix this by adding something like this to buffer.hpp:
/// Create a new non-modifiable buffer that represents the given character array. /** * @returns A const_buffer value equivalent to: * @code const_buffer( * static_cast<const void*>(data), * (N - 1) * sizeof(char)); @endcode */ template <std::size_t N> inline BOOST_ASIO_CONST_BUFFER buffer( const char (&data)[N]) BOOST_ASIO_NOEXCEPT { return BOOST_ASIO_CONST_BUFFER(data, (N - 1) * sizeof(char)); }
But this might cause trouble when using a const char array that is not null-terminated, since we then basically drop the last character. Would this be the best approach? _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
On 29/11/2017 01:57, Richard Hodges wrote:
It gives me the shivers. Binary data could easily have a zero in the last byte.
In theory those should be distinguishable -- char is a distinct type from unsigned char (even if char is unsigned by default), and *hopefully* anyone poking binary data at the wire is doing so as an unsigned char or uint8_t. In practice, there might be some trouble with people abusing std::string as if it were std::vector<uint8_t> by pointer casting.
On 29 November 2017 at 02:58, Gavin Lambert via Boost-users < boost-users@lists.boost.org> wrote:
On 29/11/2017 01:57, Richard Hodges wrote:
It gives me the shivers. Binary data could easily have a zero in the last byte.
In theory those should be distinguishable -- char is a distinct type from unsigned char (even if char is unsigned by default), and *hopefully* anyone poking binary data at the wire is doing so as an unsigned char or uint8_t.
In reality, in my world (e.g. protocol buffers), std::string is a more useful type for binary data buffers that std::vector<std::uint8_t>. I know it's a different type, but still, it's chars modelling bytes. This is not an uncommon idea. The fact that you can't construct a std::vector from data without a copy means that std::vector<byte> is less suitable for buffering data than one might imagine.
In practice, there might be some trouble with people abusing std::string as if it were std::vector<uint8_t> by pointer casting.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
I don't think std::string should be a problem in this case, if you want to construct a buffer from an std::string, you wouldn't be using this overload anyway, since it's a pointer to a char, not an array of characters. It would only be a problem for someone who has a proper array of chars that is not null-terminated. This is not exactly unimaginable, but it's certainly unlikely and nothing is preventing anyone from using unsigned char here. On wo, 2017-11-29 at 07:47 +0100, Richard Hodges via Boost-users wrote:
On 29 November 2017 at 02:58, Gavin Lambert via Boost-users <boost-us ers@lists.boost.org> wrote:
On 29/11/2017 01:57, Richard Hodges wrote:
It gives me the shivers. Binary data could easily have a zero in the last byte.
In theory those should be distinguishable -- char is a distinct type from unsigned char (even if char is unsigned by default), and *hopefully* anyone poking binary data at the wire is doing so as an unsigned char or uint8_t. In reality, in my world (e.g. protocol buffers), std::string is a more useful type for binary data buffers that std::vector<std::uint8_t>. I know it's a different type, but still, it's chars modelling bytes. This is not an uncommon idea.
The fact that you can't construct a std::vector from data without a copy means that std::vector<byte> is less suitable for buffering data than one might imagine.
In practice, there might be some trouble with people abusing std::string as if it were std::vector<uint8_t> by pointer casting.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
I take your point. But what I am alluding to is that people have used chars for bytes since 1970. I suspect that simply assuming that any char[N] is a string will break all kinds of code. I think it's better to leave things as they are and demand that if people are dealing in zero-terminated strings, they explicitly state intent with a string_view or similar. I agree that it's slightly unsatisfactory state of affairs, but this is not ASIO's problem to solve. The problem is with the initial decision by Mrs Kernighan and Ritchie to mix the concerns of pointers to bytes and strings in one type. On 29 November 2017 at 12:59, Martijn Otto via Boost-users < boost-users@lists.boost.org> wrote:
I don't think std::string should be a problem in this case, if you want to construct a buffer from an std::string, you wouldn't be using this overload anyway, since it's a pointer to a char, not an array of characters.
It would only be a problem for someone who has a proper array of chars that is not null-terminated. This is not exactly unimaginable, but it's certainly unlikely and nothing is preventing anyone from using unsigned char here.
On wo, 2017-11-29 at 07:47 +0100, Richard Hodges via Boost-users wrote:
On 29 November 2017 at 02:58, Gavin Lambert via Boost-users <boost-us ers@lists.boost.org> wrote:
On 29/11/2017 01:57, Richard Hodges wrote:
It gives me the shivers. Binary data could easily have a zero in the last byte.
In theory those should be distinguishable -- char is a distinct type from unsigned char (even if char is unsigned by default), and *hopefully* anyone poking binary data at the wire is doing so as an unsigned char or uint8_t. In reality, in my world (e.g. protocol buffers), std::string is a more useful type for binary data buffers that std::vector<std::uint8_t>. I know it's a different type, but still, it's chars modelling bytes. This is not an uncommon idea.
The fact that you can't construct a std::vector from data without a copy means that std::vector<byte> is less suitable for buffering data than one might imagine.
In practice, there might be some trouble with people abusing std::string as if it were std::vector<uint8_t> by pointer casting.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users
participants (3)
-
Gavin Lambert
-
Martijn Otto
-
Richard Hodges