
Stewart, Robert wrote:
Sebastian Redl wrote:
Phil Endecott wrote:
Eric Niebler wrote:
I believe that as of C++03, std::string is required to have contiguous storage. The null-termination is another thing. The only guarantee is that the char* returned by c_str() is required to be null terminated. No such guarantee is made for the sequence traversed by std::string's iterators. Dereferencing the end iterator is verboten.
Here is an outline of a zero-overhead wrapper for std::string that I hope guarantees that *end() == 0:
    struct string_with_zero_beyond_end: std::string {
        typedef const char* const_iterator;
        const_iterator begin() const { return c_str(); }
        const_iterator end() const { return c_str() + length(); }
    };
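To make the intent of the outline concrete, here is a minimal sketch of how such a wrapper might be used. The converting constructor and the helper count_until_nul are hypothetical additions for illustration only; they are not part of the original outline. The helper walks the characters using only the NUL sentinel and never compares against end().

```cpp
#include <cstddef>
#include <string>

// The wrapper from the outline above. The constructor is an addition
// (an assumption) so that the type can be built from a C string.
struct string_with_zero_beyond_end : std::string {
    typedef const char* const_iterator;
    string_with_zero_beyond_end(const char* s) : std::string(s) {}
    const_iterator begin() const { return c_str(); }
    const_iterator end() const { return c_str() + length(); }
};

// Hypothetical helper: counts characters up to the first NUL, relying
// only on the sentinel rather than on an end() comparison.
inline std::size_t count_until_nul(const string_with_zero_beyond_end& s) {
    std::size_t n = 0;
    for (string_with_zero_beyond_end::const_iterator p = s.begin(); *p != 0; ++p)
        ++n;
    return n;
}
```

Note that a string containing an embedded NUL would stop this loop early, so the sentinel approach only matches length() for strings without embedded NULs.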
Not zero-overhead if the string implementation is one (of the non-existent ones) that doesn't store the NUL internally.
Is c_str() allowed to be worse than O(1)?
Worse: you'd be returning iterators to what may well be a temporary array of characters. The result of calling c_str() need not refer to the internal storage of the string.
People are taking my "code" too literally. I am just trying to point out that a function for UTF-8 decoding can be significantly more efficient if it exploits various characteristics of common types, such as contiguous storage and null termination. I feel that this is something worth doing.
Phil.
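As an illustration of the kind of saving being described, here is a rough sketch of a UTF-8 decoder that assumes contiguous, NUL-terminated input. The function name decode_utf8_nul_terminated is hypothetical, and the sketch uses C++11 types (char32_t, std::u32string). It does not reject overlong encodings or surrogates; it only shows that the terminator removes the need for an end pointer, because a NUL byte can never pass the continuation-byte test.

```cpp
#include <string>

// Hypothetical sketch: decodes NUL-terminated UTF-8 without an end pointer.
// Malformed or truncated sequences yield U+FFFD. Overlong forms and
// surrogates are NOT checked here.
inline std::u32string decode_utf8_nul_terminated(const char* p) {
    const unsigned char* s = reinterpret_cast<const unsigned char*>(p);
    std::u32string out;
    while (*s != 0) {
        unsigned char b = *s++;
        char32_t cp;
        int extra;  // number of continuation bytes expected
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out.push_back(0xFFFD); continue; }  // stray continuation or invalid lead byte
        bool ok = true;
        for (; extra > 0; --extra) {
            // The NUL terminator fails this test, so a truncated sequence
            // stops here instead of reading past the end of the buffer.
            if ((*s & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (*s++ & 0x3F);
        }
        out.push_back(ok ? cp : char32_t(0xFFFD));
    }
    return out;
}
```

With a std::string str this would be called as decode_utf8_nul_terminated(str.c_str()). Decoding stops at the first NUL, so strings with embedded NULs would need the usual end-pointer version instead.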