
On Nov 27, 2012, at 7:01 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
On Tue, Nov 27, 2012 at 2:36 PM, Rob Stewart <robertstewart@comcast.net> wrote:
On Nov 26, 2012, at 6:56 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
The problem with std::string is the same as with string_ref - it doesn't support implicit construction from an arbitrary range, so my examples with custom string types would still not work.
That's right. We have no universal string/range type for that purpose, so you use the standard string type.
My point was that, in my understanding, string_ref is aimed to solve this issue in a transparent way but the proposal lacks the necessary interface.
I didn't realize you were arguing WRT the proposed class versus the concept, which is what I've been doing.
I would have used string_ref to unify string-related interfaces if it transparently supported multiple string types, not limited to those defined in the STL (and Boost, if boost::string_ref is to be implemented). Limiting it to particular types defeats its purpose.
OK, I suspect we're agreeing more than disagreeing. Here's the I/F of my string_ref:

- converting ctors from:
  o char const *
  o std::string const &
  o std::vector<char> const &
  o const_substring const & (my substring type)
- other ctors:
  o char const *, size_t
  o char const *, char const *
  o char const (&)[N]
- similar assignment operators
- similar assign() member functions
- bool is_null()
- safe bool or explicit bool conversion operator
- char const * data()
- size_t length()
- char const * begin()/end()
- string_ref substr()
- char operator[](size_t)

(I think that's a complete list. I'm doing it from memory now.)

It is very string-like and convenient. The same behaviors would be messier without a class (versus a range type and algorithms), though less general. I have not extended mine to support arbitrary ranges, via Boost.Range, simply because the need hasn't arisen, but it can be done. Likewise for arbitrary iterator pairs.
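For illustration only, a class with roughly the interface listed above might be sketched as follows. This is not the actual implementation described in the message: the const_substring and char const (&)[N] constructors, the assignment operators, and assign() are left out for brevity, the meaning of the bool conversion (true when non-null) is an assumption, and the substr() signature is modelled on std::string rather than taken from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical sketch of a non-owning reference to contiguous characters.
class string_ref
{
public:
    static std::size_t const npos = static_cast<std::size_t>(-1);

    string_ref() : data_(nullptr), length_(0) {}

    // Converting constructors, deliberately implicit.
    string_ref(char const* s) : data_(s), length_(s ? std::strlen(s) : 0) {}
    string_ref(std::string const& s) : data_(s.data()), length_(s.size()) {}
    string_ref(std::vector<char> const& v) : data_(v.data()), length_(v.size()) {}

    // Other constructors: pointer plus length, and pointer pair.
    string_ref(char const* s, std::size_t n) : data_(s), length_(n) {}
    string_ref(char const* first, char const* last)
        : data_(first), length_(static_cast<std::size_t>(last - first)) {}

    bool is_null() const { return data_ == nullptr; }
    // Assumption: "true" means the reference is non-null.
    explicit operator bool() const { return !is_null(); }

    char const* data() const { return data_; }
    std::size_t length() const { return length_; }
    char const* begin() const { return data_; }
    char const* end() const { return data_ + length_; }

    char operator[](std::size_t i) const
    {
        assert(i < length_);
        return data_[i];
    }

    // Returns a reference to at most n characters starting at pos.
    string_ref substr(std::size_t pos, std::size_t n = npos) const
    {
        assert(pos <= length_);
        std::size_t const avail = length_ - pos;
        return string_ref(data_ + pos, n < avail ? n : avail);
    }

private:
    char const* data_;
    std::size_t length_;
};
```

Nothing here owns or copies the characters, so the referenced buffer has to outlive the string_ref.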
It is possible, if the third-party strings follow the begin()/end() protocol.
Now you're changing the rules. TP strings don't all provide iterators.
Any reasonable string type will have some notion of iterators, be that custom types or pointers or a pointer and a size, whatever. As long as this holds, the third-party string type can be adopted.
I understand that not all (nearly none?) third-party strings support the begin()/end() protocol now, but I expect them to support it eventually. Even if they don't, the necessary overloads can be provided externally.
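As a rough sketch of what "provided externally" could mean: free begin()/end() functions can be added in the third-party type's namespace so that argument-dependent lookup finds them. The legacy::String class, its Buffer()/Length() members, and count_char() are all hypothetical names invented for this example. Note that this shows plain ADL as used by range-based for; Boost.Range has its own documented extension mechanism, which is not shown here.

```cpp
#include <cstddef>

namespace legacy {

// Hypothetical third-party string with no iterator interface of its own.
class String
{
public:
    String(char const* p, std::size_t n) : p_(p), n_(n) {}
    char const* Buffer() const { return p_; }
    std::size_t Length() const { return n_; }

private:
    char const* p_;
    std::size_t n_;
};

// Externally supplied adapters; found through argument-dependent lookup.
inline char const* begin(String const& s) { return s.Buffer(); }
inline char const* end(String const& s) { return s.Buffer() + s.Length(); }

// Range-based for now works on the type without modifying it.
inline std::size_t count_char(String const& s, char c)
{
    std::size_t n = 0;
    for (char x : s)
        if (x == c)
            ++n;
    return n;
}

} // namespace legacy
```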
I think such support is a reasonable addition.
No, this is not needed. iterator_range has implicit constructor from a range, so the conversion will be hidden from both the user and the library developer.
That only applies to types recognized as ranges. It isn't all string types. The same support should be part of string_ref, but an important distinction is that string_ref requires a contiguous range.
iterator_range doesn't check whether its constructor argument is a range. If applying begin()/end() to it is a valid operation, the conversion will succeed. I'd like string_ref to behave the same way.
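A minimal sketch of that behaviour, under stated assumptions: char_range below is a hypothetical stand-in (not iterator_range and not the proposed string_ref) whose converting constructor participates only when begin() on the argument is valid and yields something convertible to char const*. Restricting it to pointer iterators is a simplification made here so that contiguity is guaranteed, which means std::string and std::vector<char> would still need their own constructors. The third_party::Str type and length_of() are likewise invented for the example.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>

namespace detail {
using std::begin;
using std::end;

// Call begin()/end() with the std:: versions as fallback plus ADL.
template <typename R>
auto adl_begin(R const& r) -> decltype(begin(r)) { return begin(r); }
template <typename R>
auto adl_end(R const& r) -> decltype(end(r)) { return end(r); }
} // namespace detail

// Hypothetical contiguous character view.
class char_range
{
public:
    // Viable only if begin(r) compiles and converts to char const*.
    template <typename Range,
              typename = typename std::enable_if<std::is_convertible<
                  decltype(detail::adl_begin(std::declval<Range const&>())),
                  char const*>::value>::type>
    char_range(Range const& r)
        : first_(detail::adl_begin(r)), last_(detail::adl_end(r)) {}

    char const* begin() const { return first_; }
    char const* end() const { return last_; }
    std::size_t size() const { return static_cast<std::size_t>(last_ - first_); }

private:
    char const* first_;
    char const* last_;
};

namespace third_party {
// Hypothetical string type that exposes pointer-based begin()/end().
struct Str { char const* p; std::size_t n; };
inline char const* begin(Str const& s) { return s.p; }
inline char const* end(Str const& s) { return s.p + s.n; }
} // namespace third_party

// An interface written once against the view; callers pass their own types.
inline std::size_t length_of(char_range r) { return r.size(); }
```

One caveat with this approach: a char array also satisfies the constraint, and for a string literal the resulting range includes the terminating null character.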
OK
I see only one corner case: C strings. But I believe a solution is possible. Either begin()/end() can be defined for const char* or string_ref can have the corresponding constructor. The latter is the one (and only, AFAICS) reason to have a string_ref type in addition to contiguous_range.
(char const *, size_t) is also common and convenient.
Extracting the termination policy into a template parameter is a possibility, but it has drawbacks of its own. It makes it harder to provide a stable API/ABI for compiled libraries.
You'd only use the terminated one in APIs in rare cases, so a separate class is simpler.
So I would not introduce it at all for that reason. Just use std::string in such cases.
Using std::string loses the possibility of using the string_ref when it references a null terminated range. Thus, you'd always allocate and copy.
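To make the cost concrete, here is a small sketch under stated assumptions. cstr_ref is a hypothetical null-terminated reference type, and the two helper functions exist only to compare buffer addresses. Passing a char const* where std::string const& is expected builds a temporary with its own storage, so the characters are always copied (and heap-allocated once they exceed any small-buffer capacity), whereas the reference type simply keeps the caller's pointer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning reference to a null-terminated string.
struct cstr_ref
{
    char const* ptr;
    std::size_t len;

    cstr_ref(char const* s) : ptr(s), len(std::strlen(s)) {}
    char const* c_str() const { return ptr; }
};

// True if the callee sees the caller's original buffer.
inline bool shares_buffer_string(char const* original, std::string const& s)
{
    return s.c_str() == original; // s is a temporary copy when built from a pointer
}

inline bool shares_buffer_ref(char const* original, cstr_ref r)
{
    return r.c_str() == original; // r only stores the pointer
}
```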
There are semantic differences between a contiguous range of characters and a string, but a contiguous range type would be useful in and of itself.
The semantic difference is a matter of content and its interpretation. You can store non-printable elements in std::string (and it is sometimes more convenient and efficient than std::vector< char >) and printable characters in std::vector< char >. The interface of std::vector< char > and std::string is mostly the same when it comes to string processing (not counting std::string members that can be replaced with free algorithms). The same applies to string_ref and contiguous_range< const char* >, the only notable difference being the construction from const char*.
I've never used std::string for non-string character storage. I use std::vector<char>. I realize that precludes any SBO opportunity, but I'd use another, non-string type in that case. Like Daniel, I see string processing as special. Maybe I'm just stuck in my old ways.

___
Rob