[string_ref] Feature requests

Hi,
I've been trying to convert quickbook to use boost::string_ref, and there are a few missing features that might be useful:
Construction from a pair of iterators, either std::basic_string::const_iterator or boost::basic_string_ref::const_iterator. These can of course be the same type, which is annoying but not too hard to work around.
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Support for comparisons with std::basic_string. The comparison operators are templates so they aren't called with an implicit conversion. Could use a template parameter for the type so that it picks up anything which is (implicitly?) convertible to boost::basic_string_ref, maybe using SFINAE.
I didn't need it, but it could probably also do with a 'hash_value' function for boost::hash support. This is potentially error prone if the original string is modified or destroyed, but there are some valid use cases.
thanks,
Daniel

On Sat, Dec 29, 2012 at 3:33 PM, Daniel James <dnljms@gmail.com> wrote:
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Wouldn't non-member to_string make more sense? http://en.cppreference.com/w/cpp/string/basic_string/to_string For a member function str() seems more consistent with existing names. -- Olaf

On Dec 29, 2012, at 6:33 AM, Daniel James <dnljms@gmail.com> wrote:
Hi,
I've been trying to convert quickbook to use boost::string_ref,
Great!
and there are a few missing features that might be useful:
Construction from a pair of iterators, either std::basic_string::const_iterator, or boost::basic_string_ref::const_iterator. Which of course can be the same type, which is annoying but not too hard to work around.
Hrm. Given a pair of iterators (first, last), how do you ensure that the range [first, last) is contiguous in memory? If you (the caller) know that, then you can construct thus: string_ref (&*first, std::distance(first, last)) but how would a string_ref constructor know that?
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Makes all kinds of sense. However, this is problematic in C++03.
I tried:
    template<typename Allocator = std::allocator<charT> >
    std::basic_string<charT, traits, Allocator> to_string () const
        { return std::basic_string<charT, traits, Allocator> ( ptr_, len_ ); }
But that doesn't work, because C++03 doesn't allow default template arguments in function templates.
I tried:
    template<typename Allocator>
    std::basic_string<charT, traits, Allocator> to_string () const
        { return std::basic_string<charT, traits, Allocator> ( ptr_, len_ ); }
and that works, but the user has to write:
    str2 = sr1.to_string<std::allocator<char> > ();
and that's just _ugly_.
I could just leave the allocator parameter out altogether:
    std::basic_string<charT, traits> to_string () const
        { return std::basic_string<charT, traits> ( ptr_, len_ ); }
And that works, but you lose the ability to specify an allocator.
Another option is to just define a "convert" function:
    template <typename Result>
    Result convert () const { return Result ( ptr_, len_ ); }
and people could write:
    str2 = sr1.convert<std::string> ();
which is ugly too - but very clear.
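For readers following along, here is a minimal sketch of the last two options from the message above. The class mini_string_ref is a hypothetical stand-in written for this illustration; it is not the Boost class, and only the member names ptr_ and len_ are taken from the message. As a side note, C++11 lifted the restriction on default template arguments for function templates, so the first attempt is only a problem for C++03 code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for boost::basic_string_ref, for illustration only.
template <typename charT, typename traits = std::char_traits<charT> >
class mini_string_ref {
public:
    mini_string_ref(const charT* ptr, std::size_t len) : ptr_(ptr), len_(len) {
        assert(ptr != 0 || len == 0);  // a null pointer is only valid for an empty range
    }

    // No allocator parameter: valid C++03, always uses the default allocator.
    std::basic_string<charT, traits> to_string() const {
        return std::basic_string<charT, traits>(ptr_, len_);
    }

    // The caller names the result type, which may carry any allocator.
    template <typename Result>
    Result convert() const {
        return Result(ptr_, len_);
    }

private:
    const charT* ptr_;
    std::size_t len_;
};
```

With this sketch, sr.to_string() covers the common case and sr.convert<std::string>() spells the target type out at the call site.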
Support for comparisons with std::basic_string. The comparison operators are templates so they aren't called with an implicit conversion. Could use a template parameter for the type so that it picks up anything which is (implicitly?) convertible to boost::basic_string_ref, maybe using SFINAE.
I like this idea. I'll have to think about the best way to do it. What happens if the basic_string has a different traits class than the string_ref? How do they compare then? [ Of course, that could happen with two different instantiations of std::basic_string today, so that's probably a solved problem. ]
I didn't need it, but it could probably also do with a 'hash_value' function for boost::hash support. This is potentially error prone if the original string is modified or destroyed, but there are some valid use cases.
Yeah, hashing support and conversion to numbers are in the header file, but commented out for now.
-- Marshall
Marshall Clow Idio Software <mailto:mclow.lists@gmail.com>
A.D. 1517: Martin Luther nails his 95 Theses to the church door and is promptly moderated down to (-1, Flamebait). -- Yu Suzuki

On 29 December 2012 16:56, Marshall Clow <mclow.lists@gmail.com> wrote:
On Dec 29, 2012, at 6:33 AM, Daniel James <dnljms@gmail.com> wrote:
and there are a few missing features that might be useful:
Construction from a pair of iterators, either std::basic_string::const_iterator, or boost::basic_string_ref::const_iterator. Which of course can be the same type, which is annoying but not too hard to work around.
Hrm. Given a pair of iterators (first, last), how do you ensure that the range [first, last) is contiguous in memory?
If you (the caller) know that, then you can construct thus: string_ref (&*first, std::distance(first, last))
This is technically more dangerous, as it will work for std::list<char>. There's also the risk of using different variables for 'first' (and yes, I did exactly that). If I wasn't clear enough, I'm not suggesting supporting arbitrary iterators. Just std::basic_string::const_iterator and boost::basic_string_ref::const_iterator. I suppose I should have included raw char pointers as well.
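A minimal sketch of the restricted constructor being asked for, assuming a hypothetical mini_string_ref class (not the Boost one). It accepts only std::string::const_iterator, so a pair of std::list<char> iterators fails to compile instead of silently producing a bad reference. On an implementation where std::string::const_iterator is a plain const char*, this overload sits next to the (pointer, length) one and needs extra care, which is the annoyance mentioned earlier in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for boost::string_ref, for illustration only.
class mini_string_ref {
public:
    mini_string_ref(const char* ptr, std::size_t len) : ptr_(ptr), len_(len) {}

    // Accepts only std::string iterators. The caller is still responsible
    // for passing two iterators into the same string, first not after last.
    mini_string_ref(std::string::const_iterator first,
                    std::string::const_iterator last)
        : ptr_(first == last ? 0 : &*first),  // never dereference an end iterator
          len_(static_cast<std::size_t>(last - first)) {
        assert(first <= last);
    }

    const char* data() const { return ptr_; }
    std::size_t size() const { return len_; }

private:
    const char* ptr_;
    std::size_t len_;
};
```

Because both arguments have the same named type, there is no separate &*first and std::distance(first, last) expression in which to mix up two different variables.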
but how would a string_ref constructor know that?
How does your existing constructor know that the length doesn't overflow?
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Makes all kinds of sense. However, this is problematic in C++03.
I tried: template<typename Allocator = std::allocator <charT> > std::basic_string<charT, traits, Allocator> to_string () const { return std::basic_string<charT, traits, Allocator> ( ptr_, len_ ); }
But that doesn't work, because you can't have default template arguments in functions.
I wouldn't expect support for arbitrary allocators. Olaf pointed out std::to_string which doesn't seem to support such things. But if you must have support, how about returning a type which is implicitly convertible to std::basic_string?
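One way the "type which is implicitly convertible to std::basic_string" idea could look, as a sketch only. Both string_proxy and mini_string_ref are hypothetical names invented for this illustration. The proxy has a templated conversion operator, so the allocator is deduced from the string type being initialized and the caller never has to spell it out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical proxy returned by to_string(); converts to any
// std::basic_string with matching charT and traits.
template <typename charT, typename traits>
class string_proxy {
public:
    string_proxy(const charT* ptr, std::size_t len) : ptr_(ptr), len_(len) {}

    // Allocator is deduced from the target type of the conversion.
    template <typename Allocator>
    operator std::basic_string<charT, traits, Allocator>() const {
        return std::basic_string<charT, traits, Allocator>(ptr_, len_);
    }

private:
    const charT* ptr_;
    std::size_t len_;
};

// Hypothetical stand-in for boost::basic_string_ref.
template <typename charT, typename traits = std::char_traits<charT> >
class mini_string_ref {
public:
    mini_string_ref(const charT* ptr, std::size_t len) : ptr_(ptr), len_(len) {
        assert(ptr != 0 || len == 0);
    }

    string_proxy<charT, traits> to_string() const {
        return string_proxy<charT, traits>(ptr_, len_);
    }

private:
    const charT* ptr_;
    std::size_t len_;
};
```

The copy happens only when the proxy is converted, for example in std::string copy = sr.to_string();. The trade-off is that the proxy still points at the original characters, so it should be converted right away rather than stored.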
Support for comparisons with std::basic_string. The comparison operators are templates so they aren't called with an implicit conversion. Could use a template parameter for the type so that it picks up anything which is (implicitly?) convertible to boost::basic_string_ref, maybe using SFINAE.
I like this idea. I'll have to think about the best way to do it.
What happens if the basic_string has a different traits class than the string_ref? How do they compare then? [ Of course, that could happen with two different instantiations of std::basic_string today, so that's probably a solved problem. ]
I wouldn't support that. You can't compare strings with different char_traits, and such strings aren't convertible to boost::basic_string_ref.
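A sketch of the SFINAE approach under discussion, assuming a hypothetical basic_mini_ref class and a hypothetical helper trait converts_to. It uses std::enable_if and std::is_convertible from C++11 <type_traits>; boost::enable_if and boost::is_convertible could play the same role in C++03. A string with a different char_traits is not convertible to the reference type, so the extra overloads simply drop out for it, as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for boost::basic_string_ref.
template <typename charT, typename traits = std::char_traits<charT> >
class basic_mini_ref {
public:
    basic_mini_ref(const charT* ptr, std::size_t len) : ptr_(ptr), len_(len) {
        assert(ptr != 0 || len == 0);
    }
    basic_mini_ref(const std::basic_string<charT, traits>& s)
        : ptr_(s.data()), len_(s.size()) {}

    const charT* data() const { return ptr_; }
    std::size_t size() const { return len_; }

private:
    const charT* ptr_;
    std::size_t len_;
};

// Template operator: deduction requires both sides to already be references.
template <typename charT, typename traits>
bool operator==(basic_mini_ref<charT, traits> lhs,
                basic_mini_ref<charT, traits> rhs) {
    return lhs.size() == rhs.size() &&
           traits::compare(lhs.data(), rhs.data(), lhs.size()) == 0;
}

// True when T converts to Ref but is not Ref itself.
template <typename T, typename Ref>
struct converts_to
    : std::integral_constant<bool,
          std::is_convertible<const T&, Ref>::value &&
          !std::is_same<T, Ref>::value> {};

// Mixed comparisons: enabled only for types convertible to the reference.
template <typename charT, typename traits, typename T>
typename std::enable_if<
    converts_to<T, basic_mini_ref<charT, traits> >::value, bool>::type
operator==(basic_mini_ref<charT, traits> lhs, const T& rhs) {
    return lhs == basic_mini_ref<charT, traits>(rhs);
}

template <typename charT, typename traits, typename T>
typename std::enable_if<
    converts_to<T, basic_mini_ref<charT, traits> >::value, bool>::type
operator==(const T& lhs, basic_mini_ref<charT, traits> rhs) {
    return basic_mini_ref<charT, traits>(lhs) == rhs;
}
```

The other relational operators would follow the same pattern.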

On Sat, Dec 29, 2012 at 6:33 PM, Daniel James <dnljms@gmail.com> wrote:
If I wasn't clear enough, I'm not suggesting supporting arbitrary iterators. Just std::basic_string::const_iterator and boost::basic_string_ref::const_iterator. I suppose I should have included raw char pointers as well.
Can't you use the (const char*, size_t) overload with (s.data(), s.size())? Or, what's the exact use case? -- Olaf

On 29 December 2012 18:08, Olaf van der Spek <ml@vdspek.org> wrote:
On Sat, Dec 29, 2012 at 6:33 PM, Daniel James <dnljms@gmail.com> wrote:
If I wasn't clear enough, I'm not suggesting supporting arbitrary iterators. Just std::basic_string::const_iterator and boost::basic_string_ref::const_iterator. I suppose I should have included raw char pointers as well.
Can't you use the (const char*, size_t) overload with (s.data(), s.size())? Or, what's the exact use case?
In quickbook I use string_ref to pass around substrings of quickbook and xml source code. I use iterators everywhere. Using iterators like this is standard practice in STL based code.

On Dec 29, 2012, at 9:33 AM, Daniel James <dnljms@gmail.com> wrote:
On 29 December 2012 16:56, Marshall Clow <mclow.lists@gmail.com> wrote:
On Dec 29, 2012, at 6:33 AM, Daniel James <dnljms@gmail.com> wrote:
and there are a few missing features that might be useful:
Construction from a pair of iterators, either std::basic_string::const_iterator, or boost::basic_string_ref::const_iterator. Which of course can be the same type, which is annoying but not too hard to work around.
Hrm. Given a pair of iterators (first, last), how do you ensure that the range [first, last) is contiguous in memory?
If you (the caller) know that, then you can construct thus: string_ref (&*first, std::distance(first, last))
This is technically more dangerous, as it will work for std::list<char>. There's also the risk of using different variables for 'first' (and yes, I did exactly that).
If I wasn't clear enough, I'm not suggesting supporting arbitrary iterators. Just std::basic_string::const_iterator and boost::basic_string_ref::const_iterator. I suppose I should have included raw char pointers as well.
but how would a string_ref constructor know that?
How does your existing constructor know that the length doesn't overflow?
I think we're talking about different things here. I'm talking about how a string_ref constructor would know that the range defined by two iterators is contiguous.
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Makes all kinds of sense. However, this is problematic in C++03.
I tried: template<typename Allocator = std::allocator <charT> > std::basic_string<charT, traits, Allocator> to_string () const { return std::basic_string<charT, traits, Allocator> ( ptr_, len_ ); }
But that doesn't work, because you can't have default template arguments in functions.
I wouldn't expect support for arbitrary allocators. Olaf pointed out std::to_string which doesn't seem to support such things.
I think that std::to_string is just dumb. Not the idea (which I like) but the limited implementation. The fact that std::to_wstring exists (for example) seems just wrong to me.
But if you must have support, how about returning a type which is implicitly convertible to std::basic_string?
Hrm. (Not saying no, just thinking)
Support for comparisons with std::basic_string. The comparison operators are templates so they aren't called with an implicit conversion. Could use a template parameter for the type so that it picks up anything which is (implicitly?) convertible to boost::basic_string_ref, maybe using SFINAE.
I like this idea. I'll have to think about the best way to do it.
On the other hand, you can write:
    string_ref sr1;
    string str1;
    if ( sr1 == string_ref (str1))
today - for any of the relational operators. How important is the notational convenience of being able to write:
    if ( sr1 == str1 )
and
    if ( str1 == sr1 )
Just wondering.
-- Marshall

On 29 December 2012 18:37, Marshall Clow <mclow.lists@gmail.com> wrote:
On Dec 29, 2012, at 9:33 AM, Daniel James <dnljms@gmail.com> wrote:
On 29 December 2012 16:56, Marshall Clow <mclow.lists@gmail.com> wrote:
but how would a string_ref constructor know that?
How does your existing constructor know that the length doesn't overflow?
I think we're talking about different things here. I'm talking about how a string_ref constructor would know that the range defined by two iterators is contiguous
The answer to both questions is, "It doesn't. It's the programmer's responsibility". If you use a pair of string iterators that aren't contiguous anywhere you will have problems. I don't see why this is a special case.
How important is the notational convenience of being able to write:
if ( sr1 == str1 ) and if ( str1 == sr1 )
Just wondering.
The code in question is: boost::find(id_attributes, name) Where id_attributes is a vector of strings, name is a string_ref, boost::find is the range version of std::find.
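To make the use case concrete, here is a sketch of a search like the one above compiling without a manual conversion. It uses a different technique from the SFINAE idea discussed earlier in the thread: a non-template friend operator defined inside the class, which is found by argument-dependent lookup and accepts implicit conversions on either side. basic_mini_ref, mini_ref and has_id_attribute are hypothetical names, and std::find stands in for boost::find to keep the example free of Boost headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for boost::basic_string_ref.
template <typename charT, typename traits = std::char_traits<charT> >
class basic_mini_ref {
public:
    basic_mini_ref(const charT* ptr, std::size_t len) : ptr_(ptr), len_(len) {
        assert(ptr != 0 || len == 0);
    }
    basic_mini_ref(const std::basic_string<charT, traits>& s)
        : ptr_(s.data()), len_(s.size()) {}

    // Non-template friend: no deduction is involved, so a std::string
    // operand is implicitly converted to basic_mini_ref.
    friend bool operator==(basic_mini_ref lhs, basic_mini_ref rhs) {
        return lhs.len_ == rhs.len_ &&
               traits::compare(lhs.ptr_, rhs.ptr_, lhs.len_) == 0;
    }

private:
    const charT* ptr_;
    std::size_t len_;
};

typedef basic_mini_ref<char> mini_ref;

// std::find evaluates (*it == name), i.e. std::string == mini_ref.
inline bool has_id_attribute(const std::vector<std::string>& id_attributes,
                             mini_ref name) {
    return std::find(id_attributes.begin(), id_attributes.end(), name) !=
           id_attributes.end();
}
```

Each comparison builds a temporary reference to a vector element, which copies no characters.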

On 29/12/12 15:33, Daniel James wrote:
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Why not make string_ref implicitly convertible when in C++03?

On Sun, Dec 30, 2012 at 10:37 AM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 29/12/12 15:33, Daniel James wrote:
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Why not make string_ref implicitly convertible when in C++03?
And why not do the same in C++11? -- Olaf

On 30/12/12 13:11, Olaf van der Spek wrote:
On Sun, Dec 30, 2012 at 10:37 AM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 29/12/12 15:33, Daniel James wrote:
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Why not make string_ref implicitly convertible when in C++03?
And why not do the same in C++11?
Because that's not what the specification says. I suppose the reason is that a conversion to string is costly, and that you usually want to avoid it.
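A small sketch of the difference being debated, assuming a hypothetical mini_string_ref class and a C++11 compiler. With an explicit conversion operator the copy has to be requested visibly, through a cast or direct-initialization, while a named member such as to_string is the spelling that also works in C++03.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for boost::string_ref, for illustration only.
class mini_string_ref {
public:
    mini_string_ref(const char* ptr, std::size_t len) : ptr_(ptr), len_(len) {
        assert(ptr != 0 || len == 0);
    }

    // C++11: the (costly) copy must be asked for with a cast or
    // direct-initialization; it is never chosen implicitly.
    explicit operator std::string() const { return std::string(ptr_, len_); }

    // Portable alternative that also compiles as C++03.
    std::string to_string() const { return std::string(ptr_, len_); }

private:
    const char* ptr_;
    std::size_t len_;
};

// The explicit operator does not take part in implicit conversions, so
// passing a mini_string_ref where a std::string is expected will not compile.
static_assert(!std::is_convertible<mini_string_ref, std::string>::value,
              "conversion to std::string must be requested explicitly");
```

Dropping the explicit keyword would flip the static_assert and let every such call site allocate and copy silently, which is the cost concern raised in the replies.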

On Dec 30, 2012, at 1:37 AM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 29/12/12 15:33, Daniel James wrote:
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Why not make string_ref implicitly convertible when in C++03?
Because it's an expensive conversion. -- Marshall

On 31/12/12 00:33, Marshall Clow wrote:
On Dec 30, 2012, at 1:37 AM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 29/12/12 15:33, Daniel James wrote:
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Why not make string_ref implicitly convertible when in C++03?
Because it's an expensive conversion.
But in C++03, making it implicitly convertible is the closest thing there is to the explicit conversion available in C++11. It seems much cleaner than polluting the interface.

On Mon, Dec 31, 2012 at 12:33 AM, Marshall Clow <mclow.lists@gmail.com> wrote:
On Dec 30, 2012, at 1:37 AM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 29/12/12 15:33, Daniel James wrote:
Easy conversion to std::basic_string. There's a C++11 explicit string operator, but that isn't much good for portable code. Something like a 'string' or 'to_string' member would do the trick.
Why not make string_ref implicitly convertible when in C++03?
Because it's an expensive conversion.
So is construction from a literal (AFAIK), which isn't explicit either. Explicit is in the proposal though. -- Olaf
Participants (4): Daniel James, Marshall Clow, Mathias Gaunard, Olaf van der Spek