[boost] [c++TR2] N3334, Proposing array_ref<T> and string_ref

30 Jan 2012

On Sat, Jan 28, 2012 at 8:12 PM, Mathias Gaunard
<mathias.gaunard@ens-lyon.org> wrote:
...
On 01/28/2012 05:46 PM, Beman Dawes wrote:
...
Beman.github.com/string-interoperability/interop_white_paper.html
describes Boost components intended to ease string interoperability in
general and Unicode string interoperability in particular.
These proposals are the Boost version of the TR2 proposals made in
N3336, Adapting Standard Library Strings and I/O to a Unicode World.
See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3336.html.
I'm very interested in hearing comments about either the Boost or the
TR2 proposal. Are these useful additions? Is there a better way to
achieve the same easy interoperability goals?

I think you should consider the points being made in N3334.
See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3334.html

While this proposal isn't from Boost, it impacts interests of Boost
developers enough that I think it is worth discussing here as a
separate topic.

Mathias continues:
...
While that proposal is in my opinion not good enough, it raises an important
issue that is often present with std::string-based or similar designs.
A function that takes a std::string, or a boost::filesystem::path for that
matter, necessarily causes the [caller] to copy the data into a heap-allocated
buffer, even if there is no need to.

Some std library string implementations avoid the heap allocation for
small strings, but still there is an unnecessary copy happening even
in those implementations. Your point is well taken and I've often
worried about it with boost::filesystem::path.
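
To make the copy under discussion concrete, here is a minimal C++ sketch. The
type string_ref_sketch and both function names are hypothetical, invented for
illustration; they are not the interface proposed in N3334.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a non-owning string reference (pointer + length).
// This is an illustration only, not the class proposed in N3334.
struct string_ref_sketch {
    const char* data;
    std::size_t size;

    string_ref_sketch(const char* s) : data(s), size(std::strlen(s)) {}
    string_ref_sketch(const std::string& s) : data(s.data()), size(s.size()) {}
    string_ref_sketch(const char* s, std::size_t n) : data(s), size(n) {
        assert(s != nullptr || n == 0);
    }
};

// Owning interface: a caller holding a literal or a raw buffer must first
// construct a std::string, which copies the characters.
std::size_t count_slashes_owning(const std::string& path) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < path.size(); ++i)
        if (path[i] == '/') ++n;
    return n;
}

// Non-owning interface: only a pointer and a length are passed, so no
// character data is copied, whatever the caller's storage is.
std::size_t count_slashes_ref(string_ref_sketch path) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < path.size; ++i)
        if (path.data[i] == '/') ++n;
    return n;
}
```

Calling count_slashes_owning("usr/local/lib") builds a temporary std::string
before the function body runs; count_slashes_ref("usr/local/lib") only
measures the literal's length. Both return the same count.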
...
Use of the range concept would solve that issue, but then that requires
making the function a template. A type-erased range would be possible, but
that has significant performance overhead.
A string_ref or path_ref is maybe the lesser evil.
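
The trade-off between a range-based interface and a reference type can be
sketched as follows. All names here (count_dots_range, count_dots_ref,
path_ref_sketch) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>

// Range-style interface: accepts any iterator pair over characters, but it
// has to be a template, so its definition must be visible to every caller.
template <class Iterator>
std::size_t count_dots_range(Iterator first, Iterator last) {
    std::size_t n = 0;
    for (; first != last; ++first)
        if (*first == '.') ++n;
    return n;
}

// Hypothetical non-owning reference to contiguous characters.
struct path_ref_sketch {
    const char* data;
    std::size_t size;
};

// Reference-style interface: an ordinary function that could be declared in
// a header and defined in a separate source file. The cost is that it only
// accepts contiguous storage.
std::size_t count_dots_ref(path_ref_sketch p) {
    assert(p.data != nullptr || p.size == 0);
    std::size_t n = 0;
    for (std::size_t i = 0; i < p.size; ++i)
        if (p.data[i] == '.') ++n;
    return n;
}
```

The template version works with a std::list<char> as well as a std::string;
the reference version gives up that generality in exchange for being a
plain, non-template function.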
One of my blink reactions is that array_ref<T> and
basic_string_ref<charT, traits> are range generators and I was a bit
surprised to see the implementation was a pointer and length rather
than two pointers. Or better yet, two iterators or an explicit range
component. With iterators, a basic_string_ref could do encoding
conversions on-the-fly without need of temporary strings. But I have
no idea if that is workable or actually is better.
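
For comparison, the two representations mentioned above can be sketched side
by side. Both structs are hypothetical; neither is taken from N3334. The
sketch covers only the pointer-and-length versus two-pointer layouts, not
on-the-fly encoding conversion.

```cpp
#include <cassert>
#include <cstddef>

// Representation 1: pointer and length.
struct ref_ptr_len {
    const char* ptr;
    std::size_t len;

    const char* begin() const { return ptr; }
    const char* end() const { return ptr + len; }
    std::size_t size() const { return len; }
};

// Representation 2: two pointers (a half-open range).
struct ref_two_ptrs {
    const char* first;
    const char* last;

    const char* begin() const { return first; }
    const char* end() const { return last; }
    std::size_t size() const {
        assert(first <= last);
        return static_cast<std::size_t>(last - first);
    }
};

// Code written against begin()/end() does not care which layout is used.
template <class Range>
std::size_t count_upper(const Range& r) {
    std::size_t n = 0;
    for (const char* p = r.begin(); p != r.end(); ++p)
        if (*p >= 'A' && *p <= 'Z') ++n;
    return n;
}
```

With pointer and length, size() is a stored value and end() is computed; with
two pointers it is the other way around. For contiguous character data the
two layouts carry the same information.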

What do other Boosters think?

--Beman
