[c++TR2] N3334, Proposing array_ref<T> and string_ref

On Sat, Jan 28, 2012 at 8:12 PM, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3334.html While this proposal isn't from Boost, it impacts interests of Boost developers enough that I think it is worth discussing here as a separate topic. Mathias continues:
Some std library string implementations avoid the heap allocation for small strings, but still there is an unnecessary copy happening even in those implementations.
Your point is well taken, and I've often worried about it with boost::filesystem::path.
One of my blink reactions is that array_ref<T> and basic_string_ref<charT, traits> are range generators, and I was a bit surprised to see that the implementation was a pointer and length rather than two pointers or, better yet, two iterators or an explicit range component. With iterators, a basic_string_ref could do encoding conversions on-the-fly without the need for temporary strings. But I have no idea if that is workable or actually is better. What do other Boosters think? --Beman

2012/1/30 Beman Dawes <bdawes@acm.org>
Implementers are free to use two pointers if that's faster on some platform.
I don't see how this could work unless each access to basic_string_ref involves a virtual function call or all functions accepting basic_string_ref must be templates.
    // Writes a string to file. This function doesn't care who owns
    // the string.
    void WriteToFile(basic_string_ref<char, ...> s);
What should go in ... if the string is allowed to do encoding conversion on-the-fly? This is the same problem we have with ranges. If you want to write a function that accepts a range of integers, you either have to implement it as a template or use any_iterator/any_range which could be too inefficient.
Roman Perepelitsa.
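The trade-off described in this message can be sketched as follows. This is an illustration under stated assumptions, not code from N3334: the struct string_ref and the functions WriteTo and WriteRangeTo are hypothetical names. The non-template function fixes one concrete parameter type and could be defined in a separate translation unit, whereas the iterator-generic version has to be a template because every iterator type is distinct.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for basic_string_ref<char>: a non-owning pointer plus length.
struct string_ref {
    const char* data;
    std::size_t size;
    string_ref(const char* p, std::size_t n) : data(p), size(n) {}
    string_ref(const std::string& s) : data(s.data()), size(s.size()) {}
};

// Non-template: the parameter type is fully known, so the definition
// could live in a .cpp file and be compiled once.
void WriteTo(std::ostream& out, string_ref s) {
    out.write(s.data, static_cast<std::streamsize>(s.size));
}

// Iterator-generic: accepts any iterator pair (for example a converting
// or reversing adaptor), but must be a template visible to every caller.
template <class Iter>
void WriteRangeTo(std::ostream& out, Iter first, Iter last) {
    for (; first != last; ++first)
        out.put(*first);
}
```

Here a reverse iterator stands in for "any iterator that is not a plain pointer"; an encoding-converting iterator would face the same constraint.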

On Mon, Jan 30, 2012 at 9:30 AM, Roman Perepelitsa <roman.perepelitsa@gmail.com> wrote:
Sure.
The other alternative is to use the boost::filesystem::path/N3336 approach, which avoids conversion inefficiency if no conversion is necessary, but does come at the cost of creating a temporary when a conversion is required. Anyhow, that's all an aside to the real questions: What are the pros and cons of N3334 in general and basic_string_ref in particular? Thanks, --Beman

On Mon, Jan 30, 2012 at 3:20 PM, Beman Dawes <bdawes@acm.org> wrote:
I think the idea is great. In fact, I've written similar classes: http://code.google.com/p/xbt/source/browse/trunk/xbt/misc/xbt/data_ref.h I've posted about the idea on this list before, but received few responses. The idea is that you've got a non-template function that takes an array. Often the types used are (const void*, size_t) or (const char*, size_t), which is cumbersome. Iterators instead of pointers wouldn't really work. std::string with small string optimizations is sub-optimal if the input is not a std::string, but for example a std::array. N3334 does not really address the (const void*, size_t) case. BTW, isn't there a forum / mailing list to discuss these proposals? Olaf
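To make the "non-template function that takes an array" idea concrete, here is a minimal sketch. It is neither the N3334 class nor the data_ref class linked above; the array_ref shown and the function sum are hypothetical and cover only a few constructors. The point is that one non-template function can accept a std::vector, a std::array, a C array, or a raw (pointer, length) pair without copying.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical, reduced array_ref<T>: a non-owning view of contiguous elements.
template <class T>
class array_ref {
public:
    array_ref(const T* p, std::size_t n) : data_(p), size_(n) {}
    array_ref(const std::vector<T>& v) : data_(v.data()), size_(v.size()) {}
    template <std::size_t N>
    array_ref(const std::array<T, N>& a) : data_(a.data()), size_(N) {}
    template <std::size_t N>
    array_ref(const T (&a)[N]) : data_(a), size_(N) {}

    const T* begin() const { return data_; }
    const T* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }
private:
    const T* data_;
    std::size_t size_;
};

// Non-template function: one compiled definition serves all callers.
long sum(array_ref<int> xs) {
    long total = 0;
    for (const int* p = xs.begin(); p != xs.end(); ++p)
        total += *p;
    return total;
}
```

The (const void*, size_t) case mentioned above is not covered by this sketch either; it would need a separate untyped view.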

On Mon, Jan 30, 2012 at 6:34 PM, Olaf van der Spek <ml@vdspek.org> wrote:
Yes, and they are usually better choices than the Boost list for discussion about committee proposals. https://groups.google.com/forum/#!forum/comp.std.c++ for one, but it hasn't been very active recently. The committee has its own discussion lists, but they are available only to members. I particularly value the feedback from Boost members, and thought there might be a lot of Boost interest in this particular proposal. Thanks, --Beman

Olaf van der Spek wrote:
I wrote and use a string_ref class. I made no attempt to support wchar_t. It provides a constructor for std::vector<char>, besides the constructors given in N3334. I also have a const_substring type for which there's a constructor. The list can grow large, of course, so relying on conversions to string_ref from other string-like types is reasonable. I don't agree with replicating std::string's overly large interface in string_ref. I'd prefer to use free functions to augment the functionality.
N3334 does not really address the (const void*, size_t) case.
They support that for string_ref, but not array_ref. Go figure.
_____
Rob Stewart
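The preference for free functions over a large member interface might look like the sketch below. It is an assumption-laden illustration, not the class described in this message: string_ref here is a hypothetical minimal type, and starts_with, remove_prefix, and to_std_string are hypothetical free functions built only on data() and size().

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal string_ref: only construction, data(), and size().
class string_ref {
public:
    string_ref(const char* p, std::size_t n) : data_(p), size_(n) {}
    string_ref(const std::string& s) : data_(s.data()), size_(s.size()) {}
    string_ref(const std::vector<char>& v) : data_(v.data()), size_(v.size()) {}
    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
private:
    const char* data_;
    std::size_t size_;
};

// Free functions add functionality without growing the class interface.
inline bool starts_with(string_ref s, string_ref prefix) {
    return prefix.size() <= s.size() &&
           std::equal(prefix.data(), prefix.data() + prefix.size(), s.data());
}

// Returns a view with up to n leading characters dropped.
inline string_ref remove_prefix(string_ref s, std::size_t n) {
    n = std::min(n, s.size());
    return string_ref(s.data() + n, s.size() - n);
}

// Copies the viewed characters into an owning std::string.
inline std::string to_std_string(string_ref s) {
    return std::string(s.data(), s.size());
}
```

Because the functions take string_ref by value, they also accept std::string and std::vector<char> arguments through the implicit conversions.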

Feedback on the paper:
1. I support the additions, especially since I have encountered similar concepts implemented independently in multiple places, including in my own code. It seems like a natural compromise between separate compilation and generality.
2. Instead of adding a bunch of implicit constructors and implicit conversion operators to each container with contiguous storage, one can proceed as follows.
   a) std::contiguous_iterator_tag can be added to the standard (see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html for example). It would be a great addition on its own, since generic code can optimize appropriately when ContiguousIterators to is_trivially_copyable<value_type> are used.
   b) Add *one* implicit constructor to array_ref:
        template<class R>
        array_ref(const R& x,
                  typename enable_if<
                      is_base_of<
                          contiguous_iterator_tag,
                          typename iterator_traits<
                              decltype(begin(x))
                          >::iterator_category
                      >::value,
                      int
                  >::type = 0);
        // use &*begin(x), (end(x) - begin(x)) to initialize.
On Mon, Jan 30, 2012 at 16:20, Beman Dawes <bdawes@acm.org> wrote:
-- Yakov
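The single constrained constructor suggested in this message can be approximated as follows. This is a sketch under a loud assumption: instead of the proposed contiguous_iterator_tag, it uses a hypothetical trait, is_contiguous_range, specialized by hand for a few standard containers. The array_ref class is likewise a reduced, hypothetical version.

```cpp
#include <array>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the iterator-tag check: opt in known contiguous containers.
template <class R> struct is_contiguous_range : std::false_type {};
template <class T, class A>
struct is_contiguous_range<std::vector<T, A> > : std::true_type {};
template <class A>
struct is_contiguous_range<std::vector<bool, A> > : std::false_type {};  // bit-packed, not contiguous
template <class T, std::size_t N>
struct is_contiguous_range<std::array<T, N> > : std::true_type {};
template <class C, class Tr, class A>
struct is_contiguous_range<std::basic_string<C, Tr, A> > : std::true_type {};

template <class T>
class array_ref {
public:
    array_ref(const T* p, std::size_t n) : data_(p), size_(n) {}

    // The *one* implicit constructor: enabled only for contiguous ranges of T.
    template <class R>
    array_ref(const R& x,
              typename std::enable_if<
                  is_contiguous_range<R>::value &&
                  std::is_same<typename R::value_type, T>::value,
                  int>::type = 0)
        : data_(x.empty() ? nullptr : &*std::begin(x)),  // avoid dereferencing an empty range
          size_(static_cast<std::size_t>(std::end(x) - std::begin(x))) {}

    const T& operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const { return size_; }
private:
    const T* data_;
    std::size_t size_;
};
```

With this constraint a std::list<int> is rejected at overload resolution rather than producing an error inside the constructor body.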
participants (5)
- Beman Dawes
- Olaf van der Spek
- Roman Perepelitsa
- Stewart, Robert
- Yakov Galka