
16.11.2012, 14:45, "Olaf van der Spek" <ml@vdspek.org>:
On Fri, Nov 16, 2012 at 11:31 AM, Yanchenko Maxim <maximyanchenko@yandex.ru> wrote:
Not only subscripted access. Taking a subrange also requires knowing size. Copying from/to (read memcpy) - same. Filling (read memset) - same. Comparing (read memcmp) - same.
Those are C-style constructs. The C++-style equivalents are iterator-based.
Those are high-performance constructs. We can only pray that a compiler will be smart enough to convert our iterator-based code to memcpy/memcmp/memset, and in my experience compilers are not nearly that smart once the code goes slightly beyond the trivial cases. (char_range is an optimization technique, so we aim for maximum speed. If you don't need maximum speed, you'd be happy with simple and safe std::string copies.)
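To make the point about known sizes concrete, here is a minimal sketch. The struct layout and the helper names copy_to, equal and fill are assumptions made for illustration, not the interface discussed in this thread. Once the size is stored, each operation is a single call into the C library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical minimal char_range: a non-owning (pointer, size) pair.
struct char_range {
    const char* data;
    std::size_t size;
};

// Copy the range into dst. The caller guarantees room for r.size bytes
// and that the two buffers do not overlap.
inline void copy_to(char_range r, char* dst) {
    assert(dst != nullptr || r.size == 0);
    if (r.size != 0) std::memcpy(dst, r.data, r.size);
}

// Equality test: sizes are compared first, so memcmp only runs when
// both ranges have the same length.
inline bool equal(char_range a, char_range b) {
    return a.size == b.size &&
           (a.size == 0 || std::memcmp(a.data, b.data, a.size) == 0);
}

// Fill a writable buffer of n bytes with one value.
inline void fill(char* dst, std::size_t n, char value) {
    assert(dst != nullptr || n == 0);
    if (n != 0) std::memset(dst, value, n);
}
```

An iterator-based version would use std::copy, std::equal and std::fill instead. Whether a given compiler lowers those to the same calls is exactly the open question above.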
Suppose you have two pointers, 0xa0 (begin) and 0xb0 (end). The size in bytes is 0x10. Suppose you have one pointer (0xa0) and one size (0x10). Does this point to the same memory?

"this" means 0xa0+0x10? By construction - yes, they do. We trust the caller that he gave us correct size (or correct pair of begin/end pointers from which we compute size in our ctor). std::string makes same assumptions.
Yes if sizeof(value_type) == 1, no otherwise. You can't tell to what memory range it points without knowing sizeof(value_type)
Ah. The first pointer (0xa0) is typed, so we surely know value_type. That's why your 0xb0 - 0xa0 works. They are not void*, they are value_type*.
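A small sketch of why typed pointers settle the question. ptr_range and from_size are hypothetical names used only for this illustration. Subtracting two value_type* already yields a count of elements, and the byte count follows from sizeof.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical generic range over typed pointers.
template <typename T>
struct ptr_range {
    const T* begin;
    const T* end;

    // Number of elements: pointer subtraction already divides by sizeof(T).
    std::size_t size() const {
        assert(begin <= end);
        return static_cast<std::size_t>(end - begin);
    }

    // Number of bytes spanned by the range.
    std::size_t size_bytes() const { return size() * sizeof(T); }
};

// Build the same range from (pointer, element count).
template <typename T>
ptr_range<T> from_size(const T* p, std::size_t n) {
    return ptr_range<T>{p, p + n};
}
```

For char, size() and size_bytes() coincide, which is the sizeof(value_type) == 1 case. For any other type, a begin/end pair and a pointer/count pair still describe the same memory, because both are expressed in elements.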
Shouldn't they be implicit?

Not from std::string. Same argument as for not having implicit conversion to char*.

What argument would that be?

You are giving away a reference to string internals that are subject to change/die anytime.
Isn't that by definition for a reference? It applies to const string& too. I don't think that's a good reason.
It's not a reference to std::string, it's a reference to the *internals* of std::string. Those internals are managed exclusively by std::string. I.e. if you have a reference to a std::string and you expand the string, the reference will continue to work with no problem, while a reference to the internals will be invalidated (the simplest example of references to internals being invalidated is iterators). But when you give away iterators, you do it explicitly via begin/end. In the same way, if you give away a reference to std::string internals, you do it explicitly via data/c_str. This makes potentially dangerous code visible. The same should be done with char_range construction from std::string::data - it should be explicit.
Btw, const references are not that harmless, consider this innocent-looking code:

    struct S
    {
        const std::string& ref_;
        S(const std::string& ref): ref_(ref) {}
    };
    S s1("foo");
    S s2(std::string("bar"));

In both cases ref_ is bound to a temporary std::string that is destroyed at the end of the declaration, so it dangles.
Making it explicit and visible in the caller code ensures that the programmer will take special measures to make sure that the string doesn't change/die while there's a char_range looking into it.
Consider std::vector<char_range>, for example. Back to this example:
    // std::vector<std::string> v; - too slow, upgrading to our new char_range!
    std::vector<char_range> v;
    v.push_back( "foo" );
    v.push_back( std::string("bar") ); // BOOM

When pushing stuff to this vector, we want to be 100% sure that the strings that gave away their char_ranges will outlive the vector and stay unchanged. And for this we need all the help a compiler can give us, namely - to force us to explicitly declare the give-away and fail to compile otherwise. char_range is an efficient but dangerous technique. I'm not a particular fan of Python, but when it comes to ownership management in C++, I prefer their maxim "explicit is better than implicit".
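A sketch of what "fail to compile otherwise" could look like. The class below is a hypothetical stand-in for illustration and not the proposed interface. The constructor from std::string is marked explicit, so push_back of a std::string is rejected and the give-away has to be spelled out at the call site.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical sketch of char_range; not the interface proposed in the thread.
class char_range {
public:
    char_range(const char* p, std::size_t n) : data_(p), size_(n) {}

    // explicit: a std::string never converts to char_range silently.
    explicit char_range(const std::string& s)
        : data_(s.data()), size_(s.size()) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

// The implicit conversion is rejected; the explicit construction is allowed.
static_assert(!std::is_convertible<std::string, char_range>::value,
              "std::string must not convert implicitly");
static_assert(std::is_constructible<char_range, const std::string&>::value,
              "explicit construction stays available");

inline std::size_t demo() {
    std::string owner = "bar";        // must outlive v and stay unchanged
    std::vector<char_range> v;
    // v.push_back(owner);            // would not compile: no implicit conversion
    v.push_back(char_range(owner));   // the give-away is visible here
    assert(v.back().data() == owner.data());
    return v.back().size();
}
```

Note that explicit does not make a temporary safe: char_range(std::string("bar")) still compiles and still dangles. It only guarantees that the dangerous step is written out where a reviewer can see it.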
For the same reason we have explicit char_range::literal and char_range::from_array.

I'd like this to work:
    void f(str_ref);
    f("Olaf");

    f( char_range::literal("Olaf") );
Explicit and with size known at compile-time (so the compiler can utilize this knowledge).
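One way such a literal factory might be written, as a sketch under assumptions: the template-on-array-size signature below is my guess at an implementation, and f is given a return value only so the result can be observed. The array bound N is deduced at compile time, so no strlen is needed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch; only the name char_range::literal comes from the thread.
class char_range {
public:
    constexpr char_range(const char* p, std::size_t n) : data_(p), size_(n) {}

    // N counts the terminating '\0', so the range holds N - 1 characters.
    template <std::size_t N>
    static constexpr char_range literal(const char (&s)[N]) {
        return char_range(s, N - 1);
    }

    constexpr const char* data() const { return data_; }
    constexpr std::size_t size() const { return size_; }

private:
    const char* data_;
    std::size_t size_;
};

// The size is available as a constant expression.
static_assert(char_range::literal("Olaf").size() == 4,
              "size is known at compile time");

// Stand-in for the f from the example above; returns the size it received.
inline std::size_t f(char_range r) {
    assert(r.data() != nullptr);
    return r.size();
}
```

One caveat with this sketch: the template accepts any char array, not only string literals, so a partly filled buffer would be treated as N - 1 characters. That may be the reason for keeping from_array as a separate name.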
Thanks, Maxim