Dear All,

Here is my brief review of Peter Dimov's proposed Boost.Variant2. Sorry that this is not a particularly thorough review: I have read the docs, scanned the code, and followed the discussion on the mailing list; I have not tried to compile or use it.

I have recently been using the original Boost.Variant for the first time. I actually started writing code based on std::variant, but then realised that I don't have support for it on one C++14-only platform, so changed to Boost.Variant. It's unfortunate that Boost.Variant has a somewhat different interface from std::variant, so it would be useful to have a Boost variant that is (more) compatible with std::variant. It would also be useful if it were somewhat backward compatible with the old Boost.Variant. The proposed Boost.Variant2 is compatible with std::variant; could some thin compatibility layer be added to make it possible to port code from Boost.Variant? Specifically, the differences I have noticed are apply_visitor vs. visit and get(V*) vs. get_if.

In my case, my stored types are typically raw pointers or small POD structs; none of these can throw during construction/assignment, so the complicated arguments about exception guarantees do not apply. Personally, for my current application I'd be happy with a simpler variant that requires noexcept types, or disables assignment for types that might throw, or just doesn't offer an exception safety guarantee. Ideally, perhaps a Boost.Variants *library* should provide a selection of different variants with different features?

That leads to the question of naming: we have boost::variant, boost::variant2 and std::variant, and those names tell us nothing about how they differ. I'm unsure how, or if, this can be improved; I'm not suggesting that variant2 should be renamed variant_never_empty_maybe_double_size! But maybe the long name of the library, i.e. what people will see when they look at the list of Boost libraries, could say "Never empty variant" rather than "Variant 2"?

Prior to the review I was familiar with the issue of dynamic allocation, the fact that it is needed in some cases by some variant implementations, and the desire to avoid it. However, I was not aware of the alternative of storing two copies of the union, doubling the size of the variant. Much of my work is with memory-constrained systems, and unknowingly using twice as much memory as expected would be very undesirable. (I noticed Niall Douglas' comment earlier today, "There is nothing wrong, *whatsoever*, with a variant which is sized the largest of its possible states multiplied by two" - I strongly disagree with that.) I'd like to at least get a warning if that were going to happen. A specific concern is that I might accidentally get this fallback due to forgetting to write "noexcept" somewhere (e.g. when using types from older code).

On the subject of size, I note that the index is stored in an int, while virtually all uses would surely need fewer than 256 (or certainly 2^16) alternatives. Of course, changing the index to uint16_t would not actually help on many platforms, due to structure alignment requirements. This makes me wonder about a specialised variant for pointers that uses the spare bits in (most/some) 64-bit pointers to store the variant index. This could usefully be part of a variants *library*.

I read Andrzej Krzemienski's comments about the documentation and largely agree with them - though I would say that this library is close enough to std::variant that perhaps the right thing to do is simply to say "go and read about std::variant and then come back here to see the differences".

Peter posted a link to some benchmark results last week; those numbers are encouraging, and it is also good to know that Peter actually made the effort to measure them! I do wonder how a "trivial" variant supporting only noexcept types would compare, though.

Boost.Variant has something called polymorphic_get. I am unclear whether Boost.Variant2 (and/or std::variant) somehow supports this functionality.

The situation re. expected is unfortunate. I believe the review really should have waited for it to be completed. Peter's comment, that if it is decided that it needs to be reviewed then he would remove it and never resubmit it, was ... surprisingly undiplomatic? I have always found it a bit odd that Boost requires a thorough review of an initial library submission but nothing at all for later additions.

In summary:

- I believe the library should be accepted.
- It would be great if some compatibility features with Boost.Variant could be added, to make it possible for users of that older library to transition more easily. That is not a condition for acceptance.
- I would love to have some way to get a warning or error if the double storage mode had been triggered, or to disable that mode (with an error).
- It would be great to see some alternative variants, with different features / trade-offs, included in the library. Ideally, naming of types and headers should allow for that. I'm not suggesting that Peter should be personally expected to supply these alternatives, but perhaps he would accept offerings from others.
- I think we should allow Peter to "slip in" his expected when it is ready; I say this largely because Peter has a long history with Boost. Future review managers should try to avoid situations like this arising.

Regards, Phil.
Phil Endecott wrote:
The situation re. expected is unfortunate. I believe the review really should have waited for it to be completed. Peter's comment, that if it is decided that it needs to be reviewed then he would remove it and never resubmit it, was ... surprisingly undiplomatic?
To be clear, what I meant was that if the acceptance conditions for `variant` included that `expected` should not be added without a review, then I would not add it without a review, not that I would never resubmit if there was interest.
Thank you, Phil, for your review. Phil Endecott wrote:
- I would love to have some way to get a warning or error if the double storage mode had been triggered, or to disable that mode (with an error).
I was thinking of addressing this by adding either

static constexpr bool variant<T...>::is_single_buffered();

or

static constexpr bool variant<T...>::is_double_buffered();

(not sure which spelling is better), so that you can static_assert that your variants are never double-buffered.
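In the meantime, a user can approximate such a check by hand. The sketch below is my own illustration, not variant2's actual API: it tests the rough condition under which the double-buffered fallback would be avoided (all alternatives nothrow move-constructible) and static_asserts it; the real library's rule may differ in detail.

```cpp
#include <type_traits>

// Hypothetical approximation of the single-buffering condition; the real
// variant2 rule may differ (this trait and its name are invented here).
template<class... T>
inline constexpr bool likely_single_buffered =
    (std::is_nothrow_move_constructible_v<T> && ...);

struct Good { Good(Good&&) noexcept = default; };
struct Bad  { Bad(Bad&&); };  // move ctor not noexcept: would force the fallback

static_assert(likely_single_buffered<int, Good>,
              "no double buffering expected for these alternatives");
static_assert(!likely_single_buffered<int, Bad>,
              "Bad would trigger the double-storage fallback");
```

Placing such an assertion next to each variant typedef catches an accidentally dropped `noexcept` at compile time, which is exactly the failure mode Phil worries about.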
On 8/04/2019 05:03, Peter Dimov wrote:
Phil Endecott wrote:
- I would love to have some way to get a warning or error if the double storage mode had been triggered, or to disable that mode (with an error).
I was thinking of addressing this by adding either
static constexpr bool variant<T...>::is_single_buffered();
or
static constexpr bool variant<T...>::is_double_buffered();
(not sure which spelling is better), so that you can static_assert that your variants are never double-buffered.
Something like this would definitely be nice. Especially when something as simple as accidentally omitting "noexcept" can cause a significant behavioural and storage size change, it's great to be able to tell the compiler that this was unintended. (Of course, as I've said before I would have preferred that this choice was made by providing two different front-end "variant" types -- that way the assert is already inside the library and the user can't forget to make it.)
Gavin Lambert wrote:
Especially when something as simple as accidentally omitting "noexcept" can cause a significant behavioural and storage size change, it's great to be able to tell the compiler that this was unintended.
This is, by the way, already the case with std::vector<X>. It will silently fall back to using the copy constructor if you forget noexcept on the move constructor of X. See f.ex. https://godbolt.org/z/gv1fN5
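Peter's vector point can be demonstrated directly. The following sketch (my own illustration, not from the godbolt link) counts copies during a reallocation: because X's move constructor is not noexcept, vector must copy the existing elements to preserve the strong exception guarantee.

```cpp
#include <utility>
#include <vector>

// X's move constructor is NOT marked noexcept, so vector reallocation
// falls back to the copy constructor (via std::move_if_noexcept).
struct X {
    static int copies;
    X() = default;
    X(const X&) { ++copies; }
    X(X&&) {}  // forgot noexcept
};
int X::copies = 0;

int count_copies_during_growth() {
    X::copies = 0;
    std::vector<X> v;
    v.reserve(1);
    v.emplace_back();  // fits in current capacity: no copy
    v.emplace_back();  // exceeds capacity: existing element is copied, not moved
    return X::copies;
}
```

Adding `noexcept` to `X(X&&)` makes the copy count drop to zero on mainstream implementations, which is the behavioural cliff being discussed.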
On 8/04/2019 13:53, Peter Dimov wrote:
Gavin Lambert wrote:
Especially when something as simple as accidentally omitting "noexcept" can cause a significant behavioural and storage size change, it's great to be able to tell the compiler that this was unintended.
This is, by the way, already the case with std::vector<X>. It will silently fall back to using the copy constructor if you forget noexcept on the move constructor of X. See f.ex. https://godbolt.org/z/gv1fN5
Just because it's in the Standard doesn't mean that (I think) it's a good design choice. ;)
pon., 8 kwi 2019 o 02:56 Gavin Lambert via Boost <boost@lists.boost.org> napisał(a):
On 8/04/2019 05:03, Peter Dimov wrote:
Phil Endecott wrote:
- I would love to have some way to get a warning or error if the double storage mode had been triggered, or to disable that mode (with an error).
I was thinking of addressing this by adding either
static constexpr bool variant<T...>::is_single_buffered();
or
static constexpr bool variant<T...>::is_double_buffered();
(not sure which spelling is better), so that you can static_assert that your variants are never double-buffered.
Something like this would definitely be nice.
Especially when something as simple as accidentally omitting "noexcept" can cause a significant behavioural and storage size change, it's great to be able to tell the compiler that this was unintended.
(Of course, as I've said before I would have preferred that this choice was made by providing two different front-end "variant" types -- that way the assert is already inside the library and the user can't forget to make it.)
Such a thing could be achieved if the behavior of the variant's assignment and emplacement were controlled by a type trait. The default implementation for unfriendly types would be a compile-time error, but users would be able to specialize it for their instantiations of variant in order to assign different semantics: double buffering, or setting the valueless_by_exception state. Regards, &rzej;
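A sketch of what such a customization point might look like (all names here are invented for illustration; this is not a real variant2 facility):

```cpp
// Invented trait: decides how variant<T...> behaves when an alternative's
// move can throw. Left undefined by default, so "unfriendly" packs fail
// to compile unless the user opts in with a specialization.
template<class... T>
struct variant_throw_policy;  // intentionally undefined: compile-time error

struct MayThrow { MayThrow(MayThrow&&); };  // move not noexcept

// A user opts this particular instantiation into double buffering
// (the alternative choice would be allowing valueless_by_exception):
template<>
struct variant_throw_policy<int, MayThrow> {
    static constexpr bool double_buffer = true;
};

static_assert(variant_throw_policy<int, MayThrow>::double_buffer,
              "this instantiation has been opted into double buffering");
```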
Andrzej Krzemienski wrote:
Such thing could be achieved if the behavior of variant's assignment and emplacement was controlled by a type trait. The default implementation for unfriendly types would be compile-time error; but users would be able to specialize it for their instantiations of variant in order to assign different semantics: double buffering, or setting the valueless_by_exception state.
This can't work in general. If you have

template<class... T> class lib1::X { private:
    variant<T...> state_; };

and

template<class... T> class lib2::Y { private:
    variant<T...> state_; };

it's not possible for lib1 and lib2 to choose their preferred variant behaviors independently of one another.
pon., 8 kwi 2019 o 19:55 Peter Dimov via Boost <boost@lists.boost.org> napisał(a):
Andrzej Krzemienski wrote:
Such thing could be achieved if the behavior of variant's assignment and emplacement was controlled by a type trait. The default implementation for unfriendly types would be compile-time error; but users would be able to specialize it for their instantiations of variant in order to assign different semantics: double buffering, or setting the valueless_by_exception state.
This can't work in general. If you have
template<class... T> class lib1::X { private:
variant<T...> state_; };
and
template<class... T> class lib2::Y { private:
variant<T...> state_; };
it's not possible for lib1 and lib2 to choose their preferred variant behaviors independently of one another.
Indeed. Good point. Regards, &rzej;
On Mon, Apr 8, 2019 at 12:55 PM Peter Dimov via Boost <boost@lists.boost.org> wrote:
Andrzej Krzemienski wrote:
Such thing could be achieved if the behavior of variant's assignment and emplacement was controlled by a type trait.
This can't work in general.
One thing you could do is have a magic alternative in the typelist which controls this behavior. monostate is already a little magical in this design, so there is precedent.
-- Nevin ":-)" Liber <nevin@eviloverlord.com> +1-847-691-1404
Much of my work is with memory-constrained systems and unknowingly using twice as much memory as expected would be very undesirable. (I noticed Niall Douglas' comment earlier today "There is nothing wrong, *whatsoever*, with a variant which is sized the largest of its possible states multiplied by two" - I strongly disagree with that.) I'd like to at least get a warning if that were going to happen. A specific concern is that I might accidentally get this fallback due to forgetting to write "noexcept" somewhere (e.g. when using types from older code).
Something which I have found is surprisingly uncommon amongst even expert C++ programmers is the judicious use of static_assert to ensure no surprises down the line when junior programmers go modify something that they don't realise the consequences of. In my own code, you'll find lots of stanzas like this:

```
#ifdef NDEBUG
static_assert(sizeof(X) <= cmax(sizeof(A), sizeof(B), sizeof(C), ...));
static_assert(noexcept(std::declval<A>() = std::move(std::declval<A>())));
#endif
```

... and so on.

Yes, there is a strong argument that these really ought to live in unit tests. However, these are *very* cheap tests, so cheap as to be virtually no cost to sprinkle them all over your headers, right next to the type definitions. And yet they will strongly encourage every junior (or senior!) programmer to never commit accidental "inconsequential" changes which blow out the size, or break nothrow, in a hot code path.

Coming back to my point, there really isn't anything wrong with a variant which defines its size to be twice that of its largest alternative state, if it is *documented* to be so. If the small-size optimisation is extremely important to you, then static_assert it! Write the requirement into your code! Then you are *guaranteed* the requirement is met; else the code won't compile.

Now back to ACCU talk writing! :)

Niall
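A self-contained rendition of that stanza might look like the following sketch. Here `cmax` is assumed to be Niall's own variadic constexpr max helper (reconstructed, since he does not show it), and the types are placeholders:

```cpp
#include <cstddef>
#include <utility>

// Stand-in for Niall's cmax helper: a variadic constexpr maximum.
constexpr std::size_t cmax(std::size_t a) { return a; }
template<class... Rest>
constexpr std::size_t cmax(std::size_t a, Rest... rest) {
    const std::size_t b = cmax(static_cast<std::size_t>(rest)...);
    return a > b ? a : b;
}

struct A { double d; };
struct B { int i[2]; };
union X { A a; B b; };  // placeholder for the type whose footprint we guard

// These fire at compile time the moment a change blows out the size
// or breaks the nothrow guarantee on a hot path.
static_assert(sizeof(X) <= cmax(sizeof(A), sizeof(B)),
              "X has grown beyond its largest member");
static_assert(noexcept(std::declval<A&>() = std::move(std::declval<A&>())),
              "A's move assignment must stay noexcept");
```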
pon., 8 kwi 2019 o 22:42 Niall Douglas via Boost <boost@lists.boost.org> napisał(a):
Much of my work is with memory-constrained systems and unknowingly using twice as much memory as expected would be very undesirable. (I noticed Niall Douglas' comment earlier today "There is nothing wrong, *whatsoever*, with a variant which is sized the largest of its possible states multiplied by two" - I strongly disagree with that.) I'd like to at least get a warning if that were going to happen. A specific concern is that I might accidentally get this fallback due to forgetting to write "noexcept" somewhere (e.g. when using types from older code).
Something which I have found is surprisingly uncommon amongst even expert C++ programmers is the judicious use of static_assert to ensure no surprises down the line when junior programmers go modify something that they don't realise the consequences of.
In my own code, you'll find lots of stanzas like this:
```
#ifdef NDEBUG
static_assert(sizeof(X) <= cmax(sizeof(A), sizeof(B), sizeof(C), ...));
static_assert(noexcept(std::declval<A>() = std::move(std::declval<A>())));
#endif
```
... and so on.
Yes there is a strong argument that these really ought to live in unit tests. However these are *very* cheap tests, so cheap as to be virtually no cost to sprinkle them all over your headers, right next to the type definitions.
And yet they will strongly encourage every junior (or senior!) programmer to never commit accidental "inconsequential" changes which blow out the size, or break nothrow, in a hot code path.
Coming back to my point, there really isn't anything wrong with a variant which defines its size to be twice that of its largest alternative state. If it is *documented* to be so.
Well, the size of the variant is still wrong: we know how to implement a variant that occupies half of this double storage. It is inefficient in terms of space consumption, and the fact that this inefficiency is documented does not change the fact that it is inefficient.

If the small-size optimisation is extremely important to you, then static assert it! Write the requirement into your code! Then you are *guaranteed* the requirement is met. Else the code won't compile.

With this advice, what I get is either a space-efficient type or a compiler error. This is inferior to a solution where I always get a space-efficient type. Regards, &rzej;
Coming back to my point, there really isn't anything wrong with a variant which defines its size to be twice that of its largest alternative state. If it is *documented* to be so.
Well, the size of the variant is wrong: we know how to implement the variant that occupies half of this double storage. It is inefficient in terms of space consumption. The fact that this inefficiency is documented does not change the fact that it is inefficient.

[snip]

With this advice what I get is: either a space-efficient type or a compiler error. This is inferior to solution where I get a space-efficient type always.
I do not currently believe that it is possible to make an alternative to std::variant that is worth the difference from std::variant and is not mostly double-buffered. Or, put another way, I think the primary failing in the design of std::variant is its lack of default-to-double-buffering. Single buffering ought to be an optimisation. Double buffering ought to be the default, with all the many benefits that that design choice brings.

(Don't get me wrong: my opinion applies to a union-storage variant only. I personally think an upper bound of 2x sizeof is an excellent trade-off for all the benefits double buffering brings. I am highly unconvinced that anything less than an upper bound of 2x sizeof is worth the very substantial trade-offs in terms of corner-case surprise. I *personally* think anything else is an over-eager optimisation on any out-of-order CPU made in the past decade.)

Niall
On 4/8/19 1:42 PM, Niall Douglas via Boost wrote:
Something which I have found is surprisingly uncommon amongst even expert C++ programmers is the judicious use of static_assert to ensure no surprises down the line when junior programmers go modify something that they don't realise the consequences of.
In my own code, you'll find lots of stanzas like this:
```
#ifdef NDEBUG
static_assert(sizeof(X) <= cmax(sizeof(A), sizeof(B), sizeof(C), ...));
static_assert(noexcept(std::declval<A>() = std::move(std::declval<A>())));
#endif
```
... and so on.
Why do you bother with the #ifdef NDEBUG ? Robert Ramey
In my own code, you'll find lots of stanzas like this:
```
#ifdef NDEBUG
static_assert(sizeof(X) <= cmax(sizeof(A), sizeof(B), sizeof(C), ...));
static_assert(noexcept(std::declval<A>() = std::move(std::declval<A>())));
#endif
```
... and so on.
Why do you bother with the #ifdef NDEBUG ?
It signifies "this is important for the final release binary" i.e. please ask one of the tech leadership if you break this. It has no technical meaning. But it is *very* effective at worrying people. Niall
On 9/04/2019 11:13, Niall Douglas wrote:
Why do you bother with the #ifdef NDEBUG ?
It signifies "this is important for the final release binary" i.e. please ask one of the tech leadership if you break this.
It has no technical meaning. But it is *very* effective at worrying people.
Its technical meaning is that during development in debug builds (which presumably people should be using most of the time in order to verify runtime asserts and other things like debug iterators and debug heaps), all of these asserts are disabled, so people don't discover that they did something wrong at the time that they broke it, which makes it harder to discover what they broke. So it seems like an odd choice, unless something about a debug build makes the assertion untrue (eg. debug components changing sizeof). Why not use #ifndef JUNIOR_DEVS_DONT_TOUCH or something similarly never defined (and so never disabling the check) but expressing your real intent?
On 4/8/19 6:08 PM, Gavin Lambert via Boost wrote:
On 9/04/2019 11:13, Niall Douglas wrote:
Why do you bother with the #ifdef NDEBUG ?
It signifies "this is important for the final release binary" i.e. please ask one of the tech leadership if you break this.
It has no technical meaning. But it is *very* effective at worrying people.
Its technical meaning is that during development in debug builds (which presumably people should be using most of the time in order to verify runtime asserts and other things like debug iterators and debug heaps), all of these asserts are disabled, so people don't discover that they did something wrong at the time that they broke it, which makes it harder to discover what they broke.

+1
So it seems like an odd choice, unless something about a debug build makes the assertion untrue (eg. debug components changing sizeof).
Why not use #ifndef JUNIOR_DEVS_DONT_TOUCH or something similarly never defined (and so never disabling the check) but expressing your real intent?

-1
First of all, I'm with Niall in using static_assert a lot, but I come at it from a different place. A big concern of mine is the explicit specification of type requirements. This has now been subsumed in C++ as "concepts". Ideally, I like to see the type requirements explicitly listed for each type in the documentation and mirrored in the code. It might seem redundant, but it's not: one is the interface design and the other is the implementation. Using static_assert for this purpose is a perfect match. In my opinion, C++ concepts as specified for C++20 are superfluous.

Short version:

a) I think you're doing it basically right.
b) It's a mistake to characterize this as "protection against other developers". Actually, it could be considered offensive to other developers. We all need to do our part to create a safe space for those other developers.
c) It's better to see it as using the compiler to explicitly enforce type requirements in order to get correct code.

Robert Ramey
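The pattern Robert describes — mirroring documented type requirements in code with static_assert — might look like this minimal sketch (the function and its requirements are invented for illustration):

```cpp
#include <type_traits>

// Documented requirements, mirrored in code: the argument type must be
// copy-constructible, and raw pointers are deliberately rejected.
template<class T>
int byte_size(const T& t) {
    static_assert(std::is_copy_constructible<T>::value,
                  "byte_size<T>: T must be CopyConstructible");
    static_assert(!std::is_pointer<T>::value,
                  "byte_size<T>: raw pointers are not accepted");
    return static_cast<int>(sizeof(t));
}
```

The point is that the failure message names the violated requirement, so a caller who passes an unsuitable type gets the documentation restated by the compiler.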
On 9/04/2019 13:40, Robert Ramey wrote:
Why not use #ifndef JUNIOR_DEVS_DONT_TOUCH or something similarly never defined (and so never disabling the check) but expressing your real intent?

-1
I don't particularly agree with that either, but that's what Niall said was the real reason for the #ifdef. I did paraphrase, of course. FWIW though I don't think it was intended as a disparagement of junior developers in particular. It's easy enough for anyone, no matter how junior or senior, to trip over an assumption in the code -- and in that case it's best if the assumption is declared in a fashion that the compiler can alert you about, so that you can decide if the violation was intentional or not and correct related code accordingly if needed.
a) I think you're doing it basically right. b) It's a mistake to characterize this as "protection against other developers". Actually it could be considered offensive to other developers. We all need to do our part for creating a safe space for those other developers. c) It's better to see it as using the compiler to explicitly enforce type requirements in order to get correct code.
It's better to have the static_asserts without any #if at all. (Unless, as I said, there's a fundamental requirement for it, eg. with types changing size in debug builds.)
On 09/04/2019 03:58, Gavin Lambert via Boost wrote:
On 9/04/2019 13:40, Robert Ramey wrote:
Why not use #ifndef JUNIOR_DEVS_DONT_TOUCH or something similarly never defined (and so never disabling the check) but expressing your real intent?

-1
I don't particularly agree with that either, but that's what Niall said was the real reason for the #ifdef. I did paraphrase, of course.
FWIW though I don't think it was intended as a disparagement of junior developers in particular. It's easy enough for anyone, no matter how junior or senior, to trip over an assumption in the code -- and in that case it's best if the assumption is declared in a fashion that the compiler can alert you about, so that you can decide if the violation was intentional or not and correct related code accordingly if needed.
Indeed. Not long ago I broke one of those NDEBUG static asserts in a codebase I wholly wrote. Even better, around five years ago I had written the comment:

```
// Hard required to prevent segfault
#ifdef NDEBUG
...
```

... just before a list of static asserts in a header file. Now I, for the life of me, could not figure out what I had meant by that. I spent half a day checking and making sure that I could safely flip that static assert, and in fact there were no segfaults at risk there. I had in my head maybe some potential aliasing bug under optimisation, or something.

In that sense, #ifdef NDEBUG is powerful. It says "be very careful messing with these asserts, because they're asserting on production code". Equally, if overused, they can quickly lose their power.
It's better to have the static_asserts without any #if at all. (Unless, as I said, there's a fundamental requirement for it, eg. with types changing size in debug builds.)
For me personally, I may disturb data layout during a set of changes, which will be developed in a debug build. I don't want the static asserts firing during those changes, nor do I want to have to go through and disable them all, lest I forget to re-enable them. I do want them to fire when I think I'm finished and I'm doing final validation in a release build. They usually need adjusting, and sometimes they show I've made a critical mistake.

One of the things I particularly frequently mess up is breaking trivial copyability. Sprinkling static asserts of trivial copyability immediately after a class definition is an *excellent* habit to get into.

Anyway, in general it's a great technique to get in the habit of deploying. You'll keep a low-latency codebase low latency over its lifetime much more easily if you do.

Niall
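The trivial-copyability habit Niall describes is a one-liner placed right after each class definition; a sketch with an invented hot-path type:

```cpp
#include <type_traits>

// A hot-path message type that must stay memcpy-able (placeholder example).
struct Tick {
    long long timestamp;
    double price;
    int size;
};

// Placed immediately after the definition, this fires the moment someone
// adds, say, a std::string member or a user-defined copy constructor.
static_assert(std::is_trivially_copyable<Tick>::value,
              "Tick must remain trivially copyable (hot path uses memcpy)");
```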
participants (7)
- Andrzej Krzemienski
- Gavin Lambert
- Nevin Liber
- Niall Douglas
- Peter Dimov
- Phil Endecott
- Robert Ramey