std::launder and its usage in Boost
Hi,
Initially, I posted this question to the boost-users list, but then I was advised to repost it here.
I wonder, what is the Boost community's opinion of std::launder? In particular, of its necessity when accessing an object that was created via placement-new in an aligned_storage. As I can see, it's used by Boost.Beast in its implementation of the variant type, but not in other parts of Boost. The reason I'm asking this is that I'm working on a C++14 code-base that uses Boost.Variant and Boost.Optional extensively (as well as other parts of Boost that use Variant and Optional internally). Now we're trying to switch to C++17 at least, and I worry whether it could potentially break things.
To put things into context, C++ standard states in the [basic.life] section that if you had an object and then created a new one in the same location, you can use the pointer to the old object to refer to the new one only if (among other things) "the new object is of the same type as the original object (ignoring the top-level cv-qualifiers)".
This means, as I understand it, that it's technically illegal to reinterpret_cast the pointer to aligned_storage to the pointer to the actual type of the object created via placement-new, because placement-new has already ended the lifetime of the aligned_storage object. So, implementations of Boost.Optional and Boost.Variant are illegal then (?). But it's been like that since C++03 and Boost worked fine all that time, so it looks like this UB existed only "on paper".
But C++17 then added "Note: If these conditions are not met, a pointer to the new object can be obtained from a pointer that represents the address of its storage by calling std::launder". So now the language has the ability to deal with that UB, and a question arises, is it possible that compilers could start to use the UB to perform additional optimizations and make it a real UB? Also, I've seen a couple of times on stackoverflow.com people saying that it's actually fine to reinterpret_cast the storage in C++14, but in C++17 it's not (they didn't explain why though).
So, can switching from -std=c++14 to -std=c++17 be a breaking change when using Boost? The fact that Boost.Variant and Boost.Optional don't use std::launder - is it an oversight or a conscious decision?
Regards, Mikhail.
On 2/18/22 13:33, Mikhail Kremniov via Boost wrote:
Hi,
Initially, I posted this question to the boost-users list, but then I was advised to repost it here.
I wonder, what is the Boost community's opinion of std::launder? In particular, of its necessity when accessing an object that was created via placement-new in an aligned_storage. As I can see, it's used by Boost.Beast in its implementation of the variant type, but not in other parts of Boost. The reason I'm asking this is that I'm working on a C++14 code-base that uses Boost.Variant and Boost.Optional extensively (as well as other parts of Boost that use Variant and Optional internally). Now we're trying to switch to C++17 at least, and I worry whether it could potentially break things.
To put things into context, C++ standard states in the [basic.life] section that if you had an object and then created a new one in the same location, you can use the pointer to the old object to refer to the new one only if (among other things) "the new object is of the same type as the original object (ignoring the top-level cv-qualifiers)".
This means, as I understand it, that it's technically illegal to reinterpret_cast the pointer to aligned_storage to the pointer to the actual type of the object created via placement-new, because placement-new has already ended the lifetime of the aligned_storage object. So, implementations of Boost.Optional and Boost.Variant are illegal then (?). But it's been like that since C++03 and Boost worked fine all that time, so it looks like this UB existed only "on paper".
Casting a pointer is not illegal. Accessing the pointed object is, if the object does not actually exist at the pointed location. That is, if at the aligned_storage location an object of type T was constructed (through any means described by the standard), you can cast the pointer to aligned_storage to a pointer to T and dereference it, and this is ok. If such an object does not exist, then dereferencing the pointer is UB.
But C++17 then added "Note: If these conditions are not met, a pointer to the new object can be obtained from a pointer that represents the address of its storage by calling std::launder". So now the language has the ability to deal with that UB, and a question arises, is it possible that compilers could start to use the UB to perform additional optimizations and make it a real UB? Also, I've seen a couple of times on stackoverflow.com people saying that it's actually fine to reinterpret_cast the storage in C++14, but in C++17 it's not (they didn't explain why though).
What std::launder is intended to be is a fence for compiler speculations as to the possible value in the dereferenced location. That is, in the following example:
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    float* p2 = new (storage) float(2.0f);
    int* p3 = new (storage) int(3);
    std::printf("%d\n", *p1);
the program is allowed to print 1 because the compiler may speculate that this code does not modify the value pointed to by p1. Adding this line immediately before printf:
    p1 = std::launder(p1);
ensures that the compiler "forgets" whatever assumptions it had about the value pointed to by p1, so printf has to reload the value (or, at least, the compiler has to re-evaluate its assumptions given that the value could have been modified through other means). Note that p3 in this example is guaranteed to point to the int of 3 regardless of the launder.
Given the above, I don't think Boost.Optional or Boost.Variant are affected. For both of these components, pointers and references to the stored value are invalidated if the value is destroyed (e.g. by resetting boost::optional or changing the current value type of boost::variant). It is also illegal (or simply not possible) to obtain a pointer or reference to the value that doesn't exist in the internal storage.
    boost::optional<int> opt;
    opt = 1;
    int* p1 = &*opt;
    opt = boost::none; // makes p1 invalid
    opt = 2;           // p1 is still invalid
    p1 = &*opt;        // now p1 points to 2
std::launder may be useful to the users of these components, if one keeps around a pointer to the stored object without re-obtaining it from boost::optional or boost::variant. But frankly, I don't see the point, as obtaining the pointer is a trivial and logical thing to do anyway.
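As a compilable version of the pattern described above, here is a minimal sketch. It assumes std::optional (C++17) as a stand-in for boost::optional, and the function name reobtain_after_reset is made up for illustration. It only shows the "re-obtain the pointer from the container" approach; it does not settle whether the stale pointer would have been usable.

```cpp
#include <cassert>
#include <optional>

// Assumption: std::optional stands in for boost::optional here.
// Returns the value read through a pointer that was re-obtained
// after the optional was emptied and filled again.
int reobtain_after_reset() {
    std::optional<int> opt;
    opt = 1;
    int* p = &*opt;   // p points at the stored 1
    assert(*p == 1);

    opt.reset();      // the stored int is gone; p must not be dereferenced
    opt = 2;          // a new int is created in the internal storage

    p = &*opt;        // re-obtain the pointer from the optional itself
    return *p;        // reads the newly stored value
}
```

Calling reobtain_after_reset() never touches the stale pointer between reset() and the second &*opt, so no laundering question arises in user code.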
Hi Andrey, now I'm curious... is
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    int* p3 = new (storage) int(3);
    std::printf("%d\n", *p1);
different from
    int* p1 = new int(1);
    int* p3 = new (p1) int(3);
    std::printf("%d\n", *p1);
or even
    int p1 = 1;
    int* p3 = new (&p1) int(3);
    std::printf("%d\n", p1);
?
Ciao, .Andrea
On 2/18/22 18:15, Andrea Bocci wrote:
Hi Andrey, now I'm curious... is
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    int* p3 = new (storage) int(3);
    std::printf("%d\n", *p1);
different from
    int* p1 = new int(1);
    int* p3 = new (p1) int(3);
    std::printf("%d\n", *p1);
or even
    int p1 = 1;
    int* p3 = new (&p1) int(3);
    std::printf("%d\n", p1);
?
In the examples above, you're reusing storage of the object with a different object of the same type, that is the old object is "transparently replaceable" by the new one. See here:
https://en.cppreference.com/w/cpp/language/lifetime#Storage_reuse
In this case, p1 remains valid and can be used to access the new object.
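The same-type case discussed here can be written as a small function. This is only a sketch of the first of Andrea's three snippets; the function name read_through_old_pointer is hypothetical.

```cpp
#include <cassert>
#include <new>

// Creates an int in raw storage, then creates a second int of the same
// type in the same storage, and reads through the FIRST pointer.
int read_through_old_pointer() {
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    int* p3 = new (storage) int(3);  // same type, exactly the same storage
    assert(p1 == p3);                // both hold the same address
    return *p1;                      // old pointer refers to the new int
}
```

Because the old and the new object have the same non-const type and occupy the same storage, the old pointer is taken to refer to the new object, so no std::launder is involved.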
Andrea Bocci wrote:
Hi Andrey, now I'm curious... is
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    int* p3 = new (storage) int(3);
    std::printf("%d\n", *p1);
different from
    int* p1 = new int(1);
    int* p3 = new (p1) int(3);
    std::printf("%d\n", *p1);
or even
    int p1 = 1;
    int* p3 = new (&p1) int(3);
    std::printf("%d\n", p1);
?
I don't think there's any difference here.
Andrey Semashev wrote:
What std::launder is intended to be is a fence for compiler speculations as to the possible value in the dereferenced location. That is, in the following example:
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    float* p2 = new (storage) float(2.0f);
    int* p3 = new (storage) int(3);
    std::printf("%d\n", *p1);
the program is allowed to print 1...
I don't think it is. The "will automatically apply to the new object" clause applies (even in its C++14 form) and all its requirements seem to be met. https://eel.is/c++draft/basic.life#8
On 2/18/22 19:36, Peter Dimov via Boost wrote:
Andrey Semashev wrote:
What std::launder is intended to be is a fence for compiler speculations as to the possible value in the dereferenced location. That is, in the following example:
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    float* p2 = new (storage) float(2.0f);
    int* p3 = new (storage) int(3);
    std::printf("%d\n", *p1);
the program is allowed to print 1...
I don't think it is. The "will automatically apply to the new object" clause applies (even in its C++14 form) and all its requirements seem to be met. https://eel.is/c++draft/basic.life#8
The requirement "o1 and o2 are of the same type" is not satisfied, as the types int and float are different. Yes, you eventually create an int in the same storage, but I don't see where the standard requires this exemption rule to work across creation of an object of an incompatible type.
Basically, my understanding is that whenever the pointer stops pointing at an object that it used to point to, you need to use std::launder on it. The only exception is construction of a transparently replaceable object at that location, which isn't the case above.
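Whichever reading of [basic.life] is right for the int/float/int sequence, laundering the old pointer is valid under both readings, since an int is alive at that address when std::launder is called. A minimal sketch follows. The function name read_after_intervening_float is hypothetical, and the static_assert is an added assumption so that the float fits the int-sized buffer.

```cpp
#include <cassert>
#include <new>

// Assumption for this sketch: a float fits in storage sized and aligned for int.
static_assert(sizeof(float) <= sizeof(int) && alignof(float) <= alignof(int),
              "sketch assumes float fits in int-sized, int-aligned storage");

int read_after_intervening_float() {
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    float* p2 = new (storage) float(2.0f);  // ends the lifetime of the first int
    int* p3 = new (storage) int(3);         // ends the lifetime of the float
    (void)p2;
    assert(*p3 == 3);         // the pointer returned by the last new is always usable
    return *std::launder(p1); // laundered old pointer designates the current int
}
```

The point under dispute in the thread is only whether plain *p1 would also be allowed here; the sketch avoids that question instead of answering it.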
Andrey Semashev wrote:
    alignas(int) unsigned char storage[sizeof(int)];
    int* p1 = new (storage) int(1);
    float* p2 = new (storage) float(2.0f);
    int* p3 = new (storage) int(3);
    std::printf("%d\n", *p1);
the program is allowed to print 1...
I don't think it is. The "will automatically apply to the new object" clause applies (even in its C++14 form) and all its requirements seem to be met. https://eel.is/c++draft/basic.life#8
The requirement "o1 and o2 are of the same type" is not satisfied, as the types int and float are different.
Yes, you eventually create an int in the same storage, but I don't see where the standard requires this exemption rule to work across creation of an object of an incompatible type.
Interesting interpretation. I don't see where it requires it to be invalidated by the intermediate creation of the float, though. It says "after" but it doesn't say "immediately after".
Mikhail Kremniov wrote:
Hi,
Initially, I posted this question to the boost-users list, but then I was advised to repost it here.
Hello, and thanks for that, as this is the proper list for this question.
I wonder, what is the Boost community's opinion of std::launder? In particular, of its necessity when accessing an object that was created via placement-new in an aligned_storage. As I can see, it's used by Boost.Beast in its implementation of the variant type, but not in other parts of Boost. The reason I'm asking this is that I'm working on a C++14 code-base that uses Boost.Variant and Boost.Optional extensively (as well as other parts of Boost that use Variant and Optional internally). Now we're trying to switch to C++17 at least, and I worry whether it could potentially break things.
The standard has changed several times with respect to these lifetime issues, and I'm not sure launder is useful for anything today. But I don't quite know for sure.
We have three potential problems in Optional, two of which are related to launder and one that isn't.
First, there's the object replacement issue. If we have (for demonstration purposes only)
    struct X { int m; };
    optional<X> opt( X{1} );
    X const& rx = *opt;
    opt.reset();
    opt.emplace( X{2} );
    std::cout << rx.m << std::endl;
the C++14 standard says we're fine as long as X doesn't contain const or reference members, and the C++17 standard used to say the same thing, but was changed before publication as a result of a national body comment to be less restrictive and we're fine there as long as the entire X object isn't const. So this would be undefined
    struct X { int m; };
    optional<const X> opt( X{1} );
    X const& rx = *opt;
    opt.reset();
    opt.emplace( X{2} );
    std::cout << rx.m << std::endl;
without laundering _unless_ optional strips const before doing the placement new (I haven't checked whether it already does). So that's one use of launder taken care of.
Second, there's the pointer provenance issue with placement new. If we have
    alignas(X) unsigned char storage[sizeof(X)];
    X* p1 = (X*)storage;
    X* p2 = new(p1) X{1};
cppreference says we can't use p1 and need to use launder(p1). But I'm not so sure about that. The exact same thing happens in std::vector. When you do push_back in vector<X>, an object of type X is created using placement new, but the result of new isn't stored anywhere. Instead, in op[] for instance, the "old" pointer is returned. And I don't see `launder` anywhere in the libstdc++ `<vector>`. I haven't checked the others but I suspect they don't have it either. So that's the second potential use of launder.
The third issue we're having is with our aligned_storage. Its address() member function doesn't return the address of the unsigned char[] array, but the address of the aligned_storage object itself (or rather, to its aligned_storage_impl base, which contains the char array as its first member. Inside a union.)
This means that we aren't in the clear with respect to the "provides storage" wording in https://eel.is/c++draft/basic.memobj#intro.object-3. I'm not entirely sure that what we're doing is undefined, but it looks like we can avoid this issue by just returning the address of the char array in address().
Note that this potential source of UB can't be fixed with launder.
But C++17 then added "Note: If these conditions are not met, a pointer to the new object can be obtained from a pointer that represents the address of its storage by calling std::launder". So now the language has the ability to deal with that UB, and a question arises, is it possible that compilers could start to use the UB to perform additional optimizations and make it a real UB? Also, I've seen a couple of times on stackoverflow.com people saying that it's actually fine to reinterpret_cast the storage in C++14, but in C++17 it's not (they didn't explain why though).
So, can switching from -std=c++14 to -std=c++17 be a breaking change when using Boost? The fact that Boost.Variant and Boost.Optional don't use std::launder - is it an oversight or a conscious decision?
In principle, C++17 doesn't introduce any new UB that C++14 didn't already have. It's theoretically possible for compilers to start doing more aggressive optimizations in C++17 mode and above, but I don't think they do. Not sure what LLVM plans are, though.
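To make the second and third points concrete, here is a minimal sketch of an optional-like storage slot. It is not Boost's implementation: raw_slot, slot_demo and all member names are hypothetical. It assumes C++17, places the object directly into the unsigned char array (so that the array "provides storage"), and launders the pointer on every access, which is the conservative choice under the cppreference reading mentioned above.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical minimal storage slot; NOT how Boost.Optional is written.
template <class T>
class raw_slot {
    alignas(T) unsigned char buf_[sizeof(T)];  // the array that provides storage
    bool engaged_ = false;

public:
    raw_slot() = default;
    raw_slot(const raw_slot&) = delete;
    raw_slot& operator=(const raw_slot&) = delete;
    ~raw_slot() { reset(); }

    // Returns the address of the char array itself, not of the enclosing object.
    void* address() noexcept { return buf_; }

    template <class... Args>
    T& emplace(Args&&... args) {
        reset();
        T* p = ::new (address()) T(std::forward<Args>(args)...);
        engaged_ = true;
        return *p;  // the pointer returned by new is always usable
    }

    T& get() noexcept {
        assert(engaged_);
        // Conservative: launder the storage pointer on every access.
        return *std::launder(reinterpret_cast<T*>(buf_));
    }

    void reset() noexcept {
        if (engaged_) {
            get().~T();
            engaged_ = false;
        }
    }
};

struct X { int m; };

int slot_demo() {
    raw_slot<X> slot;
    slot.emplace(X{1});
    slot.emplace(X{2});   // destroys the first X, creates a second one
    return slot.get().m;  // reads the second X
}
```

Whether the std::launder call in get() is strictly required is exactly what the thread leaves open; it is harmless either way, as long as an object of type T is alive in the buffer when get() is called.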
The third issue we're having is with our aligned_storage. Its address() member function doesn't return the address of the unsigned char[] array, but the address of the aligned_storage object itself (or rather, to its aligned_storage_impl base, which contains the char array as its first member. Inside a union.)
This means that we aren't in the clear with respect to the "provides storage" wording in https://eel.is/c++draft/basic.memobj#intro.object-3. I'm not entirely sure that what we're doing is undefined, but it looks like we can avoid this issue by just returning the address of the char array in address().
Note that this potential source of UB can't be fixed with launder.
Actually, Boost.Optional uses its own aligned_storage: https://github.com/boostorg/optional/blob/develop/include/boost/optional/det... which does return the address of the char[] buffer, but only if the compiler doesn't have the may_alias attribute, in which case the address of the union is returned, but the union is marked with may_alias. What effect this has is anyone's guess. I suppose it's effectively equivalent to laundering the address each time, but as the attribute is nonstandard, there's no way to know.
On 18/02/2022 19:01, Peter Dimov via Boost wrote:
The third issue we're having is with our aligned_storage. Its address() member function doesn't return the address of the unsigned char[] array, but the address of the aligned_storage object itself (or rather, to its aligned_storage_impl base, which contains the char array as its first member. Inside a union.)
This means that we aren't in the clear with respect to the "provides storage" wording in https://eel.is/c++draft/basic.memobj#intro.object-3. I'm not entirely sure that what we're doing is undefined, but it looks like we can avoid this issue by just returning the address of the char array in address().
Note that this potential source of UB can't be fixed with launder.
Actually, Boost.Optional uses its own aligned_storage:
https://github.com/boostorg/optional/blob/develop/include/boost/optional/det...
which does return the address of the char[] buffer, but only if the compiler doesn't have the may_alias attribute, in which case the address of the union is returned, but the union is marked with may_alias.
What effect this has is anyone's guess. I suppose it's effectively equivalent to laundering the address each time, but as the attribute is nonstandard, there's no way to know.
Surely laundering the address each time is rather bad for optimisation and codegen?
I mean, effectively launder invokes "escaped" during escape analysis right?
Niall
Niall Douglas wrote:
Surely laundering the address each time is rather bad for optimisation and codegen?
I mean, effectively launder invokes "escaped" during escape analysis right?
I honestly have no idea. However, looking at https://godbolt.org/z/oKj5c5x7v, it doesn't seem so.
Niall Douglas wrote:
Surely laundering the address each time is rather bad for optimisation and codegen?
I mean, effectively launder invokes "escaped" during escape analysis right?
I honestly have no idea.
However, looking at https://godbolt.org/z/oKj5c5x7v, it doesn't seem so.
And in the case in which the address actually escapes, there doesn't seem to be any difference either: https://godbolt.org/z/j93oe3cr8
On 18/02/2022 20:36, Peter Dimov via Boost wrote:
Niall Douglas wrote:
Surely laundering the address each time is rather bad for optimisation and codegen?
I mean, effectively launder invokes "escaped" during escape analysis right?
I honestly have no idea.
However, looking at https://godbolt.org/z/oKj5c5x7v, it doesn't seem so.
Your examples aren't right - launder is for telling the compiler that an escaped value must be assumed to have changed, even if the compiler is allowed to assume it has not. I modified your example thusly: https://godbolt.org/z/oosdK6GMo
That's not clear, so I made a clearer example of when launder is actually needed: https://godbolt.org/z/na78G77jY.
You can see for test1 that even though ext() is called and it could modify the value returned by foo() because foo() is extern, the compiler is allowed by the standard to assume that the value won't change, because it is const and const values have special rules about immutability. In test2 we explicitly tell the compiler it can't assume immutability using launder, and it correctly reloads the const value after the ext() call.
I think clang's implementation of launder is bugged because it doesn't inhibit compile time constant folding. I vaguely remember submitting that as a bug yonks back, and Richard Smith telling me yes it was a known issue and a timely fix would be unlikely for various reasons. MSVC gets launder right, same as GCC.
Niall
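The godbolt links are not reproduced here, so what follows is not Niall's test1/test2 but a separate, minimal sketch of the same const-related rule, modelled on the usual textbook illustration of std::launder. The names ConstMember and read_replaced_const_member are hypothetical. Depending on which revision of [basic.life] applies, reading p->n directly after the replacement may or may not be allowed; going through std::launder is valid under either.

```cpp
#include <cassert>
#include <new>

// Hypothetical type with a const member.
struct ConstMember {
    const int n;
};

int read_replaced_const_member() {
    ConstMember* p = new ConstMember{1};
    assert(p->n == 1);

    // Replace the object in place with a new one of the same type.
    ConstMember* q = new (p) ConstMember{2};

    // Read through the OLD pointer, but laundered, so the compiler may not
    // assume the const member still has the value it saw earlier.
    int result = std::launder(p)->n;

    delete q;  // q came from the latest new-expression
    return result;
}
```

Reading through q would of course also work; the laundered read is there only to show what std::launder buys when the old pointer is all that is available.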
Niall Douglas wrote:
Your examples aren't right - launder is for telling the compiler that an escaped value must be assumed to have changed, even if the compiler is allowed to assume it has not. I modified your example thusly: https://godbolt.org/z/oosdK6GMo
This example isn't right either, because `f` can easily `emplace` something into the passed `optional`. It needs to be https://godbolt.org/z/8qb79hK4E. Or, for completeness, https://godbolt.org/z/xxWToPo6M.
Niall Douglas wrote:
I mean, effectively launder invokes "escaped" during escape analysis right? ... That's not clear, so I made a clearer example of when launder is actually needed: https://godbolt.org/z/na78G77jY.
That's a good demonstration of how launder() is needed because of https://eel.is/c++draft/basic.memobj#basic.life-8, but it doesn't mean that your claim above is correct. The address is considered escaped in both cases. It's just that the compiler is allowed to assume that this doesn't matter in the no-launder case, because of the aforementioned basic.life#8.
The third issue we're having is with our aligned_storage. Its address() member function doesn't return the address of the unsigned char[] array, but the address of the aligned_storage object itself (or rather, to its aligned_storage_impl base, which contains the char array as its first member. Inside a union.)
This means that we aren't in the clear with respect to the "provides storage" wording in https://eel.is/c++draft/basic.memobj#intro.object-3. I'm not entirely sure that what we're doing is undefined, but it looks like we can avoid this issue by just returning the address of the char array in address().
Note that this potential source of UB can't be fixed with launder.
https://github.com/boostorg/type_traits/pull/168
As already noted, this doesn't affect Boost.Optional, which has its own aligned_storage.
https://github.com/boostorg/type_traits/pull/168
As already noted, this doesn't affect Boost.Optional, which has its own aligned_storage.
For Optional, please see https://github.com/boostorg/optional/pull/100 and https://github.com/boostorg/optional/pull/101
On 18/02/2022 10:33, Mikhail Kremniov via Boost wrote:
I wonder, what is the Boost community's opinion of std::launder? In particular, of its necessity when accessing an object that was created via placement-new in an aligned_storage. As I can see, it's used by Boost.Beast in its implementation of the variant type, but not in other parts of Boost.
Beast likely doesn't need to use it. He probably added it back when it was thought it would be needed, but CWG realised very very late on in the 17 release cycle that it was overly dangerous to existing code and they undid the need for it for the 17 release.
So, can switching from -std=c++14 to -std=c++17 be a breaking change when using Boost? The fact that Boost.Variant and Boost.Optional don't use std::launder - is it an oversight or a conscious decision?
In practice, you only need to launder const types. Otherwise you don't need to, in practice (except for corner cases, which don't matter for anything Optional or Variant or Outcome do).
You also only need to launder const types if and only if you ever mutate them. So now the subset where laundering is needed is if and only if your type is const, and you mutate it.
One solution is to not store your type as const, but otherwise treat it as const. Then laundering is not necessary.
Another solution is don't allow mutation without enclosing type lifetime change. This is what Outcome does, and almost certainly what Variant and Optional also do.
So tl;dr; I think you're safe unless it is YOU mutating const types stored in Boost objects. Or you're in a corner case, which are exceedingly rare and probably never will occur in your professional career.
If anybody is about to ask me what those corner cases are, I'd suggest go ask Richard Smith. I know the only one I care about is vptr reload elision, and it's just about the only place where launder really is needed and there is no way of avoiding it.
Niall
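For readers unfamiliar with the vptr case, here is a sketch along the lines of the commonly cited std::launder illustration: an object replaces itself with an object of a different dynamic type from inside a virtual call. The function name vptr_demo is hypothetical, and the assumption that this is the kind of corner case meant above is mine. Without std::launder, the second call through the original name would be undefined, and a compiler could reuse the vtable pointer it already knows.

```cpp
#include <cassert>
#include <new>

struct Base {
    virtual int transmogrify();
};

struct Derived : Base {
    int transmogrify() override {
        ::new (this) Base;   // replace *this with a Base
        return 2;
    }
};

int Base::transmogrify() {
    ::new (this) Derived;    // replace *this with a Derived
    return 1;
}

static_assert(sizeof(Derived) == sizeof(Base),
              "sketch assumes both types occupy the same storage");

int vptr_demo() {
    Base base;
    int n = base.transmogrify();  // runs Base::transmogrify, returns 1
    assert(n == 1);
    // The object at &base is now a Derived; launder makes the compiler
    // look up the dynamic type again instead of reusing what it knew.
    int m = std::launder(&base)->transmogrify();  // runs Derived::transmogrify
    return n + m;
}
```

After the second call the storage holds a Base again, so the variable goes out of scope holding an object of its declared type.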
Participants (5): Andrea Bocci, Andrey Semashev, Mikhail Kremniov, Niall Douglas, Peter Dimov