[config/multiprecision/units/general] Do we have a policy for user-defined-literals?

John Maddock

27 Apr 2013

12:29 p.m.

Folks,

I've been experimenting with the new C++11 feature "user-defined-literals", and they're pretty impressive ;-)

So far I've been able to write code such as:

auto i = 0x1234567890abcdef1234567890abcdef_cppi;

and generate value i as a signed 128-bit integer. It's particularly remarkable because:

* It actually works ;-)
* The value is generated as a constexpr - all evaluation and initialization of the multiprecision integer is done at compile time.
* The actual meta-code is remarkably short and concise once you've figured out how on earth to get started!

Note however my code is limited to hexadecimal constants, because it can't do compile time radix conversion - that would require a huge meta-program for each constant :-(

This is obviously useful to the multiprecision library, and I'm sure for Units as well, but that leaves a couple of questions:

1) We have no config macro for this new feature (I've only tested with GCC, and suspect Clang is the only other compiler with support at present). What should it be called? Would everyone be happy with BOOST_NO_CXX11_USER_DEFINED_LITERALS ?

2) How should libraries handle these user defined suffixes? The essential problem is that they have to be in current scope at point of use, you can never explicitly qualify them. So I suggest we use:

namespace boost{ namespace mylib{ namespace literals{

mytype operator "" _mysuffix(args...);

}}}

Then users can import the whole namespace easily into current scope right at point of use:

int main()
{
   using namespace boost::mylib::literals;
   boost::mylib::mytype t = 1234_mysuffix;
}

3) How should the suffixes be named? There is an obvious possibility for clashes here - for example the units lib would probably want to use _s for seconds, but no doubt other users might use it for strings and such like. We could insist that all such names added to a boost lib are suitably mangled, so "_bu_s" for boost.units.seconds, but I'm not convinced by that. Seems to make the feature much less useful?

Many thanks in advance for your thoughts, John.
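The nested-namespace idea in point 2 can be sketched without Boost. Everything named here (mylib, mytype, _mysuffix, literal_demo) is a hypothetical stand-in for illustration, and the operator is written without a space between "" and the suffix, which current compilers accept but early C++11 compilers may not.

```cpp
namespace mylib {

// Hypothetical value type standing in for "mytype" from the post.
struct mytype {
    unsigned long long value;
};

namespace literals {

// Cooked integer literal operator: the compiler parses the digits
// and passes the resulting value in.
constexpr mytype operator""_mysuffix(unsigned long long v) {
    return mytype{v};
}

} // namespace literals
} // namespace mylib

// The suffix is only usable where the literals namespace has been
// imported; here that is limited to the body of this function.
constexpr unsigned long long literal_demo() {
    using namespace mylib::literals;
    return (1234_mysuffix).value;
}

static_assert(literal_demo() == 1234, "evaluated at compile time");
```

Keeping the operators in their own namespace means a user can import the suffixes without importing the rest of the library's names.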


Steven Watanabe

27 Apr

2:06 p.m.

New subject: [config/multiprecision/units/general] Do we have a policy for user-defined-literals?

AMDG On 04/27/2013 05:29 AM, John Maddock wrote:

...

1) We have no config macro for this new feature (I've only tested with GCC, and suspect Clang is the only other compiler with support at present). What should it be called? Would everyone be happy with BOOST_NO_CXX11_USER_DEFINED_LITERALS ?

Yes.

...

2) How should libraries handle these user defined suffixes? The essential problem is that they have to be in current scope at point of use, you can never explicitly qualify them. So I suggest we use:

namespace boost{ namespace mylib{ namespace literals{

mytype operator "" _mysuffix(args...);

}}}

<snip>

We need a nested namespace. I think using a convention of calling it literals:: is fine.

...

3) How should the suffixes be named? There is an obvious possibility for clashes here - for example the units lib would probably want to use _s for seconds, but no doubt other users might use it for strings and such like. We could insist that all such names added to a boost lib are suitably mangled, so "_bu_s" for boost.units.seconds, but I'm not convinced by that. Seems to make the feature much less useful?

I think we should just use the obvious short names, and rely on users not to bring conflicting suffixes into scope. If there's a conflict they can always fall back on normal constructors.

In Christ,
Steven Watanabe
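To illustrate what "not bringing conflicting suffixes into scope" could look like in practice, here is a small sketch with two made-up libraries that both claim _s with the same signature. The names units_demo, ticks_demo, seconds and samples are assumptions for the example only.

```cpp
namespace units_demo {
struct seconds { unsigned long long count; };
namespace literals {
constexpr seconds operator""_s(unsigned long long v) { return seconds{v}; }
}
} // namespace units_demo

namespace ticks_demo {
struct samples { unsigned long long count; };
namespace literals {
constexpr samples operator""_s(unsigned long long v) { return samples{v}; }
}
} // namespace ticks_demo

// Each scope imports only the suffix it needs, so 90_s and 48000_s
// are unambiguous even though both libraries use "_s".
constexpr unsigned long long as_seconds() {
    using namespace units_demo::literals;
    return (90_s).count;
}

constexpr unsigned long long as_samples() {
    using namespace ticks_demo::literals;
    return (48000_s).count;
}

// If both using-directives were active in one scope, `90_s` would be
// an ambiguous call. The fallback is an ordinary constructor-style
// initialisation, or an explicit call to the qualified operator.
constexpr units_demo::seconds via_ctor = units_demo::seconds{90};
constexpr units_demo::seconds via_call = units_demo::literals::operator""_s(90);

static_assert(as_seconds() == 90, "seconds literal");
static_assert(as_samples() == 48000, "samples literal");
static_assert(via_ctor.count == via_call.count, "fallbacks agree");
```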

John Maddock

28 Apr

8:04 a.m.


...

...
3) How should the suffixes be named? There is an obvious possibility for clashes here - for example the units lib would probably want to use _s for seconds, but no doubt other users might use it for strings and such like. We could insist that all such names added to a boost lib are suitably mangled, so "_bu_s" for boost.units.seconds, but I'm not convinced by that. Seems to make the feature much less useful?

I think we should just use the obvious short names, and rely on users not to bring conflicting suffixes into scope. If there's a conflict they can always fall back on normal constructors.

Nod. Short names are my preference too. Note however, that constructors may be less efficient in general - cpp_int users would have to fall back on a construct-from-string rather than constexpr initialisation (the issue is you can't write a number with enough digits unless it has a user-defined-suffix). John.

Marc Glisse

10:37 a.m.


On Sun, 28 Apr 2013, John Maddock wrote:

...

...
I think we should just use the obvious short names, and rely on users not to bring conflicting suffixes into scope. If there's a conflict they can always fall back on normal constructors.

Nod. Short names are my preference too.

Note however, that constructors may be less efficient in general - cpp_int users would have to fall back on a construct-from-string rather than constexpr initialisation (the issue is you can't write a number with enough digits unless it has a user-defined-suffix).

Can't you construct from string constexpr? Both gcc and clang are happy with code like this (just an experiment to see what constexpr accepts):

struct uint128 {
  unsigned long h, l;
  constexpr uint128(unsigned long h_, unsigned long l_):h(h_),l(l_){}
  constexpr uint128 lshift(int i)const{
    return uint128{h<<4|l>>60,l<<4|i};
  }
  static constexpr int chartoint(char c){
    return (c>='0'&&c<='9')?c-'0':(c-'a'+10);
  }
  static constexpr uint128 from_string(uint128 tmp, const char* s){
    return (*s==0)?tmp:from_string(tmp.lshift(chartoint(*s)),s+1);
  }
};

int main(){
  constexpr uint128 a=uint128::from_string(uint128{0,0},"1234567890abcdef123");
  static_assert(a.l==0x4567890abcdef123,"");
  static_assert(a.h==0x123,"");
}

-- 
Marc Glisse

John Maddock

12:46 p.m.


...

...
Note however, that constructors may be less efficient in general - cpp_int users would have to fall back on a construct-from-string rather than constexpr initialisation (the issue is you can't write a number with enough digits unless it has a user-defined-suffix).

Can't you construct from string constexpr? Both gcc and clang are happy with code like this (just an experiment to see what constexpr accepts):

struct uint128 {
  unsigned long h, l;
  constexpr uint128(unsigned long h_, unsigned long l_):h(h_),l(l_){}
  constexpr uint128 lshift(int i)const{
    return uint128{h<<4|l>>60,l<<4|i};
  }
  static constexpr int chartoint(char c){
    return (c>='0'&&c<='9')?c-'0':(c-'a'+10);
  }
  static constexpr uint128 from_string(uint128 tmp, const char* s){
    return (*s==0)?tmp:from_string(tmp.lshift(chartoint(*s)),s+1);
  }
};

int main(){
  constexpr uint128 a=uint128::from_string(uint128{0,0},"1234567890abcdef123");
  static_assert(a.l==0x4567890abcdef123,"");
  static_assert(a.h==0x123,"");
}

I had no idea you could do that! It's still a lot easier to code a user-defined-literal though ;-) Thanks, John.

Vicente J. Botet Escriba

27 Apr

3 p.m.


Le 27/04/13 14:29, John Maddock a écrit :

...

Folks,

I've been experimenting with the new C++11 feature "user-defined-literals", and they're pretty impressive ;-)

So far I've been able to write code such as:

auto i = 0x1234567890abcdef1234567890abcdef_cppi;

and generate value i as a signed 128-bit integer. It's particularly remarkable because:

* It actually works ;-)
* The value is generated as a constexpr - all evaluation and initialization of the multiprecision integer is done at compile time.
* The actual meta-code is remarkably short and concise once you've figured out how on earth to get started!

Note however my code is limited to hexadecimal constants, because it can't do compile time radix conversion - that would require a huge meta-program for each constant :-(

This is obviously useful to the multiprecision library, and I'm sure for Units as well, but that leaves a couple of questions:

1) We have no config macro for this new feature (I've only tested with GCC, and suspect Clang is the only other compiler with support at present). What should it be called? Would everyone be happy with BOOST_NO_CXX11_USER_DEFINED_LITERALS ?

This is fine. clang uses __has_feature(cxx_user_literals)

2) How should libraries handle these user defined suffixes? The essential problem is that they have to be in current scope at point of use, you can never explicitly qualify them. So I suggest we use:

namespace boost{ namespace mylib{ namespace literals{

mytype operator "" _mysuffix(args...);

}}}

Then users can import the whole namespace easily into current scope right at point of use:

int main() { using namespace boost::mylib::literals; boost::mylib::mytype t = 1234_mysuffix; }

If http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3402.pdf is the final version accepted for C++14, the standard will use, e.g.

namespace std { namespace suffixes { namespace chrono {

One of the advantages is that we can add some utilities. I helped Peter Sommerlad to port his reference implementation to Boost (see https://github.com/PeterSommerlad/UDLSuffixBoost/tree/master/boost/suffixes). There are some interesting utilities that make implementing suffixes easier. I guess he would like to name his library Boost.Suffixes. This doesn't mean that we cannot choose your option.

...

3) How should the suffixes be named? There is an obvious possibility for clashes here - for example the units lib would probably want to use _s for seconds, but no doubt other users might use it for strings and such like. We could insist that all such names added to a boost lib are suitably mangled, so "_bu_s" for boost.units.seconds, but I'm not convinced by that. Seems to make the feature much less useful?

I agree with Steven. We should choose the best suffixes for the specific domain, independently of the suffixes used in other libraries.

Best, Vicente
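On point 1 (the config macro), a detection sketch along the lines discussed might look like the following. MYLIB_NO_CXX11_USER_DEFINED_LITERALS is a hypothetical stand-in that only mirrors the naming proposed in the thread; the checks used are Clang's __has_feature(cxx_user_literals), mentioned above, and the later standard feature-test macro __cpp_user_defined_literals. It is not a description of what Boost.Config actually does.

```cpp
// Fallback so that __has_feature can be used on compilers that do
// not provide it (the pattern suggested in Clang's documentation).
#ifndef __has_feature
#  define __has_feature(x) 0
#endif

// Assume support when either check succeeds; otherwise define the
// (hypothetical) "no support" macro.
#if !defined(__cpp_user_defined_literals) && !__has_feature(cxx_user_literals)
#  define MYLIB_NO_CXX11_USER_DEFINED_LITERALS
#endif

namespace mylib {

struct mytype {
    unsigned long long value;
};

#ifndef MYLIB_NO_CXX11_USER_DEFINED_LITERALS
namespace literals {
// Only declared when the compiler understands literal operators.
constexpr mytype operator""_mysuffix(unsigned long long v) {
    return mytype{v};
}
} // namespace literals
#endif

} // namespace mylib
```

Code that uses the suffix would be guarded by the same macro and fall back to ordinary construction when it is defined.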

John Maddock

28 Apr

8:12 a.m.


...

If http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3402.pdf is the final version accepted for C++14, the standard will use, e.g.

namespace std { namespace suffixes { namespace chrono {

One of the advantages is that we can add some utilities. I helped Peter Sommerlad to port his reference implementation to Boost (see https://github.com/PeterSommerlad/UDLSuffixBoost/tree/master/boost/suffixes). There are some interesting utilities that make implementing suffixes easier. I guess he would like to name his library Boost.Suffixes. This doesn't mean that we cannot choose your option.

I hadn't seen that before, thanks for the heads up.

If we have UDL utilities in boost then I agree they should have their own top-level namespace in Boost, whether it makes sense to group all literals in there is another matter. My gut feeling is that users will find boost::mylib::literals or boost::mylib::suffixes easier, but I can see how it would make sense for the std to go your way.

BTW, I believe your implementation of parse_int is unnecessarily complex, looks like that whole file can be reduced to just:

template <unsigned base, unsigned long long val, char... Digits>
struct parse_int
{
   // The default specialization is also the termination condition:
   // it gets invoked only when sizeof...(Digits) == 0.
   static_assert(base<=16u,"only support up to hexadecimal");
   static constexpr unsigned long long value{ val };
};

template <unsigned base, unsigned long long val, char c, char... Digits>
struct parse_int<base, val, c, Digits...>
{
   // Letters map to 10..15; anything else maps to 400 so that the
   // range check below rejects it.
   static constexpr unsigned long long char_value = (c >= '0' && c <= '9')
      ? c - '0'
      : (c >= 'a' && c <= 'f')
      ? c - 'a' + 10
      : (c >= 'A' && c <= 'F')
      ? c - 'A' + 10
      : 400u;
   static_assert(char_value < base, "Encountered a digit out of range");
   static constexpr unsigned long long value{ parse_int<base, val * base + char_value, Digits...>::value };
};

Typical usage is:

template <char...PACK>
constexpr unsigned long long operator "" _b()
{
   return parse_int<2, 0, PACK...>::value;
}

constexpr unsigned bt = 1001_b;

static_assert(bt == 9, "");

More than that though: I can't help but feel that base 8, 10 and 16 parsing is much better (faster) handled by the compiler, so parse_int could be reduced to base-2 parsing only which would simplify it still further.

What's the rationale for the chrono integer literals parsing the ints themselves rather than using cooked literals?

Cheers, John.

Peter Sommerlad

6 May

10:06 a.m.


Hi, Vicente made me aware of that post. Since I am guilty, I'd like to answer. First of all, John Maddock is right in almost everything... my fault. However, let me defend what's there and confess what shouldn't be there. As John wrote:

...

I've been experimenting with the new C++11 feature "user-defined-literals", and they're pretty impressive ;-)

that's how I started as well....

1. Using the parsing version for chrono literals is bullshit, John is right. The standard proposal I submitted only defines the regular numeric literal operators: operator"" h(unsigned long long) and operator"" h(long double), etc.

2. One can use the template version of operator"" for two things:

* parsing integer values in non-standard bases, e.g. ternary
* determining the best fitting type for integral values like the compiler does for integers; this cannot be done through the cooked version, because it requires a template parameter for the meta function.

2.a The latter is a reason why it can make sense for a concrete chrono implementation to use the parsing version for the suffixes, since it depends on the implementation which integral range is useful/used for representing the duration (at least in the standard version, where it is open which integral type is used for the representation, i.e. at least 23 bits for hours). But that is really only interesting when you actually use that many hours.

3. The parsing can be "simplified" by using a more complex expression like John proposes, instead of the monstrous template dispatching I implemented. I haven't tested which is faster for which compiler yet and unfortunately don't have the time to do so soon. John's version is definitely shorter than the many overloads but requires an additional template parameter, so it is not a direct replacement for my coded version. However, this way it avoids the pow multiplications (see also inline below).

So thank you for teaching me something.

Regards
Peter.

On 28.04.2013, at 10:12, John Maddock <john@johnmaddock.co.uk> wrote:

...

...
If http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3402.pdf is the final version accepted for C++14, the standard will use, e.g.

namespace std { namespace suffixes { namespace chrono {

One of the advantages is that we can add some utilities. I helped Peter Sommerlad to port his reference implementation to Boost (see https://github.com/PeterSommerlad/UDLSuffixBoost/tree/master/boost/suffixes). There are some interesting utilities that make implementing suffixes easier. I guess he would like to name his library Boost.Suffixes. This doesn't mean that we cannot choose your option.

I hadn't seen that before, thanks for the heads up.

If we have UDL utilities in boost then I agree they should have their own top-level namespace in Boost, whether it makes sense to group all literals in there is another matter. My gut feeling is that users will find boost::mylib::literals or boost::mylib::suffixes easier, but I can see how it would make sense for the std to go your way.

BTW, I believe your implementation of parse_int is unnecessarily complex, looks like that whole file can be reduced to just:

template <unsigned base, unsigned long long val, char... Digits>
struct parse_int
{
   // The default specialization is also the termination condition:
   // it gets invoked only when sizeof...(Digits) == 0.
   static_assert(base<=16u,"only support up to hexadecimal");
   static constexpr unsigned long long value{ val };
};

template <unsigned base, unsigned long long val, char c, char... Digits>
struct parse_int<base, val, c, Digits...>
{
   // Letters map to 10..15; anything else maps to 400 so that the
   // range check below rejects it.
   static constexpr unsigned long long char_value = (c >= '0' && c <= '9')
      ? c - '0'
      : (c >= 'a' && c <= 'f')
      ? c - 'a' + 10
      : (c >= 'A' && c <= 'F')
      ? c - 'A' + 10
      : 400u;
   static_assert(char_value < base, "Encountered a digit out of range");
   static constexpr unsigned long long value{ parse_int<base, val * base + char_value, Digits...>::value };
};

Typical usage is:

template <char...PACK> constexpr unsigned long long operator "" _b() { return parse_int<2, 0, PACK...>::value; }

constexpr unsigned bt = 1001_b;

static_assert(bt == 9, "");

More than that though: I can't help but feel that base 8, 10 and 16 parsing is much better (faster) handled by the compiler, so parse_int could be reduced to base-2 parsing only which would simplify it still further.

The latter will be standardized to be 0b101010 in C++14. So not many useful things remain, unless you want ternary literals :-)

...

What's the rationale for the chrono integer literals parsing the ints themselves rather than using cooked literals?

None. It was a ridiculous experiment by me and I forgot to adapt it back again to the cooked version.
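For comparison, the cooked form Peter describes (one overload taking unsigned long long, one taking long double) might look like this when written against std::chrono. The namespace hours_demo and the suffix _h are assumptions for the example; the proposal itself uses the unprefixed suffix h, which is reserved for the standard library.

```cpp
#include <chrono>
#include <ratio>

namespace hours_demo {

// Hours with a floating-point representation, for literals like 1.5_h.
typedef std::chrono::duration<long double, std::ratio<3600> > fractional_hours;

// Cooked integer form: the compiler has already parsed the digits.
constexpr std::chrono::hours operator""_h(unsigned long long h) {
    return std::chrono::hours(static_cast<std::chrono::hours::rep>(h));
}

// Cooked floating-point form.
constexpr fractional_hours operator""_h(long double h) {
    return fractional_hours(h);
}

} // namespace hours_demo

using namespace hours_demo;

static_assert((5_h).count() == 5, "integer hours");
static_assert(std::chrono::minutes(2_h).count() == 120, "converts to minutes");
static_assert((1.5_h).count() == 1.5L, "fractional hours");
```

No digit parsing is needed here, which is the point of the exchange above: bases 8, 10 and 16 are already handled by the compiler.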

Cheers, John.


-- Prof. Peter Sommerlad Institut für Software: Bessere Software - Einfach, Schneller! HSR Hochschule für Technik Rapperswil Oberseestr 10, Postfach 1475, CH-8640 Rapperswil http://ifs.hsr.ch http://cute-test.com http://linticator.com http://includator.com tel:+41 55 222 49 84 == mobile:+41 79 432 23 32 fax:+41 55 222 46 29 == mailto:peter.sommerlad@hsr.ch

Christopher Kormanyos

28 Apr

8:18 a.m.


...

Folks, I've been experimenting with the new C++11 feature "user-defined-literals", and they're pretty impressive ;-)

...

So far I've been able to write code such as:

...

auto i = 0x1234567890abcdef1234567890abcdef_cppi;

...

and generate value i as a signed 128-bit integer. It's particularly remarkable because:

...

* It actually works ;-)

<snip>

...

This is obviously useful to the multiprecision library, and I'm sure for Units as well, but that leaves a couple of questions:

Wow!

If I ever get my behind in gear on the radix-2 floating-point backend, do you think it could potentially be used there in combination with constexpr as well?

Sincerely, Chris.

John Maddock

12:55 p.m.


...

Wow!

If I ever get my behind in gear on the radix-2 floating-point backend, do you think it could potentially be used there in combination with constexpr as well?

Floating point is tough because you have the exponent to deal with. For integers it's simply:

* Accept a sequence of 4-bit values (the hex digits).
* Shuffle the 4-bit value sequence into a limb-bit value sequence.
* Initialise your array of limbs with the sequence.

Once you get your head around what you're trying to do, there's actually very little code involved at all :-)

That said, I don't see why it couldn't be done for hexadecimal floating point constants - at least in principle! Decimal conversion would be next to impossible because you'd need a complete compile-time arbitrary-precision arithmetic library :-(

Cheers, John.
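The three integer steps John lists can be sketched with a literal operator template. This is not the multiprecision library's code: it is a simplified illustration that assumes exactly two 64-bit limbs and feeds the nibbles in one at a time by shifting, and the names hexlit_demo, limbs128, nibble, push, pack_limbs, strip_0x and _hex128 are all made up for the example.

```cpp
#include <cstdint>

namespace hexlit_demo {

// Two 64-bit limbs; a fixed width keeps the sketch short.
struct limbs128 {
    std::uint64_t hi, lo;
};

// Step 1: turn one hex digit into its 4-bit value.
// (No validation: non-hex characters give a meaningless value.)
constexpr std::uint64_t nibble(char c) {
    return (c >= '0' && c <= '9') ? std::uint64_t(c - '0')
         : (c >= 'a' && c <= 'f') ? std::uint64_t(c - 'a' + 10)
         :                          std::uint64_t(c - 'A' + 10);
}

// Step 2: shift the limbs left by 4 bits and append the nibble.
constexpr limbs128 push(limbs128 v, std::uint64_t n) {
    return limbs128{(v.hi << 4) | (v.lo >> 60), (v.lo << 4) | n};
}

// Step 3: walk the digit pack, most significant digit first.
template <char... Digits>
struct pack_limbs;

template <>
struct pack_limbs<> {
    static constexpr limbs128 apply(limbs128 v) { return v; }
};

template <char C, char... Rest>
struct pack_limbs<C, Rest...> {
    static constexpr limbs128 apply(limbs128 v) {
        return pack_limbs<Rest...>::apply(push(v, nibble(C)));
    }
};

// Only literals spelled with a leading "0x" are accepted.
template <char... Digits>
struct strip_0x;

template <char... Digits>
struct strip_0x<'0', 'x', Digits...> {
    static constexpr limbs128 value() {
        return pack_limbs<Digits...>::apply(limbs128{0, 0});
    }
};

namespace literals {
// Literal operator template: receives the literal's characters as a pack.
template <char... Cs>
constexpr limbs128 operator""_hex128() {
    return strip_0x<Cs...>::value();
}
} // namespace literals
} // namespace hexlit_demo

using namespace hexlit_demo::literals;

constexpr hexlit_demo::limbs128 hex_demo_value =
    0x1234567890abcdef1234567890abcdef_hex128;
static_assert(hex_demo_value.hi == 0x1234567890abcdefULL, "high limb");
static_assert(hex_demo_value.lo == 0x1234567890abcdefULL, "low limb");
```

Because only the template form of the operator is declared, the literal is never converted to a built-in integer, so it may have more digits than unsigned long long can hold.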

Marc Glisse

1:13 p.m.


On Sun, 28 Apr 2013, John Maddock wrote:

...

Decimal conversion would be next to impossible because you'd need a complete compile-time arbitrary-precision arithmetic library :-(

You only need addition and shift by 1 bit, that doesn't seem so bad. Although of course, with C++14 arriving, it is more useful to spend energy on something else. -- Marc Glisse
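The "addition and shift by 1 bit" remark can be illustrated with a small compile-time decimal parser, using 10*v = 8*v + 2*v. This is an independent sketch, not the example Marc refers to elsewhere in the thread; it assumes a fixed 128-bit value held in two 64-bit limbs, does no input validation, and wraps silently on overflow. The names declit_demo, u128, shl1, add, times10, from_decimal and _dec128 are hypothetical.

```cpp
#include <cstdint>

namespace declit_demo {

struct u128 {
    std::uint64_t hi, lo;
};

// Shift left by one bit, carrying the top bit of lo into hi.
constexpr u128 shl1(u128 v) {
    return u128{(v.hi << 1) | (v.lo >> 63), v.lo << 1};
}

// Addition with carry from the low limb into the high limb.
constexpr u128 add(u128 a, u128 b) {
    return u128{a.hi + b.hi + ((a.lo + b.lo) < a.lo ? 1u : 0u), a.lo + b.lo};
}

// 10*v computed as 8*v + 2*v, using only shifts and one addition.
constexpr u128 times10(u128 v) {
    return add(shl1(shl1(shl1(v))), shl1(v));
}

// Consume one decimal digit per recursion step: acc = acc*10 + digit.
constexpr u128 from_decimal(u128 acc, const char* s) {
    return *s == 0
        ? acc
        : from_decimal(add(times10(acc), u128{0, std::uint64_t(*s - '0')}), s + 1);
}

namespace literals {
// Raw literal operator: receives the literal's spelling as a string.
constexpr u128 operator""_dec128(const char* digits) {
    return from_decimal(u128{0, 0}, digits);
}
} // namespace literals
} // namespace declit_demo

// 2^64 - 1 still fits entirely in the low limb.
static_assert(declit_demo::from_decimal(declit_demo::u128{0, 0},
                                        "18446744073709551615").hi == 0,
              "no carry into the high limb yet");
```

With the literals namespace imported, 18446744073709551616_dec128 would be expected to yield hi == 1 and lo == 0.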

Marc Glisse

2:14 p.m.


On Sun, 28 Apr 2013, Marc Glisse wrote:

...

On Sun, 28 Apr 2013, John Maddock wrote:

...
Decimal conversion would be next to impossible because you'd need a complete compile-time arbitrary-precision arithmetic library :-(

You only need addition and shift by 1 bit, that doesn't seem so bad.

Not pretty, but here is an example, tested with gcc and clang. Now as to how you can determine if the context is constexpr to use this instead of more efficient runtime code... -- Marc Glisse

Christopher Kormanyos

3:50 p.m.


...

...
If I ever get my behind in gear on the radix-2 floating-point backend, do you think it could potentially be used there in combination with constexpr as well?

...

Floating point is tough because you have the exponent to deal with.

Oh, I see what you mean. And then there's the conversion from radix-10 to radix-2. One would need a hefty compile-time parser. Still, if I ever get that far, I would like to play around with constexpr and see what the language offers, maybe even in a non-anticipated way.

<snip>

...

Cheers, John.

participants (6)

Christopher Kormanyos
John Maddock
Marc Glisse
Peter Sommerlad
Steven Watanabe
Vicente J. Botet Escriba