
On Fri, Sep 2, 2011 at 12:44 PM, Nevin Liber <nevin@eviloverlord.com> wrote:
On 2 September 2011 11:01, Ilya Bobir <ilya.bobir@gmail.com> wrote:
unsigned int next_id()
{
    static unsigned int previous_id = 0; // 0 is not assigned to a type
    ++previous_id;
    return previous_id;
}
[...]
Would not this be non-thread safe?
That's the second problem that has to be tackled with this code.
Would this work:
unsigned next_id() { static std::atomic<unsigned> previous_id; return ++previous_id; }
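For context, an atomic counter like the one quoted above would usually be paired with a per-type function-local static, so that each type draws its id exactly once. The following is only a minimal sketch under that assumption, not code from the proposed library: `type_id_of` and `demo_ids` are hypothetical names, and the once-only initialization relies on the C++11 rule that function-local statics are initialized in a thread-safe way, which pre-C++11 compilers may not provide.

```cpp
#include <atomic>
#include <cassert>

// Hands out 1, 2, 3, ...; 0 stays reserved for "no type assigned".
inline unsigned next_id()
{
    static std::atomic<unsigned> previous_id(0);
    return ++previous_id;  // atomic pre-increment returns the new value
}

// Hypothetical helper: every distinct T gets its own static `id`,
// initialized once on first use.
template <class T>
unsigned type_id_of()
{
    static const unsigned id = next_id();
    return id;
}

// Usage sketch: ids are stable per type and distinct across types.
inline void demo_ids()
{
    assert(type_id_of<int>() != 0);
    assert(type_id_of<int>() == type_id_of<int>());
    assert(type_id_of<int>() != type_id_of<double>());
}
```

Note that the ids depend on the order in which types are first used, so they are only meaningful within a single run of a single program.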
It is not exactly the second problem. The only reason for the new library is speed, but if one uses atomics or full-fledged locking it may become slower than the current boost::any. If that is the case, it does not make any sense to look at the new library at all.

What follows is a benchmark of Boost.Any and the new any with some tweaks. I have included the actual output so that anyone interested may double-check my logic, but I have also summarized the numbers after every run, so if you read just the text and skip the benchmark output you should get the picture anyway. And there is a summary in the last two paragraphs if you really want to look just at the end result.

OK, so I started this because I was wondering how it is possible that we can skip a virtual function call when we need to figure out our real type id. Maybe I missed an explanation somewhere earlier in the thread, but there is a tradeoff going on: we increase the size of the any instances and store the type id directly instead of relying on a virtual function call to figure it out. And this tradeoff is, IIUC, unrelated to the way we actually tag types.

I have run the benchmark on my machine and figured out that the 50 000% gain on MSVC is a result of an optimization. After "fixing" the benchmark a little (attached as any_benchmark.cpp), MS C++ gives me numbers of the same order of magnitude as GCC. I was compiling against Boost 1.44.
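To make the tradeoff concrete, here is a minimal sketch of the two layouts. It is not the code of Boost.Any or of the proposed library: all names are hypothetical, and the type tags are hand-assigned through `tag_of` purely for illustration, since how the tags are generated is a separate question.

```cpp
#include <cassert>
#include <typeinfo>

// Layout 1: a single pointer. The held type is discovered through a
// virtual call on the held object.
struct placeholder {
    virtual ~placeholder() {}
    virtual const std::type_info& type() const = 0;
};

template <class T>
struct holder : placeholder {
    explicit holder(const T& v) : value(v) {}
    const std::type_info& type() const { return typeid(T); }
    T value;
};

struct one_pointer_any { placeholder* content; };

// Layout 2: the tag is stored next to the pointer, so the check is a
// plain integer comparison, at the cost of a larger object.
struct two_field_any { unsigned id; placeholder* content; };

// Hypothetical, hand-assigned tags (illustration only).
template <class T> unsigned tag_of();
template <> inline unsigned tag_of<int>()    { return 1; }
template <> inline unsigned tag_of<double>() { return 2; }

template <class T>
T* cast_via_virtual(const one_pointer_any& a)
{
    if (a.content && a.content->type() == typeid(T))  // virtual call
        return &static_cast<holder<T>*>(a.content)->value;
    return nullptr;
}

template <class T>
T* cast_via_stored_id(const two_field_any& a)
{
    if (a.content && a.id == tag_of<T>())             // no virtual call
        return &static_cast<holder<T>*>(a.content)->value;
    return nullptr;
}

inline void demo_layouts()
{
    holder<int> h(42);
    one_pointer_any a = { &h };
    two_field_any b = { tag_of<int>(), &h };

    assert(*cast_via_virtual<int>(a) == 42);
    assert(cast_via_virtual<double>(a) == nullptr);
    assert(*cast_via_stored_id<int>(b) == 42);
    assert(cast_via_stored_id<double>(b) == nullptr);
    assert(sizeof(b) > sizeof(a));  // the price of the stored tag
}
```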
Here is the output on my box:

$ g++ --version
g++ (GCC) 4.3.4 20090804 (release) 1

Ilya@Ilya-PC ~/works/tests/any
$ g++ -O3 -I /d/works/boost/ any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./a.exe
Testing any with int
copying: old any: 249 new any: 250
moving: old any: 281 new any: 452
any_cast: old any: 187 new any: 63
Testing any with double
copying: old any: 234 new any: 249
moving: old any: 312 new any: 437
any_cast: old any: 187 new any: 47
Testing any with std::string
copying: old any: 265 new any: 265
moving: old any: 344 new any: 483
any_cast: old any: 203 new any: 47
sizeof(old any): 4
sizeof(new any): 8

Moving is actually ~30% slower in the new version, but any_cast is ~4 times faster. And the instance size is doubled.

Ilya@Ilya-PC ~/works/tests/any
$ cl
Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

This is the MS Visual Studio 2010 compiler.

Ilya@Ilya-PC ~/works/tests/any
$ cl -EHs -O2 -I 'D:\works\boost' any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./any_benchmark.exe
Testing any with int
copying: old any: 61 new any: 61
moving: old any: 75 new any: 102
any_cast: old any: 477 new any: 59
Testing any with double
copying: old any: 61 new any: 61
moving: old any: 75 new any: 104
any_cast: old any: 476 new any: 59
Testing any with std::string
copying: old any: 78 new any: 79
moving: old any: 88 new any: 124
any_cast: old any: 480 new any: 58
sizeof(old any): 8
sizeof(new any): 16

Note that this is a 64-bit compiler. Move is ~30% slower, any_cast is ~8 times faster. Instance size is doubled.

Now about the generation of ids for types. If RTTI is available, typeid(T).name() will be different for all types, and at the same time, for a given T, it returns a pointer that would have the same value across all compilation units (not considering dynamic libraries) and will be thread safe. So, I replaced the unsigned integers with const char pointers (any.typeid.name.hpp).
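The idea of tagging by name pointer can be sketched roughly as follows. This is not the contents of any.typeid.name.hpp: `type_tag`, `tagged_ptr`, `make_tagged` and `tagged_cast` are hypothetical names, and the sketch assumes, as the paragraph above does, that the implementation returns the same pointer for the same type, which is implementation behaviour rather than something the code can enforce.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical tag: the address of the implementation's name string.
typedef const char* type_tag;

template <class T>
type_tag tag_of()
{
    return typeid(T).name();
}

// Hypothetical any-like header: a tag plus an untyped payload pointer.
struct tagged_ptr {
    type_tag id;
    void* data;
};

template <class T>
tagged_ptr make_tagged(T& value)
{
    tagged_ptr p = { tag_of<T>(), &value };
    return p;
}

template <class T>
T* tagged_cast(const tagged_ptr& p)
{
    // Compares the pointers only; the strings are never inspected.
    return p.id == tag_of<T>() ? static_cast<T*>(p.data) : nullptr;
}

inline void demo_name_tags()
{
    int i = 7;
    tagged_ptr p = make_tagged(i);
    assert(tagged_cast<int>(p) == &i);
    assert(tagged_cast<double>(p) == nullptr);
}
```

No counter is involved, so there is no shared mutable state to protect; the cost is the dependency on RTTI.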
Here are the numbers:

Ilya@Ilya-PC ~/works/tests/any
$ g++ -O3 -I /d/works/boost/ any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./a.exe
Testing any with int
copying: old any: 234 new any: 249
moving: old any: 297 new any: 452
any_cast: old any: 187 new any: 63
Testing any with double
copying: old any: 249 new any: 250
moving: old any: 296 new any: 453
any_cast: old any: 187 new any: 62
Testing any with std::string
copying: old any: 250 new any: 249
moving: old any: 344 new any: 468
any_cast: old any: 187 new any: 62
sizeof(old any): 4
sizeof(new any): 8

For the new version of the any_cast test I was getting "62" or "47" as the average time for both the unsigned integer and the char pointer versions, depending on the run, so for GCC this change did not actually affect the run time. At the same time this version is thread safe, but requires typeid to be accessible.

Ilya@Ilya-PC ~/works/tests/any
$ cl -EHs -O2 -I 'D:\works\boost' any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./any_benchmark.exe
Testing any with int
copying: old any: 61 new any: 67
moving: old any: 79 new any: 115
any_cast: old any: 488 new any: 95
Testing any with double
copying: old any: 65 new any: 65
moving: old any: 78 new any: 115
any_cast: old any: 491 new any: 94
Testing any with std::string
copying: old any: 79 new any: 80
moving: old any: 90 new any: 132
any_cast: old any: 486 new any: 93
sizeof(old any): 8
sizeof(new any): 16

64-bit cl, on the other hand, is consistently ~30% slower for the char pointer case. I guess it is again some kind of optimization, but I did not look at the generated code. The new version of any_cast is still ~5 times faster.

Then I thought, well, why not use the type_info objects themselves? They are guaranteed to exist through the application lifetime. The id is now "const std::type_info *".
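A rough sketch of what storing the type_info pointer could look like is given below. It is not the contents of any.typeid.hpp: `typed_ptr`, `make_typed` and `typed_cast` are hypothetical names. The sketch tries the cheap pointer comparison first and falls back to `std::type_info::operator==`, on the assumption that an implementation might hand out more than one type_info object for the same type (for example across dynamic libraries).

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical any-like header: the tag is the type_info object itself.
struct typed_ptr {
    const std::type_info* id;
    void* data;
};

template <class T>
typed_ptr make_typed(T& value)
{
    // The object typeid refers to lives until the end of the program,
    // so keeping its address is safe.
    typed_ptr p = { &typeid(T), &value };
    return p;
}

template <class T>
T* typed_cast(const typed_ptr& p)
{
    // Fast path: same object. Slow path: compare the type_info objects.
    const bool same = (p.id == &typeid(T)) || (*p.id == typeid(T));
    return same ? static_cast<T*>(p.data) : nullptr;
}

inline void demo_type_info_tags()
{
    int i = 7;
    typed_ptr p = make_typed(i);
    assert(typed_cast<int>(p) == &i);
    assert(typed_cast<double>(p) == nullptr);
}
```

With the fallback in place a failed cast pays for a full type_info comparison; dropping it gives the pure pointer comparison, at the price of trusting the implementation to keep one object per type.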
Here are the numbers (any.typeid.hpp):

Ilya@Ilya-PC ~/works/tests/any
$ g++ -O3 -I /d/works/boost/ any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./a.exe
Testing any with int
copying: old any: 249 new any: 250
moving: old any: 281 new any: 436
any_cast: old any: 188 new any: 62
Testing any with double
copying: old any: 234 new any: 250
moving: old any: 296 new any: 421
any_cast: old any: 187 new any: 63
Testing any with std::string
copying: old any: 249 new any: 266
moving: old any: 343 new any: 468
any_cast: old any: 187 new any: 47
sizeof(old any): 4
sizeof(new any): 8

No change from the char pointer case for GCC.

Ilya@Ilya-PC ~/works/tests/any
$ cl -EHs -O2 -I 'D:\works\boost' any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./any_benchmark.exe
Testing any with int
copying: old any: 60 new any: 61
moving: old any: 74 new any: 103
any_cast: old any: 464 new any: 59
Testing any with double
copying: old any: 60 new any: 60
moving: old any: 74 new any: 102
any_cast: old any: 467 new any: 59
Testing any with std::string
copying: old any: 77 new any: 79
moving: old any: 88 new any: 124
any_cast: old any: 461 new any: 58
sizeof(old any): 8
sizeof(new any): 16

MS C++, on the other hand, was able to perform better than with the char pointer. Essentially the picture is the same as for the unsigned int case.

OK, so there is a trade-off that can be made: any_cast becomes 4 to 8 times faster at the price of making move ~30% slower and increasing the size of any instances from one to two pointers.
Let's try doing that with the Boost.Any source (boost.any.doubleSize.hpp):

Ilya@Ilya-PC ~/works/tests/any
$ g++ -O3 -I /d/works/boost/ any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./a.exe
Testing any with int
copying: old any: 234 new any: 249
moving: old any: 297 new any: 421
any_cast: old any: 187 new any: 62
Testing any with double
copying: old any: 234 new any: 250
moving: old any: 296 new any: 437
any_cast: old any: 172 new any: 62
Testing any with std::string
copying: old any: 250 new any: 281
moving: old any: 327 new any: 468
any_cast: old any: 187 new any: 47
sizeof(old any): 8
sizeof(new any): 8

No change?! I guess GCC can optimize to a level where it does not care about a change like this.

Ilya@Ilya-PC ~/works/tests/any
$ cl -EHs -O2 -I 'D:\works\boost' any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./any_benchmark.exe
Testing any with int
copying: old any: 61 new any: 64
moving: old any: 78 new any: 110
any_cast: old any: 125 new any: 57
Testing any with double
copying: old any: 63 new any: 63
moving: old any: 77 new any: 107
any_cast: old any: 124 new any: 57
Testing any with std::string
copying: old any: 80 new any: 81
moving: old any: 91 new any: 128
any_cast: old any: 127 new any: 57
sizeof(old any): 16
sizeof(new any): 16

MS C++, on the other hand, does care. Note that the change only affects the any_cast speed (almost 4 times faster); neither copy nor move is affected.

So, it seems that for GCC (at least 4.3.4 on Cygwin at -O3) the difference has nothing to do with the way we store and retrieve the type information.
Let's compare just Boost.Any with and without the patch side by side (boost.any_benchmark.cpp):

Ilya@Ilya-PC ~/works/tests/any
$ g++ -O3 -I /d/works/boost/ boost.any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./a.exe
Testing any with int
copying: old any: 249 new any: 234
moving: old any: 297 new any: 296
any_cast: old any: 187 new any: 172
Testing any with double
copying: old any: 249 new any: 234
moving: old any: 297 new any: 296
any_cast: old any: 187 new any: 188
Testing any with std::string
copying: old any: 265 new any: 249
moving: old any: 328 new any: 343
any_cast: old any: 187 new any: 172
sizeof(old any): 4
sizeof(new any): 8

Same times for GCC. Only the instance sizes are different.

Ilya@Ilya-PC ~/works/tests/any
$ cl -EHs -O2 -I 'D:\works\boost' boost.any_benchmark.cpp

Ilya@Ilya-PC ~/works/tests/any
$ ./boost.any_benchmark.exe
Testing any with int
copying: old any: 60 new any: 61
moving: old any: 75 new any: 74
any_cast: old any: 459 new any: 125
Testing any with double
copying: old any: 60 new any: 62
moving: old any: 73 new any: 73
any_cast: old any: 465 new any: 122
Testing any with std::string
copying: old any: 78 new any: 78
moving: old any: 87 new any: 88
any_cast: old any: 462 new any: 124
sizeof(old any): 8
sizeof(new any): 16

MS C++, on the other hand, likes the change a lot: an almost 4 times speed increase for any_cast with no speed changes for the other operations.

I guess that now it is time to look at the generated code to figure out why GCC does not care about this change, or what else is different between Boost.Any and the version presented by Martin, but I will stop here for now.

The bottom line: there is a change that makes any_cast almost 4 times faster for MS C++, does not change the speed on GCC, and increases the size of all any instances from one pointer to two pointers. The change is thread safe.
I think that in order to compare apples to apples one needs to change the "unsigned integers as type ids" case to be thread safe and only then benchmark. A non-thread-safe any is probably of limited use, isn't it?

Ilya Bobyr