
John, Ah, you probably misunderstood what I meant. Yes, I misunderstood that you changed your fixed int implementation to a new one. There are other similar examples, but basically my case is that to make the
integer type general purpose there are actually lots of these special case checks, each of which are there for a very good reason, but which collectively easily account for that 10% slowdown in my code compared to yours.
Well, as I noticed before it's not just 10%, because algorithm execution time should be subtracted. Nevertheless I completely understand that it is not possible to get as fast code as mine. I would consider generalized implementation that is 50% slower to be pretty good (but not 3 times slower as before). It seems that you manage to do that so accept my congratulations! I am willing to have a look at your implementation, is it already in sandbox? Again, another compromise: sign magnitude is actually very slightly slower
than the old 2's complement code for some operations, but has enough other advantages to make it worthwhile (so you sold me on that)
Could you list those? I am willing to test :-) It also has a template parameter that lets you tune the size of the
internal cache (the bit that's used before any memory allocation takes place).
I am not sure what is internal cache here. Need to have a look at your implementation. Did you implement two separate backends: stack allocated, heap allocated? Or did you come up with generalized implementation that uses fixed array for small numbers and switches to dynamic array if exceeds its capacity? Basically what I'm saying here, is there is no one true configuration,
that offers the very best performance in all cases, if there was, we'd probably all be using GMP ;-)
Completely agree. But from advanced user perspective it's good to be able to specify configuration (stack or heap allocation, limb size, etc). For example it is possible to rewrite all the Voronoi code using GMP almost without any performance lose. But this will require to split huge math expressions onto atomic formulas, thus reducing code readability and maintainability. That's the main reason I decided to switch to fixed bigint. PS, the next step is to start stealing some more special case code from the
SOC project.... I fully expect that to make your benchmarks run very slightly more slowly, but is necessary for other use cases...
I'd like to have a look at your backend implementation. It seems to be already well done from your Voronoi benchmark. So it will be good to move parts of that SOC projects to your current implementation. One of the useful things we can think of is FFT multiplication (which is fast for large numbers). For medium size numbers I would think of Karatsuba multiplication. I would suggest generalized multiplication that will combine those two plus usual quadratic multiplication for small numbers. I can help on testing those and would ask Arseniy if he's interested on migrating his FFT code. Does it sound good for you? Andrii Unsubscribe & other changes: http://lists.boost.org/** mailman/listinfo.cgi/boost<http://lists.boost.org/mailman/listinfo.cgi/boost>