
On Thu, Jun 10, 2010 at 2:49 AM, Chad Nelson <chad.thecomfychair@gmail.com> wrote:
On 06/09/2010 03:23 PM, DE wrote:
[...]
Unfortunately I don't see a way to safely and automatically provide entirely stack-based integers yet.
but now it's look like i need to do the following to achieve a fixed size int:
template<size_t size> class my_stack_allocator {/*...*/};
typedef xint::fixed_integer<512, my_stack_allocator<512> > int512;
in my opinion this is redudant since a the size of a given fixed integer is known in advance (at compile time) and it is intended to NOT allocate on the heap (as i understand it) and even not supposed to use an allocator
A stack-based integer's magnitude would always have to be stored directly adjacent to the integer itself on the stack, or risk having one deallocated before the other. That means no sharing of memory (even temporarily) for integers with identical magnitudes, and no moving magnitudes around by pointer-swapping -- every time you copy an integer for any reason, you'd have to copy each and every byte of the magnitude along with it. Swapping two 512-bit integers, just as an example, would mean three copies of 64+ bytes each, rather than a simple pointer swap.
I assume you wanted to use stack-based memory for speed, but due to the above, it would likely be slower than the equivalent heap-based integer. Especially with a good caching allocator.
In my experience, stack allocated memory easily beat heap allocation even for large buffer sizes. In the past I have used a fixed string type with a ridiculously large size for some specialized application and it outperformed the heap based string type it replaced (I didn't try to manually optimize copies away). The reason is that first of all modern cpu are pretty good at copying stuff around, much better than at chasing pointers, also stack allocated objects usually give more freedom to the compiler optimizer (more precise aliasing information and no globally visible side effects). Finally, 512 bit is not that anyway, on a modern x86 processor that's 4 registers or a single cacheline. Did you consider separating the big int data structure itself from the big int algorithms, a la STL? This way anybody can roll their own big int class according to their own need and you need to provide only a default one, optimized for the common case. I haven't read the code or documentation nor followed the discussion in depth, so this might actually already be the case. Just my 0.02€, -- gpd