
On 12/10/2010 22:34, Daniel Herring wrote:
Presuming a sufficiently advanced jit architecture,
- static code generation
  - may not have access to custom opcodes or hardware on the user's computer
  - may not be able to produce all optimal variants
  - cannot change after delivery
  - can only guess the actual runtime code paths
- a jit on the user's machine can support
  - local benchmarking and profiling
  - user-side optimization (including profile hints)
  - use an upgraded compiler
  - save compiled results
  - ...
Basically, think of it as moving the compiler from the developer's box to the user's executable. Both are "the same compiler", but the user's copy presumably knows more about the actual input data and hardware available.
Unlike Joel, I do see the advantages of using a JIT for high-performance computing, and the potential benefit of including such technology in NT2, although those benefits would be minimal in the market NT2 currently targets. I think, however, that this is getting way off-topic. We are only looking at submitting the SIMD abstraction layer to Boost, nothing more, and certainly not the whole of NT2. There is much more to a numerical computation library or runtime than just SIMD. People have expressed interest in using a SIMD abstraction layer to speed up uBLAS, but it has implications well beyond that: it can also give a significant speedup to text processing, compression algorithms, or whatever else you might think of.

I want to repeat what the SIMD library is, in case it hasn't been clear: it only provides an abstraction that models SIMD registers, along with tools to help packetize data into register-sized packets while taking care of alignment considerations — all while ensuring that operations on the packets translate directly into the expected SIMD instructions, for many different architectures on all major C++ compilers. As a Boost library, it intends to do only one job, in a generic and minimal fashion, and do it well.

NT2.SIMD or Boost.SIMD would take all its code generation decisions at compile time, according to what the compiler claims to support, like any regular C++ library. If users of the library want to take those decisions at runtime, it is up to them to JIT the code (which shouldn't be too difficult, since we aim to provide strong integration with clang), use a strategy similar to that of the Intel compiler (and, as already said in this thread, the library already integrates with that strategy very well), or use an environment or runtime built upon the library that already does it.