
On 12/10/2010 22:34, Daniel Herring wrote:
Presuming a sufficiently advanced jit architecture,
- static code generation
  - may not have access to custom opcodes or hardware on the user's computer
  - may not be able to produce all optimal variants
  - cannot change after delivery
  - can only guess the actual runtime code paths
- a jit on the user's machine can support
  - local benchmarking and profiling
  - user-side optimization (including profile hints)
  - use an upgraded compiler
  - save compiled results
  - ...
Basically, think of it as moving the compiler from the developer's box to the user's executable. Both are "the same compiler", but the user's copy presumably knows more about the actual input data and hardware available.
Unlike Joel, I do see the advantages of using a JIT for high-performance computing, and the potential benefit of including such technology in NT2, although those benefits would be minimal in the market NT2 currently targets. I think, however, that this is getting way off-topic. We are only looking at submitting the SIMD abstraction layer to Boost, nothing more, and certainly not the whole of NT2. There is much more to a numerical computation library or runtime than just SIMD. People have expressed interest in using a SIMD abstraction layer to speed up uBLAS, but it has implications well beyond that: it can also give a significant speedup to text processing, compression algorithms, or whatever else you might think of.

I want to repeat what the SIMD library is, in case it hasn't been clear: it only provides an abstraction that models SIMD registers, along with tools to help packetize data into register-sized packets while taking care of alignment considerations — all while ensuring that operations on the packets translate directly into the expected SIMD instructions, for many different architectures on all major C++ compilers. As a Boost library, it intends to do only one job, in a generic and minimal fashion, and do it well.

NT2.SIMD or Boost.SIMD would take all its code generation decisions at compile time, according to what the compiler claims to support, like any regular C++ library. If users of the library want to take those decisions at runtime, it is up to them to JIT the code (which shouldn't be too difficult, since we aim to provide strong integration with clang), use a strategy similar to that of the Intel compiler (and, as already said in this thread, the library already integrates with that strategy very well), or use an environment or runtime built upon the library that already does it.