Re: [boost] Going forward with Boost.SIMD

18 Apr 2013

On Thu, 18 Apr 2013, Mathias Gaunard wrote:
...
Development of Boost.SIMD will still proceed, aiming for integration in 
Boost, but standardization appears to be definitely out of the question.
Any feedback on the API presented in the proposal is welcome.
<http://open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3571.pdf>
Copying here my earlier comments so they are in the same place as others'.
Some of them are only relevant for standardization, not for a boost library.

Hello,

a few comments while reading N3571.

pack<T,N> seems more similar to std::array than std::tuple to me. We could 
even dream of merging pack and array into a single type. One reason 
existing std::array implementations (at least those I know) do not use 
vector registers is the ABI. An efficient implementation would pass 
function arguments in vector registers, but that would for instance make 
plain x86, MMX, SSE, SSE2 and AVX five incompatible ABIs.

As much as possible, I would like to avoid having a different interface 
for vectors and scalars. We have std::min for scalars, we can overload it 
for vectors instead of having simd::min. We have ?: for scalars, you don't 
need to restrict yourself to a pure library, you can use ?: for vectors as 
well instead of if_else, like OpenCL (g++-4.8 also implements that). Why 
forbid & for logical? Doesn't hurt to make it equivalent to &&.
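
To make the "same interface for scalars and vectors" point concrete, here 
is a minimal sketch. The pack type below is a hypothetical stand-in built 
on std::array, not the proposal's type, and smaller() is an invented name. 
User code cannot add overloads to namespace std, so the sketch puts min 
next to pack and relies on argument-dependent lookup; a standard library 
could overload std::min directly.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for the proposal's pack<T, N>; a real
// implementation would map to vector registers.
template <class T, std::size_t N>
struct pack {
    std::array<T, N> lane;
};

// Elementwise min under the same name as the scalar function, so that
// argument-dependent lookup finds it for pack arguments.
template <class T, std::size_t N>
pack<T, N> min(const pack<T, N>& a, const pack<T, N>& b) {
    pack<T, N> r;
    for (std::size_t i = 0; i < N; ++i)
        r.lane[i] = std::min(a.lane[i], b.lane[i]);
    return r;
}

}  // namespace sketch

// Generic code written once for scalars and packs.
template <class V>
V smaller(const V& a, const V& b) {
    using std::min;    // used for scalars
    return min(a, b);  // the more specialized pack overload wins for packs
}
```

With this, smaller(3, 5) and smaller(a, b) on two packs go through the 
same spelling, which is the property I would like the interface to keep.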

For the logical class, you may want to consider SPARC VIS as an example 
that doesn't use the same registers.

Masking: it is a bit strange to be able to do pack<double> & int but not 
double & int. Currently in gcc we require that you (reinterpret) cast the 
pack<double> to a pack<some integer>, do the masking and go back. Not very 
important though.
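
A rough sketch of the "cast, mask, cast back" route, using std::array and 
memcpy in place of real vector types. The helper name mask_bits is made 
up, and the code assumes a 64-bit IEEE 754 double.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the bits of an array of doubles as
// 64-bit integers, apply a mask to each element, and convert back.
template <std::size_t N>
std::array<double, N> mask_bits(const std::array<double, N>& v,
                                std::uint64_t mask) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "sketch assumes a 64-bit double");
    std::array<std::uint64_t, N> bits;
    std::memcpy(bits.data(), v.data(), N * sizeof(double));    // "cast" to integers
    for (std::size_t i = 0; i < N; ++i)
        bits[i] &= mask;                                       // integer masking
    std::array<double, N> out;
    std::memcpy(out.data(), bits.data(), N * sizeof(double));  // and back
    return out;
}
```

Masking with 0x7FFFFFFFFFFFFFFF clears the sign bit of each element, i.e. 
an elementwise fabs under the IEEE 754 assumption.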

Any policy on reinterpret_cast-ing a pack to a pack of a different type?

The descriptions of some overloads of shuffle are hard to read: missing 
indices, missing F parameter.

aligned_malloc doesn't exist. aligned_alloc is C11. POSIX has 
posix_memalign. The closest in name is Microsoft's _aligned_malloc.
N3396 might be relevant here.
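
For reference, a small sketch of what a wrapper over one of the existing 
functions could look like. It assumes a POSIX system; alloc_aligned and 
is_aligned are invented names, not anything from the proposal.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free (POSIX, not ISO C++)

// Hypothetical wrapper; returns nullptr on failure. POSIX requires the
// alignment to be a power of two and a multiple of sizeof(void*).
inline void* alloc_aligned(std::size_t alignment, std::size_t size) {
    void* p = nullptr;
    if (posix_memalign(&p, alignment, size) != 0)
        return nullptr;
    return p;  // release with free()
}

// True if p is a multiple of alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Memory obtained this way is released with plain free(), unlike 
Microsoft's _aligned_malloc, which needs _aligned_free.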

template <class T, std::size_t N = unspecified>
struct alignas(sizeof(T) * N) pack

Do you really want to specify that large an alignment? You give examples 
with N=100...
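
Note that with T=float and N=100 the expression gives 400, which is not 
even a power of two, so it is not a valid alignment. One possible 
alternative, as a sketch only: round down to a power of two and cap at 
some maximum. The 32-byte cap (the width of an AVX register) and all 
names below are assumptions of mine.

```cpp
#include <cstddef>

// Largest power of two that does not exceed n (for n >= 1).
constexpr std::size_t floor_pow2(std::size_t n) {
    return n < 2 ? 1 : 2 * floor_pow2(n / 2);
}

// Assumed cap: 32 bytes, the width of an AVX register.
constexpr std::size_t max_vector_align = 32;

// Alignment for a pack: power of two, never above the cap.
template <class T, std::size_t N>
constexpr std::size_t pack_align() {
    return floor_pow2(sizeof(T) * N) < max_vector_align
               ? floor_pow2(sizeof(T) * N)
               : max_vector_align;
}

// Hypothetical pack with the capped alignment.
template <class T, std::size_t N>
struct alignas(pack_align<T, N>()) pack {
    T lane[N];
};

static_assert(alignof(pack<double, 2>) == 16, "16 bytes of data");
static_assert(alignof(pack<float, 100>) == 32, "capped instead of 400");
```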

Maybe operator[] const could return by value if it wants to?

Since the splat constructor is implicit, you may not need to document all 
the mixed operations.

Calling splat both the idea of copying a single element in all places, and 
the idea of converting elementwise, is confusing.

Any notion of a subvector?

For gather and others, the proposal accepts mixing vector sizes. However, 
for better performance, we will usually want to use vectors of the same 
size. Any convenient way, given a pack type, to ask for a signed integer 
pack type of the same size and number of elements?
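
What I have in mind is something like the following sketch. The names 
signed_of_size and as_signed_integer are invented for illustration, and 
pack is again a trivial stand-in.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Trivial stand-in for the proposal's pack.
template <class T, std::size_t N>
struct pack {
    T lane[N];
};

// Signed integer type with a given size in bytes.
template <std::size_t Bytes> struct signed_of_size;
template <> struct signed_of_size<1> { typedef std::int8_t type; };
template <> struct signed_of_size<2> { typedef std::int16_t type; };
template <> struct signed_of_size<4> { typedef std::int32_t type; };
template <> struct signed_of_size<8> { typedef std::int64_t type; };

// Hypothetical trait: same element size, same number of elements,
// signed integer elements.
template <class P> struct as_signed_integer;
template <class T, std::size_t N>
struct as_signed_integer<pack<T, N> > {
    typedef pack<typename signed_of_size<sizeof(T)>::type, N> type;
};

static_assert(std::is_same<as_signed_integer<pack<double, 2> >::type,
                           pack<std::int64_t, 2> >::value,
              "index pack for pack<double, 2>");
```

The result is the natural index type for a gather on the original pack, 
since both packs then fit in registers of the same width.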

Reduction: people sometimes come up with proposals for a variadic min/max, 
which might interact in funny ways.

cmath functions: it is not clear what signatures are supported, in 
particular for functions that use several types (ldexp has double and 
int). The list doesn't seem to exactly match cmath, actually. frexp takes 
an int*, does the vector version take a pointer to a vector, or some 
scatter-like vector?
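
To illustrate the first option (a pointer to a vector), here is a sketch 
with std::array standing in for pack. The name frexp_each is mine; I am 
not claiming this is the signature the proposal intends.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

namespace sketch {

// Stand-in for the proposal's pack.
template <class T, std::size_t N>
using pack = std::array<T, N>;

// One possible signature: exponents are written through a pointer to a
// pack of int, mirroring the scalar double frexp(double, int*).
template <std::size_t N>
pack<double, N> frexp_each(const pack<double, N>& x, pack<int, N>* exp) {
    pack<double, N> mantissa;
    for (std::size_t i = 0; i < N; ++i)
        mantissa[i] = std::frexp(x[i], &(*exp)[i]);
    return mantissa;
}

}  // namespace sketch
```

The scatter-like alternative would instead take a pack of N separate 
int* destinations, which is a quite different (and likely slower) 
operation, hence the question.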

Traits: are those supposed to be template aliases? Or to derive from what 
they are supposed to "return"? Or have a typedef ... type; inside?
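
The three possibilities, spelled out as a sketch. The trait names 
scalar_of and cardinal_of are placeholders chosen for this example and 
are not meant to match the names in the paper.

```cpp
#include <cstddef>
#include <type_traits>

// Trivial stand-in for the proposal's pack.
template <class T, std::size_t N>
struct pack {
    T lane[N];
};

// (a) A class with a nested "typedef ... type;".
template <class P> struct scalar_of;
template <class T, std::size_t N>
struct scalar_of<pack<T, N> > {
    typedef T type;
};

// (b) A template alias that names the type directly.
template <class P>
using scalar_of_t = typename scalar_of<P>::type;

// (c) A class that derives from what it "returns", here a constant.
template <class P> struct cardinal_of;
template <class T, std::size_t N>
struct cardinal_of<pack<T, N> > : std::integral_constant<std::size_t, N> {};
```

Users write scalar_of<P>::type, scalar_of_t<P> or cardinal_of<P>::value 
respectively, so the choice is visible in every piece of client code.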

For transform and accumulate, I have seen other proposals that specify new 
versions more in terms of permissions (what the compiler is allowed to do) 
and less implementation. Depending on the tag argument you pass to 
transform/accumulate, you give the compiler permission to reorder the 
operations, or do other transformations and it then deduces that it can 
parallelize and/or vectorize. Looks nice. Note that it doesn't contradict 
this proposal, simd::transform can always forward to std::transform(..., 
vectorizable_tag()) (or in the reverse direction).
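
A sketch of the forwarding I mean. vectorizable_tag and both namespaces 
are hypothetical; no such overload of std::transform exists today, so the 
tagged version lives in its own namespace here and simply falls back to 
the sequential algorithm.

```cpp
#include <algorithm>

namespace tagged {

// Hypothetical permission tag: the caller promises that the iterations
// are independent, so they may be reordered or vectorized.
struct vectorizable_tag {};

template <class In, class Out, class F>
Out transform(In first, In last, Out out, F f, vectorizable_tag) {
    // A real implementation could process several elements at once here;
    // this sketch just calls the sequential algorithm.
    return std::transform(first, last, out, f);
}

}  // namespace tagged

namespace simd_sketch {

// Stand-in for simd::transform: forwards to the tagged version.
template <class In, class Out, class F>
Out transform(In first, In last, Out out, F f) {
    return tagged::transform(first, last, out, f, tagged::vectorizable_tag());
}

}  // namespace simd_sketch
```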

constexpr, noexcept?

That's it for now :-)

-- 
Marc Glisse