C++ AMP is, in my opinion, a more promising proposal than SYCL. Developers opt for C++ AMP today. But both SYCL and C++ AMP are higher-level tools, and they have the disadvantages that any higher-level library has compared to a lower-level one.
How is SYCL a higher-level tool? Have a look at the provisional spec:
https://www.khronos.org/registry/sycl
It has equivalents of everything you find in Boost.Compute, *except* for
Gruenke,Matt
Moreover, they introduce the notion of command groups, and possibly other low-level features.
I have, and I judge it to be higher-level than OpenCL (or Boost.Compute). SYCL abstracts memory access through 'accessors', and I'm not sure that you can issue explicit memory copies in SYCL. Both OpenCL and Boost.Compute have explicit copy functions with synchronous or asynchronous semantics. This is also one of my major points of critique of C++ AMP: it is unclear when data is transferred between host and device.
In fact, that has been the primary factor fueling my interest over the years. But there are many systems that still don't support it, and it's only one solution to this problem.
Agreed.
I urge you to not open this can of worms.
I didn't mean to imply that it *needed* to have a backend for XYZ. I am merely *suggesting* backends such as a threadpool or possibly OpenMP. My point was about the design - that it should facilitate the addition of backends, in order to address both existing and future systems where OpenCL support is absent or inefficient.
Again, the key point is that the design should accommodate different backends. Whether a given backend is developed depends on whether there's enough interest for someone to write and maintain it. And perhaps some backends will exist only as proprietary patches maintained in private repositories of users. The main contribution of Boost.Compute would then be the framework, interface, and high-level algorithms.
Such multi-backend support would be at the STL level and could not go lower than that. Devices, command queues and contexts don't make any sense when thinking of an OpenMP backend, let alone a compute::vector with an associated compute::copy. The library you're asking for is very different from the library proposed here.
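To make the "STL level" point concrete, here is a minimal sketch of what backend selection could look like when it happens only at the algorithm interface. Everything in it is hypothetical: the namespace `sketch`, the tags `serial_backend` and `thread_backend`, and the function `backend_transform` are invented for illustration and are not part of Boost.Compute, OpenCL, SYCL or C++ AMP. Note that nothing resembling a device, context or command queue appears anywhere, which is exactly the limitation described above.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

namespace sketch {

// Hypothetical backend tags (illustration only, not a real library API).
struct serial_backend {};

struct thread_backend {
    unsigned workers;
    explicit thread_backend(unsigned w = 2) : workers(w) {}
};

// Serial backend: simply forwards to std::transform.
template <class T, class F>
void backend_transform(serial_backend, const std::vector<T>& in,
                       std::vector<T>& out, F f)
{
    out.resize(in.size());
    std::transform(in.begin(), in.end(), out.begin(), f);
}

// Thread backend: splits the input into contiguous chunks and
// processes each chunk on its own std::thread.
template <class T, class F>
void backend_transform(thread_backend b, const std::vector<T>& in,
                       std::vector<T>& out, F f)
{
    out.resize(in.size());
    const std::size_t n = in.size();
    const std::size_t workers = std::max(1u, b.workers);
    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        const std::size_t lo = std::min(n, w * chunk);
        const std::size_t hi = std::min(n, lo + chunk);
        if (lo == hi) break;  // no elements left for this worker
        pool.emplace_back([&in, &out, f, lo, hi] {
            std::transform(in.begin() + lo, in.begin() + hi,
                           out.begin() + lo, f);
        });
    }
    for (auto& t : pool) t.join();
}

} // namespace sketch
```

A caller would write `sketch::backend_transform(sketch::thread_backend(4), in, out, f)` and swap the first argument to change backends. The host/device transfers that OpenCL and Boost.Compute make explicit have no place in this interface, so an OpenCL backend behind it would have to hide them.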