On Sun, Dec 21, 2014 at 1:45 PM, Pavan Yalamanchili
*Design and Implementation*
The library provides an STL-like interface for OpenCL devices. Although there are other libraries offering similar behavior, Boost.Compute is the most complete and the most generalized.
The directory structure is well organized and easy to dig through to understand the implementation or debug an issue.
Others have pointed out the existence of the OpenCL C++ wrapper and the fact that Compute library does not use it. Here are a few reasons why I think the author was *right* in his design decisions.
- The C++ wrapper begins to look similar to the original OpenCL C interface once you start using it in a generalized fashion.
- cl.hpp is monolithic (single file 12k+ LOC) and is not modular enough to only include the parts that you need.
- Compute and cl.hpp have two different goals. The biggest selling point of Compute is the set of algorithms it supports, not its C++ wrapper around OpenCL.
Having said that, I think the Compute library might be better off with an additional interface that accepts the OpenCL-C++ wrapped objects.
I completely agree. This has been the main issue raised so far during the review. I've been working towards simplifying interoperability between Boost.Compute and the Khronos C++ wrappers, and I should hopefully have something ready for testing within the next week or two.
*Documentation*
The documentation is comprehensive and the API is documented well. There are a few improvements that could be made.
For example, the reference is at the bottom of the TOC and is hard to see immediately. The API reference in the TOC can also be expanded out a bit more to show the general categories of algorithms that are supported.
Will do. And definitely point me towards any other areas of the documentation that you think need work.
*Potential Usefulness*
It is fairly easy to transition from applications using vector algorithms in STL to Compute. The wide availability of OpenCL devices makes the library useful to a large user base.
*Domain Knowledge*
I am a developer of the ArrayFire library[1]. I am the lead engineer of a company that specializes in this domain. I consider myself to be fairly knowledgeable in this domain.
*Experience with the library*
ArrayFire depends on Boost.Compute for a few algorithms in our OpenCL backend. We explicitly and implicitly (via ArrayFire) test Boost.Compute on a variety of hardware / compilers / operating systems.
Compilers and Operating systems we use:
- gcc 4.8, 4.9 on various Linux distributions
- clang 3.4 on OSX
- Visual Studio 2013 on Windows
We have found some bugs through our usage that have been mostly resolved by the author or by patches sent by us.
Thanks for testing Boost.Compute so rigorously! And thanks especially for reporting bugs and submitting patches!
*Conclusion*
The experience overall has been fairly positive. But there are certainly *some* rough edges that need resolving.
An issue that will plague any OpenCL library is performance portability across various devices. It would be nice if the author could talk about how he plans to eventually address this issue.
Recently I have been working on infrastructure for automatic parameter-tuning which will allow algorithms to better adapt and optimize themselves for the underlying hardware (currently via a manually run tuning script which caches the optimal kernel execution parameters on disk). This should hopefully be ready for testing soon. Other ideas include developing more optimized kernels for specific hardware configurations. For example, the reduce() algorithm currently uses a warp-synchronous reduction kernel on NVIDIA hardware, which improves performance by 5-10% over the generic version. In the future I plan on improving other algorithms to detect and optimize themselves better for the underlying hardware (all while keeping the same user-facing, high-level interface). I'm also planning on specializing some of the core algorithms to automatically take advantage of some of the new built-in work-group reduction and scan functions from OpenCL 2.0.
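To make the tune-then-cache idea above concrete, here is a minimal C++ sketch of how measuring a few candidate parameter values and caching the winner on disk could look. It is not Boost.Compute's tuning code: every name (`pick_best`, `tuned_parameter`, `TuningCache`), the plain-text cache format, and the key scheme are assumptions made for illustration, and `measure` is only a stand-in for "run the kernel with this parameter and return the elapsed time".

```cpp
#include <cassert>
#include <fstream>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch; none of these names come from Boost.Compute.

// Cache of tuned values: key (e.g. algorithm + device) -> best parameter.
using TuningCache = std::map<std::string, unsigned>;

// Runs `measure` once per candidate and returns the candidate with the
// lowest reported cost (ties keep the earlier candidate).
inline unsigned pick_best(const std::vector<unsigned>& candidates,
                          const std::function<double(unsigned)>& measure)
{
    assert(!candidates.empty());
    unsigned best = candidates.front();
    double best_cost = measure(best);
    for (auto it = candidates.begin() + 1; it != candidates.end(); ++it) {
        const double cost = measure(*it);
        if (cost < best_cost) {
            best_cost = cost;
            best = *it;
        }
    }
    return best;
}

// Reads "key value" pairs, one per line. A missing file yields an empty cache.
// Assumption: keys contain no whitespace.
inline TuningCache load_cache(const std::string& path)
{
    TuningCache cache;
    std::ifstream in(path);
    std::string key;
    unsigned value = 0;
    while (in >> key >> value) {
        cache[key] = value;
    }
    return cache;
}

// Rewrites the whole cache file with the current contents.
inline void save_cache(const std::string& path, const TuningCache& cache)
{
    std::ofstream out(path);
    for (const auto& entry : cache) {
        out << entry.first << ' ' << entry.second << '\n';
    }
}

// Returns the cached parameter for `key`; on a cache miss it measures all
// candidates, stores the winner on disk, and returns it.
inline unsigned tuned_parameter(const std::string& path,
                                const std::string& key,
                                const std::vector<unsigned>& candidates,
                                const std::function<double(unsigned)>& measure)
{
    TuningCache cache = load_cache(path);
    const auto found = cache.find(key);
    if (found != cache.end()) {
        return found->second;
    }
    const unsigned best = pick_best(candidates, measure);
    cache[key] = best;
    save_cache(path, cache);
    return best;
}
```

In a real setting `measure` would wrap a timed kernel launch with the given work-group size (or similar parameter), and the key would need to identify both the algorithm and the device so that values tuned on one device are not reused on another.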
Overall, Boost.Compute will be a great addition to Boost. But before it is accepted the following issues need to be addressed.
- The tests need to be a bit more comprehensive.
- Due to the general nature of OpenCL, there needs to be a list of "officially supported" devices.
- Make sure all the tests are passing on the supported devices.
Thanks! I'll definitely work on improving the test-suite. Also, there is currently a list of supported platforms here [1] (though it doesn't yet have a specific list of devices).
Thanks for the review!
-kyle
[1] http://kylelutz.github.io/compute/boost_compute/platforms_and_compilers.html