Re: [boost] [compute] Some questions

23 Dec 2014

On Tue, Dec 23, 2014 at 1:20 AM, Andrey Semashev
<andrey.semashev@gmail.com> wrote:
...
Hi,
I have no experience with OpenCL or GPU computing in general, so bear
with me if my questions sound silly. I have a few questions regarding
Boost.Compute:
1. When you define a kernel (e.g. with the BOOST_COMPUTE_FUNCTION
macro), is this kernel supposed to be in C? Can it reference global
(namespace scope) objects and other functions? Other kernels?
Yes, the source code for OpenCL kernels and functions is specified in
OpenCL C which is a dialect of C99 with extensions for vectorized
operations.

There are a few ways to specify kernel functions which reference
global C++ values. One is the BOOST_COMPUTE_CLOSURE() macro [1], which
works similarly to BOOST_COMPUTE_FUNCTION() but also allows a
lambda-like capture list of C++ values.

Another option is to specify your function with extra arguments for
the global objects and then bind them to the function with
boost::compute::bind() [2].
...
2. When is the kernel compiled and uploaded to the device? Is it
possible to cache and reuse the compiled kernel?
If writing a custom kernel, the kernel is built when the
"program::build()" method is called. Internally, the higher-level
algorithms compile programs when they're needed and store them in a
global program cache.

And yes, compiled program and kernel objects can be stored and re-used
(this is strongly recommended). Boost.Compute provides the
program_cache class [3], which stores frequently used programs as
compiled objects.
...
3. Why is the library not thread-safe by default? I'd say, we're long
past single-threaded systems now, and having to always define the
config macro is a nuisance.
I would very much like to have it thread-safe by default. The problem,
however, is keeping the library header-only and usable with
C++03 compilers. The BOOST_COMPUTE_THREAD_SAFE macro basically just
instructs Boost.Compute to use the C++11 "thread_local" specifier for
global objects instead of "static". With C++03 compilers, this will
use boost::thread_specific_ptr<> which then requires users to also
link to Boost.Thread.

That said, I still don't think it's ideal and I am very open to
ideas/patches which improve this.
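
For readers unfamiliar with the difference, the following sketch shows
what switching a global object from "static" to "thread_local" changes.
It is plain C++11 and does not use Boost.Compute; scratch_state and the
helper functions are hypothetical stand-ins, not the library's internals.

```cpp
#include <thread>

// Hypothetical stand-in for a library-internal global object.
struct scratch_state { int uses = 0; };

// One instance shared by every thread (what "static" gives you).
inline scratch_state& shared_state()
{
    static scratch_state state;
    return state;
}

// One independent instance per thread (what "thread_local" gives you).
inline scratch_state& per_thread_state()
{
    thread_local scratch_state state;
    return state;
}

// Increments both counters from a fresh thread and reports the values
// that thread observed.
inline void touch_from_new_thread(int& shared_seen, int& local_seen)
{
    std::thread worker([&] {
        shared_seen = ++shared_state().uses;
        local_seen = ++per_thread_state().uses;
    });
    worker.join();
}
```

If the calling thread has already incremented both counters once, the
new thread sees the shared counter continue from that value while its
thread-local counter starts from zero. Building this typically needs
the compiler's thread support enabled (for example -pthread with GCC).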
...
4. Is it possible to upload the data to process to the device's local
memory from a user-provided buffer, without copying it to
boost::compute::vector? Same for downloading. What I'd like to do is
move some of data processing to the GPU while the rest is performed on
the CPU (possibly with other libraries), and avoid excessive copying.
Yes, that is what the mapped_view class [4] is for. It maps a region
of host-memory to device-memory and provides a std::vector-like
interface on top of it so it may be used with Boost.Compute algorithms
or custom kernels.
...
5. Is it possible to pass buffers in the device-local memory between
different processes (on the CPU) without downloading/uploading data
to/from the CPU memory?
This is not supported by OpenCL (at least not in any standard or
portable way). Memory buffers belong to OpenCL contexts, and contexts
are created per-process without any mechanisms to share them with
other processes.

If anyone has any experience/ideas with sharing OpenCL contexts
between processes I'd be very interested in trying to get this to work.
...
6. Is it possible to discover device capabilities? E.g. the amount of
local memory (total/used/free), execution units, vendor and device
name?
Yes, the device class [5] provides a number of methods for returning
information about the device including the generic get_info()
function.

Specifically for those cases you listed you could use:

* Local memory: device.local_memory_size()
* Execution units: device.compute_units()
* Vendor name: device.vendor()
* Device name: device.name()

Thanks for the questions. Let me know if I can explain anything better.

-kyle

[1] http://kylelutz.github.io/compute/BOOST_COMPUTE_CLOSURE.html
[2] http://kylelutz.github.io/compute/boost/compute/bind.html
[3] http://kylelutz.github.io/compute/boost/compute/program_cache.html
[4] http://kylelutz.github.io/compute/boost/compute/mapped_view.html
[5] http://kylelutz.github.io/compute/boost/compute/device.html

Kyle Lutz