On Mon, Dec 29, 2014 at 11:53 AM, Thomas M wrote:
On 29/12/2014 17:56, Kyle Lutz wrote:
On Mon, Dec 29, 2014 at 1:21 AM, Thomas M wrote:
On 29/12/2014 03:23, Kyle Lutz wrote:
On Sun, Dec 28, 2014 at 6:16 PM, Gruenke,Matt wrote:
-----Original Message-----
From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Kyle Lutz
Sent: Sunday, December 28, 2014 20:36
To: boost@lists.boost.org List
Subject: Re: [boost] Synchronization (RE: [compute] review)
On Sun, Dec 28, 2014 at 4:46 PM, Gruenke,Matt wrote:
> My understanding, based on comments you've made to other reviewers, is
> that functions like boost::compute::transform() are asynchronous when
> the result is on the device, but block when the result is on the host.
> This is what I'm concerned about. Is it true?
Yes, this is correct. In general, algorithms like transform() are asynchronous when the input and output ranges are both on the device, and synchronous when one of the ranges is on the host. I'll work on better ways to allow asynchrony in the latter case. One of my current ideas is to add asynchronous memory-mapping support to the mapped_view class [1], which could then be used with any of the algorithms in an asynchronous fashion.
When you speak of input/output ranges on the host, to what kinds of iterators do you refer? Any kind of input/output iterator (e.g. iterators from a std:: container; I just tried a std::vector with boost::compute::transform, and it didn't compile when provided as the input range), or only iterators that are part of your library?
In general, all of the algorithms operate on Boost.Compute iterators rather than STL iterators. The main exception to this rule is the boost::compute::copy() algorithm, which copies between ranges of STL iterators on the host and Boost.Compute iterators on the device. Another exception is the boost::compute::sort() algorithm, which can directly sort ranges of random-access iterators on the host (as long as the data is contiguous, e.g. std::vector<T>). I am working to add more direct support for host ranges to the other algorithms. Currently the best way to use the Boost.Compute algorithms together with host memory is with the mapped_view [1] class.
I would find it very helpful to forbid invalid iterators as arguments as much as possible at compile time. For example, I can currently use a std::vector::iterator as the output range argument:
    std::vector<int> stdvec(100, 0);
    boost::compute::transform(some_begin(), some_end(), stdvec.begin(), ...);
I guess such a use would be invalid; if transform accepts only iterators from your library, then I suppose such compile-time checks should be possible for both input and output (including a targeted error message).
Fully agree. Boost.Compute already has an internal `is_device_iterator` trait which we could use to check for proper device iterators. I've opened an issue for this [1].

-kyle

[1] https://github.com/kylelutz/compute/issues/392
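
For illustration, the kind of compile-time check discussed above could look roughly like the sketch below. It is standard-library-only and is not Boost.Compute code: `fake_device_iterator`, `is_device_iterator_sketch`, `checked_transform`, and `demo_checked_transform` are hypothetical names invented for this example, and the internals of the real `is_device_iterator` trait are not shown in this thread. The stand-in "device" iterator simply wraps a host pointer so that the example can run anywhere.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a device-side iterator (NOT a Boost.Compute type).
// It wraps a plain host pointer so the sketch runs without any device.
template <class T>
struct fake_device_iterator {
    T* ptr;
};

// Hypothetical trait in the spirit of `is_device_iterator`:
// false for every type unless explicitly specialised.
template <class Iterator>
struct is_device_iterator_sketch : std::false_type {};

template <class T>
struct is_device_iterator_sketch<fake_device_iterator<T>> : std::true_type {};

// Hypothetical front end that rejects non-device iterators at compile time
// with a targeted message, for both the input and the output range.
template <class InputIt, class OutputIt, class UnaryOp>
OutputIt checked_transform(InputIt first, InputIt last, OutputIt result, UnaryOp op)
{
    static_assert(is_device_iterator_sketch<InputIt>::value,
                  "checked_transform: input range must use device iterators");
    static_assert(is_device_iterator_sketch<OutputIt>::value,
                  "checked_transform: output must be a device iterator");

    // Stand-in for the real work: apply op element by element.
    for (auto p = first.ptr; p != last.ptr; ++p, ++result.ptr) {
        *result.ptr = op(*p);
    }
    return result;
}

// Usage: doubles {1, 2, 3} through the checked front end.
std::vector<int> demo_checked_transform()
{
    std::vector<int> in{1, 2, 3};
    std::vector<int> out(in.size(), 0);

    fake_device_iterator<int> first{in.data()};
    fake_device_iterator<int> last{in.data() + in.size()};
    fake_device_iterator<int> dest{out.data()};

    checked_transform(first, last, dest, [](int x) { return x * 2; });

    // The next line would be rejected by the second static_assert,
    // because std::vector<int>::iterator is not marked as a device iterator:
    // checked_transform(first, last, out.begin(), [](int x) { return x * 2; });

    return out;
}
```

With a check of this shape, passing `stdvec.begin()` as the output argument fails at compile time with the message from the `static_assert`, instead of compiling and misbehaving later.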