On 30/12/14 14:48, Gruenke,Matt wrote:
> -----Original Message-----
> From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Thomas M
> Sent: Tuesday, December 30, 2014 7:37
> To: boost@lists.boost.org
> Subject: Re: [boost] Synchronization (RE: [compute] review)
>> If you are going to implement such RAII guards, here's a short wish-list
>> of features / guard classes:
>>
>> a) make guards "transferable" across functions
>
> I agree they should be movable, but it makes no sense for them to be
> copyable.
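A minimal sketch of such a movable, non-copyable guard over a single event
(the class and member names below are purely illustrative, not part of
Boost.Compute):

#include <boost/compute/event.hpp>
#include <utility>

class event_guarantee
{
public:
    explicit event_guarantee(boost::compute::event e)
        : m_event(std::move(e)), m_active(true) {}

    // movable: responsibility for the wait transfers; the moved-from
    // guard becomes inert
    event_guarantee(event_guarantee&& other) noexcept
        : m_event(std::move(other.m_event)), m_active(other.m_active)
    {
        other.m_active = false;
    }

    // non-copyable: two owners blocking on the same completion makes no sense
    event_guarantee(const event_guarantee&) = delete;
    event_guarantee& operator=(const event_guarantee&) = delete;

    // block until the associated command has completed
    ~event_guarantee()
    {
        if (m_active)
            m_event.wait();
    }

private:
    boost::compute::event m_event;
    bool m_active;
};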
>> b) a container of guards and/or a guard for a wait_list as a whole
>
> Hmmm... I can see the benefits (convenience). I'd make it a different
> type, though. I assume it should hold a reference to the list? Since the
> guarantee is designed to block when the wait_list goes out of scope, I
> think it's reasonable to assume its scope is a superset of the
> guarantee's.
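For b), a sketch of a guarantee that just holds a reference to the
wait_list and blocks in its destructor (again, the names are illustrative
only, not the library's API):

#include <boost/compute/utility/wait_list.hpp>

class wait_list_guarantee
{
public:
    explicit wait_list_guarantee(boost::compute::wait_list& wl)
        : m_wait_list(wl) {}

    wait_list_guarantee(const wait_list_guarantee&) = delete;
    wait_list_guarantee& operator=(const wait_list_guarantee&) = delete;

    // blocks until every event currently in the referenced wait_list has
    // completed; the wait_list must outlive the guarantee, as assumed above
    ~wait_list_guarantee()
    {
        m_wait_list.wait();
    }

private:
    boost::compute::wait_list& m_wait_list;
};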
>> c) a guard for a command-queue as a whole [possibly guards for other
>> classes as well]
>
> Why? Convenience? Unless you're using it as a shorthand for waiting on
> individual events or wait_lists, there's no need. The command_queue is
> internally reference counted. When the refcount goes to zero, the
> destructor will block on all outstanding commands.
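For completeness, an explicit finish() at the end of the scope already gives
the same effect without any guard class (a sketch; cntx, dev, devmem, size
and host_ptr are the placeholders used elsewhere in this thread):

{
    boost::compute::command_queue cq(cntx, dev);

    cq.enqueue_write_buffer_async(devmem, 0, size, host_ptr);
    // ... enqueue more asynchronous work ...

    cq.finish();  // blocks until all commands submitted to cq have completed
}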
>> a) + b) because something like this is really useful:
>
> Um... how about this:
> void foo()
> {
>     // setup all memory objects etc.
>
>     wait_list wl;
>     wait_list::guarantee wlg(wl);
>
>     // send data to device
>     wl.insert(cq.enqueue_write_buffer_async(devmem, 0, size, host_ptr));
>     wl.insert(cq.enqueue_write_buffer_async(devmem2, 0, size, host_ptr2));
>
>     // a kernel that reads devmem and devmem2 and writes to devmem
>     wl.insert(cq.enqueue_task(kern, wl));  // Note: wl is copied by enqueue funcs
>
>     // copy result back to host
>     wl.insert(cq.enqueue_read_buffer_async(devmem, 0, size, host_ptr, wl));
>
>     // wl.wait() would only be necessary if you wanted to access the results, here.
>
>     // Enqueue an independent set of operations with another wait_list
>     wait_list wl_b;
>     wait_list::guarantee wlg_b(wl_b);
>
>     // send data to device
>     wl_b.insert(cq.enqueue_write_buffer_async(devmem_b, 0, size_b, host_ptr_b));
>
>     // ...
> }
Maybe you can follow the task_region design (See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4088.pdf).
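For reference, the interface proposed in N4088 looks roughly like this
(task_region is only a proposal, not part of the standard library; the
namespace below is the one used in the paper):

void example()
{
    using namespace std::experimental::parallel;  // as proposed in N4088

    task_region([&](task_region_handle& trh)
    {
        trh.run([&]{ /* asynchronous work */ });
        trh.run([&]{ /* more asynchronous work */ });
        // task_region() does not return until all work spawned via trh
        // has finished
    });
}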
>> With c) I have something like this in mind:
>
> What about this?
>
> {
>     command_queue cq(cntx, dev);
>     command_queue::guarantee cqg(cq);
>
>     cq.enqueue_write_buffer_async(devmem, 0, size, host_ptr);
>     transform(..., cq);  // implicitly async
>     cq.enqueue_read_buffer_async(...);
>
>     // here automatic synchronization occurs
> }
>
> It does presume that command_queues are local and tied to related batches
> of computations. Those assumptions won't always hold.

The same suggestion applies here.
Best,
Vicente