
"Helge Bahmann" <hcb@chaoticmind.net> wrote in message news:alpine.DEB.1.10.1001191553030.25332@m65s28.vlinux.de...
On Mon, 18 Jan 2010, Chris M. Thomasson wrote:
"Helge Bahmann" <hcb@chaoticmind.net> wrote in message news:201001181953.33044.hcb@chaoticmind.net...
On Saturday 16 January 2010 03:15:44, Chris M. Thomasson wrote:
FWIW Helge, here is a new proxy garbage collector algorithm of mine that I am currently experimenting with:
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/a53f24de178b419f
interesting concept, but AFAICT it shares two undesirable properties with my implementation: Reclamation can be delayed indefinitely if
- one thread is "stuck" within the critical section (e.g. scheduled away)
Well, if a thread actually deadlocks or "permanently livelocks" within the critical section, then no reclamation can occur.
this means that reclamation (and by extension writing with bounded amount of memory) is not wait-free (this is not a problem I think, just pointing out)
Yes; very good point Helge. I simply _failed_ to clarify that the wait-free properties come from the fact that I am not using loops (e.g., a CAS loop) to acquire/release a reference to a collector object. This has been a point of interest in the past over on `comp.programming.threads'. Previous collectors used a CAS loop to acquire and/or release a reference. This does not work out too well for a reader-writer pattern because only one reader thread can gain access while the others fail their CAS and have to try again. This lock-free property can result in fairly poor performance for readers. Therefore, we have been coming up with wait-free techniques that allow readers to gain access without looping around in software.
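Here is a rough sketch of what I mean by the difference (this is NOT the actual collector code; I am just using a single global count and C++0x-style atomics purely for illustration):
__________________________________________________________________
#include <atomic>


static std::atomic<unsigned> g_refs(0);


// lock-free acquire: a CAS loop; under contention a reader can fail
// its CAS an unbounded number of times and has to retry.
void acquire_cas_loop()
{
    unsigned cmp = g_refs.load(std::memory_order_relaxed);

    while (! g_refs.compare_exchange_weak(cmp, cmp + 1,
                                          std::memory_order_acquire))
    {
        // spin; another reader won the race, try again...
    }
}


// wait-free acquire: a single fetch-and-add; every reader completes
// in a bounded number of steps regardless of contention.
void acquire_wait_free()
{
    g_refs.fetch_add(1, std::memory_order_acquire);
}
__________________________________________________________________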
- threads interlock such that at least one thread is always within the critical section
Right?
No. My proxy algorithm is N = 2, while yours is N = 1 where N is the number of collector objects. This means that you only have a _single_ reference count. During periods of load, this single reference count might never reach zero and the garbage will keep piling up.
[snip explanation]
Okay thanks for your explanation, I did not understand your code correctly the first time, but now things are much more clear.
That's great! :^)
The concept is very nice indeed, and I agree that N=1 and N=2 makes quite a difference. I *think* that your approach is much more valuable as a building block for "object-local RCU" for data structures with long-lived read access than ABA prevention for producer/consumer data structures (where the read-side critical section is very short).
I think I agree with that statement.
Thanks for sharing the code,
You are welcome! If you can think of any optimizations I would LOVE to hear all about them. One thing I can think of would be to remove the need for threads to actually destroy objects. This can be moved to a separate thread or the objects can simply be stuck in a cache to be reused. For instance, if the `proxy::release(...)' procedure was turned into a function which returned a pointer to a stack of nodes that are in a quiescent state, then the thread could decide what to do with them. I think that this would make the algorithm more flexible. Please note that this is only an initial implementation and it's more of a proof-of-concept than anything. There are probably many improvements that can be made... ;^)
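For instance, the altered interface might get used along these lines (just high-level pseudo-code of the idea; `node', `node_cache_push()' and the member names are hypothetical):
__________________________________________________________________
// hypothetical node type for deferred objects
struct node
{
    node* m_next;
    /* ... deferred object ... */
};


// sketch: release() now returns the head of a singly-linked stack of
// nodes that have passed through a full collection cycle.
node* quiescent = proxy.release(collector);

while (quiescent)
{
    node* next = quiescent->m_next;

    // the calling thread decides: destroy the node, hand it off to a
    // dedicated reclaimer thread, or push it onto a cache for reuse.
    node_cache_push(quiescent);

    quiescent = next;
}
__________________________________________________________________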
I can already think of an application for this in my own code :)
Cool. IMVHO, the proxy collector is fairly generic in that you can use it to solve memory reclamation in many different types of non-blocking algorithms and *usage patterns. I think that if Boost does include a non-blocking library, then something along the lines of the collector algorithm should probably be included.

[*] - I am experimenting with the collector using the following usage pattern. Imagine a database server that gets very many sustained queries on a regular basis. Now, reader threads receive search requests and then execute the query on a dynamic shared data-structure. Here is high-level pseudo-code of how I am constructing the logic of the reader threads:
__________________________________________________________________
#define SYNC_INTERVAL 256
#define DEFER_THRESHOLD 128


typedef proxy<DEFER_THRESHOLD> proxy_type;

static proxy_type g_proxy;


void reader_thread()
{
    proxy_type::collector* c = &g_proxy.acquire();

    for (unsigned i = 1 ;; ++i)
    {
        query_type* q = try_to_get_a_query_request();

        if (! q)
        {
            g_proxy.release(*c);
            q = wait_for_a_query_request();
            c = &g_proxy.acquire();
        }

        execute_query(q);

        if (! (i % SYNC_INTERVAL))
        {
            g_proxy.release(*c);
            c = &g_proxy.acquire();
        }
    }

    g_proxy.release(*c);
}
__________________________________________________________________


Instead of acquiring/releasing a collector object for every single execution of a query, this technique attempts to amortize the number of times a reader thread needs to acquire/release a collector.

Now, I can think of a possible optimization/enhancement to the proxy algorithm itself. Take note that an odd reference count denotes that a collection cycle is in progress. Therefore, if a reader thread can simply test whether the reference count of its acquired collector object is even, then it simply does not actually "need" to release a reference to it. If it's odd, then a release is necessary in order to move the current collection cycle along. This would allow the reader threads' acquire/release pattern to be amortized across the actual garbage deferment threshold. I think I am going to alter the algorithm in order to support the following usage pattern:
__________________________________________________________________
#define DEFER_THRESHOLD 256


typedef proxy<DEFER_THRESHOLD> proxy_type;

static proxy_type g_proxy;


void reader_thread()
{
    proxy_type::collector* c = &g_proxy.acquire();

    for (;;)
    {
        query_type* q = try_to_get_a_query_request();

        if (! q)
        {
            g_proxy.release(*c);
            q = wait_for_a_query_request();
            c = &g_proxy.acquire();
        }

        execute_query(q);

        if (c->is_collecting())
        {
            g_proxy.release(*c);
            c = &g_proxy.acquire();
        }
    }

    g_proxy.release(*c);
}
__________________________________________________________________


This would allow reader threads to churn along during periods of load with fairly low overhead. I say this because the only time the readers would have to release their collector object and re-acquire it is when they must explicitly wait for query requests, or when the garbage threshold was reached and a collection cycle was initiated by a writer thread.

There are some fairly promising enhancements I think I can make to the collector, and that is one of them.

Humm...
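BTW, to make the even/odd test concrete, `is_collecting()' could boil down to something as simple as this (a sketch only; the member name `m_count' and the use of C++0x-style atomics are my own shorthand here, not the posted implementation):
__________________________________________________________________
#include <atomic>


struct collector
{
    // an odd value means a collection cycle is in progress
    std::atomic<unsigned> m_count;

    bool is_collecting() const
    {
        // readers only need to release/re-acquire when the count is odd
        return (m_count.load(std::memory_order_relaxed) & 1) != 0;
    }
};
__________________________________________________________________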