
On 09/18/07 09:22, Achilleas Margaritis wrote:
Larry Evans wrote: [snip]
How does this collector determine whether a pointer in the database is a root pointer? Wouldn't that method have to be non-portable or at least dispatch on the OS type to the appropriate method?
The collector keeps internally a map of memory page information structure allocated on demand from the heap. Each page information structure contains:
1) a bit map of pointers. 2) a heap reference count.
When a heap block is allocated on a page, the page's reference count is increased. When a heap block is freed, the page's reference count is decreased.
At first I thought, "make perfect sense", then I thought "if a page is on the heap, it remains there; so, why not use a bitflag instead of refcount?". Then I thought about the stack and heap growing to meet each other in the middle so I thought "a page could be on the heap at one point, but then, after all objects on that page were deleted, it could, after more functions calls, be on the stack" so I guess the refcount is used to detect when "all objects on that page were deleted" in which case, it would be put back into the gc::_page_pool. Is that the reason refcount (actually `size_t _page::_heap_ref`) is used instead of a `bool _page::_on_heap;` ? Since other people trying to understand the method will probably have the same question, it would be nice to have such an explanation as detailed (or more detailed) than the one above.
When a pointer is created, the bit that corresponds to the pointer's address is set. When a pointer is destroyed, the same bit is reset.
By pointer, you actually mean the smart-pointer, _pgr, from cppgc.hpp: //base class for gc_ptr struct _ptr { //the actual pointer value void *_value; //constructor from value; registers the pointer in the gc by setting the relevant bit inline _ptr(void *v) : _value(v) { gc::_init(); gc::_set_ptr(this); } ... //destructor; unregisters the pointer in the gc by resetting the relevant bit inline ~_ptr() { gc::_reset_ptr(this); } ... }; where the gc::_{set,reset}_ptr(this) does the registration and deregistration in "bitmap of pointers", implemented, at least partly, in gc::_pages[_MAX_PAGES]. BTW, couldn't 1048576 in: //number of 4K pages that fit inside the 32-bit address space enum { _MAX_PAGES = 1048576 }; be calculated as some function of values in <limits> rather than hard-coded. This would make the code more portable, IIUC.
When we scan for root pointers, we look at the heap reference count of the page: if it is greater than 0, the page contains heap data, and therefore we do not check if the page has any pointers, because even if it does, it is not a root page.
The whole mechanism is based on the fact that heap, global data and thread data can not be on the same page.
This *sounds* like it works; however, it also sounds so simple (although I couldn't think of it :( ) I'm surprised it hasn't been tried before. Maybe you should post this method to some gc newsgroup: http://dir.gmane.org/index.php?prefix=gmane.comp.programming.garbage-collect... and ask for feedback.