New subject: Proposal for a precise STL-based garbage collection library

11 Mar 2007

      Dear Boost developers,

I have posted in the Vault a small library (named "CPPGC") that 
implements a precise garbage collector for C++.  You can find it here:

http://www.boost-consulting.com/vault/index.php?PHPSESSID=hqle9ubanl575jgl87ajobeed4&direction=0&order=&directory=Memory

The zipped file contains the following readme file (along with the 
sources and an example):

--BEGIN README.TXT--

Introduction
------------

CPPGC is a garbage collection library for C++. It has the following 
features:

1) precise collector.
2) uses the mark & sweep algorithm.
3) supports pointers to middle of objects.
4) supports arrays of objects.
5) uses only standard C++; no platform-specific code.
6) it can be used to retrofit non-GC'd programs with GC capabilities.
7) very small (under 500 lines of code).

Usage
-----

Include the following files in your project:

     cppgc.hpp
     cppgc.cpp

Use the following code to include GC functionality:

     #include "cppgc.hpp"
     using namespace cppgc;

Wrap your classes and pointers inside the class gc<T>. For example:

     gc<std::list<int> *> gc_list = new gc<std::list<int> >;
     gc_list->push_back(10);

     class Foo {
     public:
         gc<Foo *> m_other;
     };

     gc<Foo *> foo1 = new gc<Foo>[10];
     foo1[5].m_other = new gc<Foo>;
     gc<Foo> foo2[3];
     foo2[1].m_other = foo1;

Portability
-----------

As we speak (March 2007), the classes hash_multimap, hash_set are not 
yet part of the std namespace.
But it will be, and most compilers offer these classes in their STL 
implementation anyway.

The code is currently tested under Visual C++ 2005, so the classes 
hash_multimap and hash_set are
found in the stdext namespace. If you use a different compiler, feel 
free to replace the following code:

     using namespace stdext;

with the namespace your compiler has for the hash classes.

Threads
-------

Threads are not supported. Future versions may support multithreading by 
providing
a set of functions the library user can fill with the appropriate 
environment-specific code
which allow the operation of the collector in a multithreaded environment.

Internals
---------

The idea behind the library is very simple:

In order to know where the collector's pointers are, the collector 
maintains a memory map.

The first entry of the memory map is all the memory (from 0x00000000 to 
0xFFFFFFFF);
this entry contains the root set.

When an object is allocated on the heap, a relevant memory region entry 
is added to the memory map.

When a pointer is created, it adds itself to the relevant memory map entry.
Upon destruction, a pointer removes itself from its memory map entry.

When a heap-allocated object is destroyed, the memory region entry of 
the object is removed from the memory map.

If a pointer is not a member of a heap-allocated block, then it is added 
to the first entry of the memory map,
which is the root set.

The class used for the memory map is a hash_multimap, which allows 
efficient lookup of a memory region
based on an memory address. This allows for pointers to the middle of 
blocks.
A multimap is used instead of a map because keys (i.e. memory regions) 
can overlap one another.

The collector is a classic mark & sweep implementation with 2 phases:

phase 1:

The graph of objects is traversed in order to mark the objects as 
reachable.
The scanning starts from the root set, i.e. the first entry of the 
memory map.

phase 2:

the heap blocks are traversed; blocks that are not touched are deleted.

A collection happens automatically when the collector's memory limit is 
exceeded
or manually when the user invokes the 'collect_garbage' function.

Performance
-----------

It is not the fastest thing around, but it is better than nothing :-). 
Since it uses a hash map
for its memory region, lookup is an O(1) operation on average (or so 
says the theory).

It can be a nice altenative for projects that do not have great 
expectations on performance
or where a commercial collector is not available.

Notes
-----

The file 'main.cpp' contains an example.

--END README.TXT--

Please feel free to suggest improvements, post critisisms, discuss 
anything you want about it.

Thank you in advance.

Achilleas Margaritis

Proposal for a precise STL-based garbage collection library

Achilleas Margaritis

Giovanni Piero Deretta

Larry Evans

tags

participants (3)