request for interest in a garbage collection library

Achilleas Margaritis

13 Apr 2009

8:50 p.m.

Hi,

I have placed in boost vault a garbage collection library:

http://www.boostpro.com/vault/index.php?action=downloadfile&filename=gc.zip&directory=Memory&

The major features of this library are:

1) it supports threads; a thread can allocate gc objects and send them to other threads.
2) it is 100% portable c++; only boost and stl are used.
3) gc pointers can point to middle of objects and arrays.
4) block allocation and finalization are customizable. The collector does not do any allocation actually, it only manages blocks.
5) the collector uses the classic mark & sweep algorithm.
6) it has good (or not that bad :-)) performance. On my test machine (a dual core Pentium at 2.5 GHz with 2 GB ram), 50000 objects out of 100000 are swept in 63 milliseconds. Of course, this is due to boost::pool :-).
7) it works with STL containers (the 'how' is described in the readme).

The zip file contains a set of unit tests, the API documentation and a readme file which explains the inner workings and the design of the collector.

I would appreciate it if people could take a look and share their opinions on it.

Best regards
Achilleas Margaritis
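
For readers unfamiliar with the algorithm named in feature 5, here is a minimal sketch of classic mark and sweep. It is not taken from the library: `Node`, `Heap`, `allocate` and `collect` are hypothetical names, the roots are passed in by hand, and the toy heap owns its blocks, whereas the library is described as only managing blocks allocated elsewhere.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical heap object: outgoing references plus a mark bit.
struct Node {
    std::vector<Node*> refs;
    bool marked = false;
};

// Hypothetical toy heap; an illustration only, not the library's API.
class Heap {
public:
    Node* allocate() {
        blocks_.push_back(std::make_unique<Node>());
        return blocks_.back().get();
    }

    // Mark phase: flag everything reachable from the roots.
    // Sweep phase: free every block that was not flagged.
    // Returns the number of blocks swept.
    std::size_t collect(const std::vector<Node*>& roots) {
        for (auto& b : blocks_) b->marked = false;

        std::vector<Node*> work(roots.begin(), roots.end());
        while (!work.empty()) {
            Node* n = work.back();
            work.pop_back();
            if (n == nullptr || n->marked) continue;  // skip nulls and visited nodes
            n->marked = true;
            for (Node* r : n->refs) work.push_back(r);
        }

        const std::size_t before = blocks_.size();
        blocks_.erase(
            std::remove_if(blocks_.begin(), blocks_.end(),
                           [](const std::unique_ptr<Node>& b) { return !b->marked; }),
            blocks_.end());
        return before - blocks_.size();
    }

    std::size_t size() const { return blocks_.size(); }

private:
    std::vector<std::unique_ptr<Node>> blocks_;
};
```

Because reachability is decided from the roots, a cycle that is still reachable survives and a cycle that is not reachable is swept, which reference counting alone cannot do.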

jbosboom＠uci.edu

13 Apr

9:50 p.m.

I'm very interested in portable garbage collection.

...

1) it supports threads; a thread can allocate gc objects and send them to other threads.

You mention in the readme that performance suffers with large numbers of threads due to a global mutex. Would it be possible to have collectors specific to a thread or groups of threads (essentially giving each thread/group of threads a piece of the heap to allocate from/manage with GC)?

...

3) gc pointers can point to middle of objects and arrays.

I don't really understand how this works. Would the middle of an object/array get collected separately from the rest of it? That seems ridiculous, so I'm confused.

...

The zip file contains a set of unit tests, the API documentation and a readme file which explains the inner workings and the design of the collector.

As is often the case, the readme could be improved.

Looking at gc_object, it seems that any class that wants to be gc-allocated needs to inherit from gc_object. Can this be avoided (e.g., so I can gc-allocate classes I can't modify)? Can classes that do inherit from gc_object still be stack-allocated or allocated on the free store without garbage collection?

I'm a bit unclear as to how finalization works. Suppose I simply want to invoke the destructor to ensure that resources have been reclaimed (from a class that uses RAII when stack-allocated and an explicit close() method when heap-allocated) when the object is finalized. How do I do that?

Despite these problems, I'm still excited to see the subject considered.

--Jeffrey Bosboom

Achilleas Margaritis

14 Apr

9:26 a.m.

...

You mention in the readme that performance suffers with large numbers of threads due to a global mutex. Would it be possible to have collectors specific to a thread or groups of threads (essentially giving each thread/group of threads a piece of the heap to allocate from/manage with GC)?

Yes, it is possible, by using a thread-specific ptr to hold a separate instance of the gc context per each thread. But it will complicate the collection process a little.
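
A rough sketch of the per-thread idea follows. It uses the C++11 `thread_local` keyword in place of the Boost thread-specific pointer mentioned above, and `GcContext`, `local_context` and `record_allocation` are hypothetical names, not the library's. It shows only that bookkeeping on one thread does not touch another thread's context; it does not attempt the harder part, which is collecting objects that have been sent from one thread to another.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical per-thread collector state; the library's real context type
// is not shown in this thread.
struct GcContext {
    std::size_t allocated = 0;
};

// One context per thread, created lazily on first use by that thread.
inline GcContext& local_context() {
    thread_local GcContext ctx;
    return ctx;
}

// Bookkeeping touches only the calling thread's context, so this path
// needs no global mutex.
inline void record_allocation() { ++local_context().allocated; }

// Helper: performs n allocations on a fresh thread and reports that
// thread's own count.
inline std::size_t allocations_on_new_thread(std::size_t n) {
    std::size_t result = 0;
    std::thread t([&] {
        for (std::size_t i = 0; i < n; ++i) record_allocation();
        result = local_context().allocated;
    });
    t.join();
    return result;
}
```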

...

...
3) gc pointers can point to middle of objects and arrays.

I don't really understand how this works. Would the middle of an object/array get collected separately from the rest of it? That seems ridiculous, so I'm confused.

An object should not be collected if there is a pointer to anywhere in it. In programming languages that do not allow pointers to the middle of objects, this is not an issue.
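
One common way to support such interior pointers, sketched here as an assumption rather than as a description of this library, is to keep a table of blocks ordered by start address and map any address back to the block that contains it. `BlockTable`, `add` and `find_block` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical registry of managed blocks: start address -> size in bytes.
class BlockTable {
public:
    void add(const void* start, std::size_t size) {
        blocks_[reinterpret_cast<std::uintptr_t>(start)] = size;
    }

    // Returns the start of the block that contains p, or nullptr if p does
    // not point into any registered block.
    const void* find_block(const void* p) const {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        auto it = blocks_.upper_bound(addr);  // first block starting after addr
        if (it == blocks_.begin()) return nullptr;
        --it;                                 // last block starting at or before addr
        if (addr - it->first < it->second) {
            return reinterpret_cast<const void*>(it->first);
        }
        return nullptr;
    }

private:
    std::map<std::uintptr_t, std::size_t> blocks_;
};
```

During marking, a pointer to the fifth element of an array would resolve to the array's block, so the whole array is kept alive as one unit; nothing is collected "in the middle".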

...

...
The zip file contains a set of unit tests, the API documentation and a readme file which explains the inner workings and the design of the collector.

As is often the case, the readme could be improved.

Sure. If there is enough interest and if there are any specific topics that need to be addressed, please feel free to indicate them.

...

Looking at gc_object, it seems that any class that wants to be gc-allocated needs to inherit from gc_object. Can this be avoided (e.g., so I can gc-allocate classes I can't modify)?

There are three ways to add garbage collection to an object:

1) inherit from gc_object<T>.
2) wrap your class in gc_wrapper<T>.
3) allocate your object with new(T_deleter).

But you should modify your pointers to be gc_ptr<T> instances for this to work.

...

Can classes that do inherit from gc_object still be stack-allocated

Yes. The member pointers of stack allocated objects will be considered as root pointers.

...

or allocated on the free store without garbage collection?

Only if you use placement new/delete.

...

I'm a bit unclear as to how finalization works. Suppose I simply want to invoke the destructor to ensure that resources have been reclaimed (from a class that uses RAII when stack-allocated and an explicit close() method when heap-allocated) when the object is finalized. How do I do that?

If your object is stack allocated, you don't have to do anything. If you want to customize the finalizer, you specialize the template gc_finalize<T> for your type, and perform any finalization in the function. The default behavior of gc_finalize<T> is to invoke the destructor. You can always delete the object at any time or manually collect garbage to release any resources.
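
To make the customization point concrete, here is a sketch of what specializing a finalizer could look like. The real signature of gc_finalize<T> is not given in this thread, so the function template below is a stand-in placed in a `sketch` namespace, and `Socket` is a hypothetical class. The default runs the destructor, as described above; the specialization does extra work first.

```cpp
#include <cassert>
#include <new>

namespace sketch {

// Stand-in for the library's gc_finalize<T>; the signature is an assumption.
// Default behavior: invoke the destructor.
template <class T>
void gc_finalize(T* obj) { obj->~T(); }

// Hypothetical resource class. Each step appends a digit to *events so the
// order of operations can be observed.
struct Socket {
    int* events;
    explicit Socket(int* e) : events(e) {}
    void flush() { *events = *events * 10 + 1; }
    ~Socket() { *events = *events * 10 + 2; }
};

// Custom finalization for Socket: flush first, then destroy.
template <>
inline void gc_finalize<Socket>(Socket* s) {
    s->flush();
    s->~Socket();
}

}  // namespace sketch
```

In this sketch a collector would call `sketch::gc_finalize(p)` on each unreachable object before releasing its block.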

Mathias Gaunard

4:19 p.m.

Achilleas Margaritis wrote:

...

There are three ways to add garbage collection to an object:

1) inherit from gc_object<T>.
2) wrap your class in gc_wrapper<T>.
3) allocate your object with new(T_deleter).

But you should modify your pointers to be gc_ptr<T> instances for this to work.

So, comparing your framework to C++/CLI:

- new(T_deleter) is like gcnew
- gc_ptr<T> is like T^
- inheriting from gc_object<T> is like declaring a class as being a ref class

Is that right?

Achilleas Margaritis

7:47 p.m.

Mathias Gaunard wrote:

...

Achilleas Margaritis wrote:

...
There are three ways to add garbage collection to an object:

1) inherit from gc_object<T>.
2) wrap your class in gc_wrapper<T>.
3) allocate your object with new(T_deleter).

But you should modify your pointers to be gc_ptr<T> instances for this to work.

So, comparing your framework to C++/CLI:

- new(T_deleter) is like gcnew
- gc_ptr<T> is like T^
- inheriting from gc_object<T> is like declaring a class as being a ref class

Is that right?

Indeed.

David Abrahams

16 Apr

1:16 a.m.

on Mon Apr 13 2009, Achilleas Margaritis <axilmar-AT-gmail.com> wrote:

...

I would appreciate it if people could take a look and share their opinions on it.

I haven't looked at it, but IMO the biggest obstacle to effective GC for C++ isn't the lack of a library (Boehm's collector is pretty good AFAIK). It's the lack of a programming model that integrates destructors and people's legitimate expectations that they will be called to release non-memory resources. What can you say about the non-memory resources owned by a GC'd object that contains, say, a mutex?

-- Dave Abrahams BoostPro Computing http://www.boostpro.com

Achilleas Margaritis

8:19 p.m.

David Abrahams wrote:

...

on Mon Apr 13 2009, Achilleas Margaritis <axilmar-AT-gmail.com> wrote:

...
I would appreciate it if people could take a look and share their opinions on it.

I haven't looked at it, but IMO the biggest obstacle to effective GC for C++ isn't the lack of a library (Boehm's collector is pretty good AFAIK).

I have to disagree with that. According to Boehm himself (http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcinterface.html):

"The C++ interface is implemented as a thin layer on the C interface. Unfortunately, this thin layer appears to be very sensitive to variations in C++ implementations, particularly since it tries to replace the global ::new operator, something that appears to not be well-standardized. Your platform may need minor adjustments in this layer (gc_cpp.cc, gc_cpp.h, and possibly gc_allocator.h). Such changes do not require understanding of collector internals, though they may require a good understanding of your platform. (Patches enhancing portability are welcome. But it's easy to break one platform by fixing another.)"

I have tried to use the Boehm GC in C++ and there were many problems:

1) It cannot be integrated with threading libraries such as Qt threads or boost threads. For example, under Windows, Boehm GC supposes you use Win32 threads, and it provides functions that replace Win32 calls.
2) The replacement of the global operator new creates many problems with 3rd party libraries which provide their own global operator new (MFC, for example).
3) There are few details on what is supposed to work with DLLs and globals initialized before main(). I tried asking on the newsgroup, but they did not know the answer. They said that it's platform specific: sometimes allocating objects before main is OK, sometimes it is not, sometimes you have to call gc_init(), sometimes you don't, etc.

I can point you to the relevant discussions on the Boehm GC newsgroup, if you wish.

...

It's the lack of a programming model that integrates destructors and people's legitimate expectations that they will be called to release non-memory resources. What can you say about the non-memory resources owned by a GC'd object that contains, say, a mutex?

Personally, I don't see why resources like mutexes or files should be managed like memory. Resources like that are "binary" resources: they have two states (locked/unlocked, opened/closed), which is perfectly suited for RAII. Can you please elaborate on the problem of destructors? I have read some papers that say that gc and destructors don't play well together, but I did not really agree with that. An example would be appreciated.

Sid Sacek

11:01 p.m.

...

...
It's the lack of a programming model that integrates destructors and people's legitimate expectations that they will be called to release non-memory resources. What can you say about the non-memory resources owned by a GC'd object that contains, say, a mutex?

...

Personally, I don't see why resources like mutexes or files should be managed like memory. Resources like that are "binary" resources: they have two states (locked/unlocked, opened/closed), which is perfectly suited for RAII.

I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Achilleas Margaritis

17 Apr

8:08 a.m.

Sid Sacek wrote:

...

...
...
It's the lack of a programming model that integrates destructors and people's legitimate expectations that they will be called to release non-memory resources. What can you say about the non-memory resources owned by a GC'd object that contains, say, a mutex?

...
Personally, I don't see why resources like mutexes or files should be managed like memory. Resources like that are "binary" resources: they have two states (locked/unlocked, opened/closed), which is perfectly suited for RAII.

I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Aren't scoped locks a better way to handle such issues? The big advantage of C++ is scoped construction/destruction.
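
For illustration, a scoped lock ties the locked state to a block of code rather than to the lifetime of the object that owns the mutex. The sketch below uses the standard `std::lock_guard` as a stand-in for the Boost scoped locks available when this thread was written; `Counter` and `is_unlocked` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical object that owns a mutex. The object itself could be
// garbage collected; the lock is never held beyond a member function call.
struct Counter {
    std::mutex m;
    int value = 0;

    void increment() {
        std::lock_guard<std::mutex> guard(m);  // lock acquired here
        ++value;
    }                                          // lock released here, deterministically

    // Test helper: reports whether the mutex is currently free.
    bool is_unlocked() {
        if (m.try_lock()) {
            m.unlock();
            return true;
        }
        return false;
    }
};
```

This covers the locked/unlocked state only. It does not answer the separate question raised in this thread of when the mutex object itself, or a file handle owned by a collected object, gets destroyed.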

Vicente Botet Escriba

9:47 a.m.

New subject: request for interest in a garbage collection library

Achilleas Margaritis-4 wrote:

...

Sid Sacek wrote:

...
...
...
It's the lack of a programming model that integrates destructors and people's legitimate expectations that they will be called to release non-memory resources. What can you say about the non-memory resources owned by a GC'd object that contains, say, a mutex?

...
Personally, I don't see why resources like mutexes or files should be managed like memory. Resources like that are "binary" resources: they have two states (locked/unlocked, opened/closed), which is perfectly suited for RAII.

I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Aren't scoped locks a better way to handle such issues? The big advantage of C++ is scoped construction/destruction.

Hi,

The user can already use RAII in their application. Suppose that you have a RAII class X that is garbage collected (it inherits from gc_object<X>) and that holds a mutex (or a file handle), locking (opening) it on construction and unlocking (closing) it on destruction.

How does the following behave?

scoped_ptr<X> p(new X());

When will the unlock (close) function be called?

Does it mean that classes like X should not inherit from gc_object<X>, and that the user of X must decide whether the new X() is garbage collected or not?

What criteria must X satisfy to inherit from gc_object<X> without any trouble?

Combining GC and RAII clearly raises some issues, and the user must be aware of them. The question is what the GC library can provide to make the user's life easier and safer.

HTH, Vicente

Achilleas Margaritis

20 Apr

4:50 p.m.

New subject: request for interest in a garbage collection library

Vicente Botet Escriba wrote:

...

Achilleas Margaritis-4 wrote:

...
Sid Sacek wrote:

...
...
...
It's the lack of a programming model that integrates destructors and people's legitimate expectations that they will be called to release non-memory resources. What can you say about the non-memory resources owned by a GC'd object that contains, say, a mutex?

...
Personally, I don't see why resources like mutexes or files should be managed like memory. Resources like that are "binary" resources: they have two states (locked/unlocked, opened/closed), which is perfectly suited for RAII.

I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Aren't scoped locks a better way to handle such issues? The big advantage of C++ is scoped construction/destruction.

Hi,

The user can already use RAII in their application. Suppose that you have a RAII class X that is garbage collected (it inherits from gc_object<X>) and that holds a mutex (or a file handle), locking (opening) it on construction and unlocking (closing) it on destruction.

How does the following behave?

scoped_ptr<X> p(new X());

When will the unlock (close) function be called?

Does it mean that classes like X should not inherit from gc_object<X>, and that the user of X must decide whether the new X() is garbage collected or not?

What criteria must X satisfy to inherit from gc_object<X> without any trouble?

Combining GC and RAII clearly raises some issues, and the user must be aware of them. The question is what the GC library can provide to make the user's life easier and safer.

HTH, Vicente

The gc'd objects can be deleted any time, and therefore RAII works on gc'd objects as well.

Beman Dawes

17 Apr

2:10 p.m.

On Fri, Apr 17, 2009 at 4:08 AM, Achilleas Margaritis <axilmar@gmail.com> wrote:

...

...
I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Aren't scoped locks a better way to handle such issues? The big advantage of C++ is scoped construction/destruction.

Achilleas, you are missing Dave and Sid's point. It is a common and very appropriate programming practice to place resources like files and mutexes in an object and to dynamically allocate that object. If GC is used to reclaim the containing object, then sub-objects it contains, like mutexes or files that release on destruction, may not get released soon enough. You need to address this concern. Telling folks not to place non-memory resources in types that may be dynamically allocated isn't likely to fly.

--Beman

Achilleas Margaritis

20 Apr

4:56 p.m.

Beman Dawes wrote:

...

On Fri, Apr 17, 2009 at 4:08 AM, Achilleas Margaritis <axilmar@gmail.com> wrote:

...
...
I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Aren't scoped locks a better way to handle such issues? The big advantage of C++ is scoped construction/destruction.

Achilleas, you are missing Dave and Sid's point. It is a common and very appropriate programming practice to place resources like files and mutexes in an object and to dynamically allocate that object. If GC is used to reclaim the containing object, then sub-objects it contains, like mutexes or files that release on destruction, may not get released soon enough. You need to address this concern. Telling folks not to place non-memory resources in types that may be dynamically allocated isn't likely to fly.

If objects like files and mutexes are dynamically allocated, then they are deleted at some point, aren't they? So, if the programmer wants to add garbage collection to these objects, so that he/she doesn't have to manually track memory, he/she can replace the delete operations with RAII and let the GC handle the memory part.

The order of finalization is not a solvable problem by any means, so I don't see how a solution can be provided. In C++, the order of creation is not guaranteed to be the same as the order of allocation, so it is not possible to have a real generic solution for that.

Still, RAII can easily replace delete calls: instead of deleting an object, you simply close it or unlock it and you let the gc decide if the object is reachable or not.

jbosboom＠uci.edu

5:56 p.m.

...

If objects like files and mutexes are dynamically allocated, then they are deleted at some point, aren't they? so, if the programmer wants to add garbage collection to these objects, so he/she doesn't have to manually track memory, he/she can replace the delete operations with RAII and let the GC handle the memory part.

The point of garbage collection is that the programmer does not need to manually free resources by deleting. However, non-memory resources need to be freed deterministically, so garbage collection is not a good solution. So simply don't use it for that. The obvious solution is to continue managing the lifetime of objects representing or holding non-memory resources manually, either by allocating them with automatic duration or holding them by a reference-counted pointer. Just because garbage collection is there doesn't mean you have to use it exclusively. You can still get the benefits of garbage collection on everything else, so it's still a win. --Jeffrey Bosboom
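
The two manual options mentioned above can be sketched as follows. `Handle` is a hypothetical stand-in for any object that owns a non-memory resource, and `std::shared_ptr` is used where code from the time of this thread would have used boost::shared_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical non-memory resource: *open_count tracks how many are open.
class Handle {
public:
    explicit Handle(int* open_count) : open_count_(open_count) { ++*open_count_; }
    ~Handle() { --*open_count_; }  // deterministic release
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

private:
    int* open_count_;
};

// Option 1, automatic duration: the resource is released when the scope ends.
inline int open_inside_scope(int* open_count) {
    Handle h(open_count);
    return *open_count;  // the handle is still open while this value is read
}

// Option 2, reference counting: the resource is released when the last
// owner is reset or destroyed.
inline std::shared_ptr<Handle> make_shared_handle(int* open_count) {
    return std::make_shared<Handle>(open_count);
}
```

In both cases the release point is known from the code alone, which is the property a tracing collector does not provide.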

David Abrahams

30 Apr

5:50 p.m.

on Mon Apr 20 2009, Achilleas Margaritis <axilmar-AT-gmail.com> wrote:

...

Beman Dawes wrote:

...
On Fri, Apr 17, 2009 at 4:08 AM, Achilleas Margaritis <axilmar@gmail.com> wrote:

...
...
I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Aren't scoped locks a better way to handle such issues? The big advantage of C++ is scoped construction/destruction.

Achilleas, you are missing Dave and Sid's point. It is a common and very appropriate programming practice to place resources like files and mutexes in an object and to dynamically allocate that object. If GC is used to reclaim the containing object, then sub-objects it contains, like mutexes or files that release on destruction, may not get released soon enough. You need to address this concern. Telling folks not to place non-memory resources in types that may be dynamically allocated isn't likely to fly.

If objects like files and mutexes are dynamically allocated, then they are deleted at some point, aren't they?

I'm not sure how to answer that. In a correct program without GC, yes, they are.

...

so, if the programmer wants to add garbage collection to these objects, so he/she doesn't have to manually track memory, he/she can replace the delete operations with RAII

Sorry, I don't know what "replace the delete operations with RAII" means.

...

and let the GC handle the memory part.

The order of finalization is not a solvable problem by any means, so I don't see how a solution can be provided. In C++, the order of creation is not guaranteed to be the same as the order of allocation, so it is not possible to have a real generic solution for that.

Still, RAII can easily replace delete calls: instead of deleting an object, you simply close it or unlock it and you let the gc decide if the object is reachable or not.

You're still missing the point.

We went through this discussion in great detail in the C++ committee; let me try to sketch it for you:

Without GC, I can write a class X that manages a non-memory resource and frees that non-memory resource in its destructor. The fact of that non-memory resource can be an implementation detail of the class; no client of X ever needs to know, so long as their program doesn't leak memory. X can be placed in a container, aggregated into other objects, etc., and none of those other objects needs to know it manages a non-memory resource.

People commonly believe a GC'd environment gives them the right to leak dynamically-allocated objects without worry. If you add GC to C++, the fact that a type (or some other type it owns, etc.) manages a non-memory resource is no longer an implementation detail, because you have to know about the presence of that resource in order to know whether you can leak objects of that type. So in C++ with GC, you don't have a blanket right to leak a dynamically-allocated object.

It also means that you can't come along later and add a non-memory resource as an implementation detail to any class, because someone might be legitimately leaking objects of that type, and the resource won't be freed. So unless you know that a class will *never* need such a resource (e.g. a mutex), you can't afford to leak it.

Therefore, "don't worry about delete; it's taken care of" is not a legitimate programming guideline for C++ with GC. The question is,

********************************************************************
* what /is/ the guideline that explains when I can afford to leak? *
********************************************************************

The best we could come up with in general was, "do everything exactly the same way you were doing it before GC came along," at which point the only benefit of GC seems to be that it may keep a program that leaks from running out of memory as soon as it would have otherwise. That's a pretty marginal benefit, it seems to me.

-- Dave Abrahams BoostPro Computing http://www.boostpro.com

Thorsten Ottosen

9:04 p.m.

David Abrahams skrev:

...

The best we could come up with in general was, "do everything exactly the same way you were doing it before GC came along," at which point the only benefit of GC seems to be that it may keep a program that leaks from running out of memory as soon as it would have otherwise. That's a pretty marginal benefit, it seems to me.

Well, as Herb Sutter puts it: GC is the only thing that makes .Net go faster. So GC is certainly interesting from the point of view that it can enable faster memory operations.

As for the leaking of resources, there is no other way than for each class to document that its destructor must be called to reclaim resources. Perhaps a base class is the best way to advertise this, just like classes in .Net implement IDisposable.

-Thorsten
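
One way to read the base class suggestion is as a marker type that code can query at compile time. The sketch below is an assumption about how that might look; `Disposable`, `Point`, `FileHandle` and `may_be_left_to_gc` are hypothetical names, not part of any library discussed here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical marker base: deriving from it advertises that the destructor
// must run to reclaim a non-memory resource.
struct Disposable {
protected:
    ~Disposable() = default;  // not meant to be deleted through the base
};

struct Point { int x; int y; };       // memory only
struct FileHandle : Disposable {};    // owns a non-memory resource

// True when T does not advertise a need for deterministic destruction.
template <class T>
constexpr bool may_be_left_to_gc() {
    return !std::is_base_of<Disposable, T>::value;
}
```

The check looks only at direct and indirect base classes. A class that merely holds a `FileHandle` as a member would still report true, which is the "implementation detail" problem David Abrahams describes above, so a marker base alone does not settle that objection.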

Achilleas Margaritis

2 May

9:15 a.m.

David Abrahams wrote:

...

on Mon Apr 20 2009, Achilleas Margaritis <axilmar-AT-gmail.com> wrote:

...
Beman Dawes wrote:

...
On Fri, Apr 17, 2009 at 4:08 AM, Achilleas Margaritis <axilmar@gmail.com> wrote:

...
...
I think the question David was asking is; if a GC object is holding a mutex that is currently holding a lock, then when does that lock release, or how does that lock release? The GC may run in the future, and in the meanwhile, that lock is frozen.

Aren't scoped locks a better way to handle such issues? The big advantage of C++ is scoped construction/destruction.

Achilleas, you are missing Dave and Sid's point. It is a common and very appropriate programming practice to place resources like files and mutexes in an object and to dynamically allocate that object. If GC is used to reclaim the containing object, then sub-objects it contains, like mutexes or files that release on destruction, may not get released soon enough. You need to address this concern. Telling folks not to place non-memory resources in types that may be dynamically allocated isn't likely to fly.

If objects like files and mutexes are dynamically allocated, then they are deleted at some point, aren't they?

I'm not sure how to answer that. In a correct program without GC, yes, they are.

...
so, if the programmer wants to add garbage collection to these objects, so he/she doesn't have to manually track memory, he/she can replace the delete operations with RAII

Sorry, I don't know what "replace the delete operations with RAII" means.

...
and let the GC handle the memory part.

The order of finalization is not a solvable problem by any means, so I don't see how a solution can be provided. In C++, the order of creation is not guaranteed to be the same as the order of allocation, so it is not possible to have a real generic solution for that.

Still, RAII can easily replace delete calls: instead of deleting an object, you simply close it or unlock it and you let the gc decide if the object is reachable or not.

You're still missing the point.

We went through this discussion in great detail in the C++ committee; let me try to sketch it for you:

Without GC, I can write a class X that manages a non-memory resource and frees that non-memory resource in its destructor. The fact of that non-memory resource can be an implementation detail of the class; no client of X ever needs to know, so long as their program doesn't leak memory. X can be placed in a container, aggregated into other objects, etc., and none of those other objects needs to know it manages a non-memory resource.

People commonly believe a GC'd environment gives them the right to leak dynamically-allocated objects without worry. If you add GC to C++, the fact that a type (or some other type it owns, etc.) manages a non-memory resource is no longer an implementation detail, because you have to know about the presence of that resource in order to know whether you can leak objects of that type. So in C++ with GC, you don't have a blanket right to leak a dynamically-allocated object.

I don't think that's a great problem. I am not proposing to add GC to C++. I am proposing a GC solution to boost, to sit alongside shared_ptr and weak_ptr. The problem you are mentioning is real, but very small. In all of my programming career, I have never seen an implementation detail like a mutex lock not be documented. The reason is that even with destructors, you have to take care of your non-memory resources in order not to introduce problems like deadlocks etc.

...

It also means that you can't come along later and add a non-memory resource as an implementation detail to any class, because someone might be legitimately leaking objects of that type, and the resource won't be freed. So unless you know that a class will *never* need such a resource (e.g. a mutex), you can't afford to leak it.

I think this is a design issue and therefore a solution should not be forced.

...

Therefore, "don't worry about delete; it's taken care of" is not a legitimate programming guideline for C++ with GC. The question is,

********************************************************************
* what /is/ the guideline that explains when I can afford to leak? *
********************************************************************

The best we could come up with in general was, "do everything exactly the same way you were doing it before GC came along," at which point the only benefit of GC seems to be that it may keep a program that leaks from running out of memory as soon as it would have otherwise. That's a pretty marginal benefit, it seems to me.

Yes, I believe that as well: with or without GC, the semantics of a C++ program should not be changed. But the benefit is not marginal, as you say. GC solves a lot of problems and, most importantly, increases productivity.

David Abrahams

29 May

4:02 p.m.

on Sat May 02 2009, Achilleas Margaritis <axilmar-AT-gmail.com> wrote:

...

...
Therefore, "don't worry about delete; it's taken care of" is not a legitimate programming guideline for C++ with GC. The question is,

********************************************************************
* what /is/ the guideline that explains when I can afford to leak? *
********************************************************************

The best we could come up with in general was, "do everything exactly the same way you were doing it before GC came along," at which point the only benefit of GC seems to be that it may keep a program that leaks from running out of memory as soon as it would have otherwise. That's a pretty marginal benefit, it seems to me.

Yes, I believe that as well: with or without GC, the semantics of a C++ program should not be changed.

Do you realize that "do everything exactly the same way you were doing it before GC came along" means all dynamically-allocated objects must be deleted by explicit code and you can't rely on GC to do it for you? Are you really agreeing with that guideline?

...

But the benefit is not marginal, as you say. GC solves a lot of problems and, most importantly, increases productivity.

How can it increase productivity if it doesn't change how we write programs? -- Dave Abrahams BoostPro Computing http://www.boostpro.com

Edward Diener

17 Apr

11:59 a.m.

Achilleas Margaritis wrote:

...

David Abrahams wrote:

...
on Mon Apr 13 2009, Achilleas Margaritis <axilmar-AT-gmail.com> wrote:

...
I would appreciate it if people could take a look and share their opinions on it.

...
It's the lack of a programming model that integrates destructors and people's legitimate expectations that they will be called to release non-memory resources. What can you say about the non-memory resources owned by a GC'd object that contains, say, a mutex?

Personally, I don't see why resources like mutexes or files should be managed like memory. Resources like that are "binary" resources: they have two states (locked/unlocked, opened/closed), which is perfectly suited for RAII.

Jumping in here, the problem is that having to mix GC code, which automatically releases memory in a non-deterministic way, and normal RAII destructor code, which releases all resources in a deterministic way, creates two different syntaxes in a C++ program. This would currently require the programmer to "know" which type of destruction belongs with which object.

Unless the same external syntax can automatically be used on all objects, where the classes of those objects which need deterministic destruction automatically use RAII destruction whereas the classes of those objects which only need non-deterministic destruction use GC, I do not think C++ programmers will ever be enthusiastic about using a garbage collector. At least I know that I won't and will stick with shared_ptr with the occasional scoped_ptr/weak_ptr until a way to syntactically unite RAII and GC is accomplished.

GC has successfully masked the fact, in languages/environments like Python, Ruby, Java, .Net, that it is a very poor system for dealing with any resource other than memory.

Bastek

3:57 p.m.

...

Jumping in here, the problem is that having to mix GC code, which automatically releases memory in a non-deterministic way, and normal RAII destructor code, which releases all resources in a deterministic way, creates two different syntaxes in a C++ program. This would currently require the programmer to "know" which type of destruction belongs with which object.

Unless the same external syntax can automatically be used on all objects, where the classes of those objects which need deterministic destruction automatically use RAII destruction whereas the classes of those objects which only need non-deterministic destruction use GC, I do not think C++ programmers will ever be enthusiastic about using a garbage collector. At least I know that I won't and will stick with shared_ptr with the occasional scoped_ptr/weak_ptr until a way to syntactically unite RAII and GC is accomplished.

GC has successfully masked the fact, in languages/environments like Python, Ruby, Java, .Net, that it is a very poor system for dealing with any resource other than memory.

The way resources are released may depend on the type of object. If you use my GC (http://sourceforge.net/projects/sgcl), you can write:

    class Foo {};
    class Bar : public Limited {};

    gc<Foo> foo = gcnew Foo;
    gc<Bar> bar = gcnew Bar;

The destructor of class Bar will be called in a deterministic way.

Achilleas Margaritis

20 Apr 20 Apr

4:59 p.m.

...

Jumping in here, the problem is that having to mix GC code, which automatically releases memory in a non-deterministic way, and normal RAII destructor code, which releases all resources in a deterministic way, creates two different syntaxes in a C++ program. This would currently require the programmer to "know" which type of destruction belongs with which object.

I think the overhead of knowing which type of destruction belongs to which object is minimal when compared to the overhead of manually tracking memory. Resources like memory should be tracked by a GC, resources like mutexes or files should be tracked by RAII. What I like in C++ is that I can use the best of both worlds at the same time.
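To make the "binary resource" half of this argument concrete, here is a minimal sketch of a mutex managed by RAII. The `Account` class and its members are hypothetical and have nothing to do with the library under discussion; only `std::mutex` and `std::lock_guard` are standard. The point is that the unlock happens at a known place (the end of the scope), even when an exception is thrown.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical class for illustration; not taken from the library under discussion.
class Account {
public:
    explicit Account(int balance) : balance_(balance) {}

    // Throws on overdraft. The guard's destructor still unlocks the mutex
    // while the exception propagates, so no manual cleanup is needed.
    void withdraw(int amount) {
        assert(amount >= 0);
        std::lock_guard<std::mutex> guard(mutex_);
        if (amount > balance_) throw std::runtime_error("insufficient funds");
        balance_ -= amount;
    }

    int balance() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return balance_;
    }

    // Reports whether the mutex is free right now (only for demonstration).
    bool unlocked() const {
        if (!mutex_.try_lock()) return false;
        mutex_.unlock();
        return true;
    }

private:
    mutable std::mutex mutex_;
    int balance_;
};

// Returns true if the mutex was released despite the exception.
inline bool lock_released_after_throw() {
    Account account(10);
    try {
        account.withdraw(50);
    } catch (const std::runtime_error&) {
        // expected: overdraft
    }
    return account.unlocked() && account.balance() == 10;
}
```

A collector that finalizes "eventually" cannot give this guarantee, which is why the lock lives in a scoped guard rather than in a gc'd object's destructor.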

Simonson, Lucanus J

6:43 p.m.

Achilleas Margaritis wrote:

...

...
Jumping in here, the problem is that having to mix GC code, which automatically releases memory in a non-deterministic way, and normal RAII destructor code, which releases all resources in a deterministic way, creates two different syntaxes in a C++ program. This would currently require the programmer to "know" which type of destruction belongs with which object.

I think the overhead of knowing which type of destruction belongs to which object is minimal when compared to the overhead of manually tracking memory. Resources like memory should be tracked by a GC, resources like mutexes or files should be tracked by RAII.

What I like in C++ is that I can use the best of both worlds at the same time.

Really? Memory should be tracked by GC and not RAII? Why? The only reason GC is even acceptable for managing memory is because memory is a very plentiful resource in most systems. RAII covers 90% of cases of memory allocation and in the other 10% I don't find managing it manually to be too much of a burden, and if I wanted to be lazy I could use shared pointers to do the reference counting instead of using a GC library with equal extra effort. I don't want to be lazy, and I don't want to debug problems in code that mixes and matches GC and non-GC objects and leaves it very ambiguous whether someone forgot to call delete.

...

But gc'd objects are not supposed to touch other gc'd objects in their destructor (if you check the readme, I explicitly say that the order of finalization is random). Is there a realistic need to call another gc'd object from a destructor?

Saying your documentation states that GC objects shouldn't manipulate GC objects in their destructors is all well and good, but that won't stop people from doing it, and it won't stop their code from working when they test it and breaking later when it is used. If you could enforce such a thing with a compilation error using some template meta-programming trick that would be mildly interesting, but just saying it's not supported and letting everyone who won't bother to read the documentation find that out the hard way is not good design because it is error prone. Just because the problem can't be solved doesn't mean there isn't a real problem here. I can't think of a way to get a syntax error in the case that code is compiled when in the context of a destructor, but not otherwise. You are giving people a hammer, a blindfold and a box of nails and telling them to be careful not to hit their thumbs. That's not the best of both worlds. Luke

Frank Mori Hess

7:58 p.m.

On Monday 20 April 2009, Simonson, Lucanus J wrote:

...

be too much of a burden, and if I wanted to be lazy I could use shared pointers to do the reference counting instead of using a GC library with equal extra effort. I don't want to be lazy, and I don't want to debug

A garbage collector does have an advantage over shared_ptr reference counting, in that it can automatically detect cycles. Although, it does seem like it should be possible to detect shared_ptr cycles by keeping track of the memory regions owned by shared_ptrs and comparing them with the addresses of shared_ptr objects. Then you could build up a graph which could be checked for cycles and connectedness.
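The idea in this message can be sketched roughly as follows. This is only an illustration under loud assumptions: `shared_ptr` exposes no registry of owned regions, so `Region`, `Slot`, `build_graph` and `has_cycle` are all hypothetical names, and the addresses are supplied by hand rather than collected from real pointers. A pointer whose own storage lies inside an owned block contributes an edge from that block to the block it owns; a depth-first search then looks for a cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names here are hypothetical; shared_ptr exposes no such registry.

// A heap block owned by some smart pointer: [begin, begin + size).
struct Region {
    std::uintptr_t begin;
    std::size_t size;
};

// A smart-pointer object: the address where the pointer itself is stored,
// and the index of the region it owns.
struct Slot {
    std::uintptr_t address;
    std::size_t target;
};

using Graph = std::vector<std::vector<std::size_t>>;

// Adds an edge i -> j whenever a pointer stored inside region i owns region j.
// Pointers stored outside every region (stack, globals) add no edge.
inline Graph build_graph(const std::vector<Region>& regions,
                         const std::vector<Slot>& slots) {
    Graph edges(regions.size());
    for (const Slot& slot : slots) {
        assert(slot.target < regions.size());
        for (std::size_t i = 0; i < regions.size(); ++i) {
            const Region& r = regions[i];
            if (slot.address >= r.begin && slot.address - r.begin < r.size)
                edges[i].push_back(slot.target);
        }
    }
    return edges;
}

enum class Mark { Unvisited, Visiting, Done };

// Depth-first search; reaching a node that is still being visited means a cycle.
inline bool visit(std::size_t node, const Graph& g, std::vector<Mark>& marks) {
    marks[node] = Mark::Visiting;
    for (std::size_t next : g[node]) {
        if (marks[next] == Mark::Visiting) return true;
        if (marks[next] == Mark::Unvisited && visit(next, g, marks)) return true;
    }
    marks[node] = Mark::Done;
    return false;
}

inline bool has_cycle(const Graph& g) {
    std::vector<Mark> marks(g.size(), Mark::Unvisited);
    for (std::size_t n = 0; n < g.size(); ++n)
        if (marks[n] == Mark::Unvisited && visit(n, g, marks)) return true;
    return false;
}
```

Note that this only reports that a cycle exists; deciding whether the cycle is still reachable from a live root (the "connectedness" part) would need the root pointers as an extra input.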

Patrick Mihelich

9:50 p.m.

Detecting cycles is the usual argument for garbage collection, but not the only one. For example, it is useful for implementing persistent data structures, where the memory dependencies quickly get very complicated. These data structures are common (often required) in functional languages, which is one reason those languages are garbage collected as a rule. In some cases it is even possible for garbage collection to be faster than explicit allocation and deallocation, as the allocation strategy can be extremely simple (just bump a pointer). I agree that RAII is an excellent solution to at least 90% of memory management, but I think there is enough reason to investigate opt-in garbage collection approaches for C++. Patrick On Mon, Apr 20, 2009 at 12:58 PM, Frank Mori Hess <frank.hess@nist.gov> wrote:

...

On Monday 20 April 2009, Simonson, Lucanus J wrote:

...
be too much of a burden, and if I wanted to be lazy I could use shared pointers to do the reference counting instead of using a GC library with equal extra effort. I don't want to be lazy, and I don't want to debug

A garbage collector does have an advantage over shared_ptr reference counting, in that it can automatically detect cycles. Although, it does seem like it should be possible to detect shared_ptr cycles by keeping track of the memory regions owned by shared_ptrs and comparing them with the addresses of shared_ptr objects. Then you could build up a graph which could be checked for cycles and connectedness.

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
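The "just bump a pointer" allocation strategy mentioned in Patrick's message can be sketched as below. `BumpArena` is a hypothetical name, and this is a plain arena rather than a collector: it shows only why the allocation path can be so cheap when individual blocks are never freed one by one (a moving collector would reclaim space by compacting; here `reset()` stands in for that).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical arena for illustration. Individual blocks are never freed;
// everything is released at once by reset().
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns `size` bytes aligned to `align`, or nullptr when the arena is full.
    // `align` must be a power of two no larger than alignof(std::max_align_t),
    // because offsets are aligned relative to the start of the buffer.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::size_t start = (offset_ + align - 1) & ~(align - 1);
        if (start > buffer_.size() || size > buffer_.size() - start)
            return nullptr;
        offset_ = start + size;  // the whole allocation cost: one addition
        return buffer_.data() + start;
    }

    void reset() { offset_ = 0; }
    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

The trade-off is visible in the interface: there is no per-block `free`, so objects with non-trivial destructors would still need their destructors run by someone before `reset()`.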

Achilleas Margaritis

9:27 p.m.

Simonson, Lucanus J wrote:

...

Achilleas Margaritis wrote:

...
...
Jumping in here, the problem is that having to mix GC code, which automatically releases memory in a non-deterministic way, and normal RAII destructor code, which releases all resources in a deterministic way, creates two different syntaxes in a C++ program. This would currently require the programmer to "know" which type of destruction belongs with which object. I think the overhead of knowing which type of destruction belongs to which object is minimal when compared to the overhead of manually tracking memory. Resources like memory should be tracked by a GC, resources like mutexes or files should be tracked by RAII.

What I like in C++ is that I can use the best of both worlds at the same time.

Really? Memory should be tracked by GC and not RAII? Why? The only reason GC is even acceptable for managing memory is because memory is a very plentiful resource in most systems. RAII covers 90% of cases of memory allocation and in the other 10% I don't find managing it manually to be too much of a burden, and if I wanted to be lazy I could use shared pointers to do the reference counting instead of using a GC library with equal extra effort. I don't want to be lazy, and I don't want to debug problems in code that mixes and matches GC and non-GC objects and leaves it very ambiguous whether someone forgot to call delete.

I agree with you. As I said above, and in the readme, I consider GC a complementary solution to existing solutions. The reason for using a GC is to be relieved from having to think about potential referential cycles. One nasty problem is cycles inserted via subclassing. These can go undetected very easily, especially if the subclass containing the cycle is introduced indirectly (this is a real-world issue that has happened at my company several times).
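A small sketch of how a subclass can introduce such a cycle with reference counting, using `std::shared_ptr` (the modern equivalent of `boost::shared_ptr`). The class names are hypothetical. The base class looks harmless on its own; the cycle only appears once a derived class stores a strong pointer back to its parent. The example breaks the cycle by hand at the end so that the sketch itself does not leak.

```cpp
#include <cassert>
#include <memory>

// Hypothetical classes for illustration.
struct Node {
    virtual ~Node() {}
    std::shared_ptr<Node> child;
};

// Added later: quietly keeps a strong reference back to the parent.
struct BackLinked : Node {
    std::shared_ptr<Node> parent;
};

// Same idea, but the back reference does not own the parent.
struct WeakBackLinked : Node {
    std::weak_ptr<Node> parent;
};

// True if the parent survives after the last external reference is dropped.
inline bool strong_back_link_keeps_parent_alive() {
    auto root = std::make_shared<Node>();
    auto leaf = std::make_shared<BackLinked>();
    std::weak_ptr<Node> watch = root;
    root->child = leaf;
    leaf->parent = root;            // cycle: root -> leaf -> root
    assert(leaf.use_count() == 2);  // the local `leaf` and root->child
    root.reset();                   // drop the external reference
    const bool alive = !watch.expired();
    leaf->parent.reset();           // break the cycle so nothing leaks here
    return alive;
}

// True if the parent is destroyed once the external reference is dropped.
inline bool weak_back_link_lets_parent_die() {
    auto root = std::make_shared<Node>();
    auto leaf = std::make_shared<WeakBackLinked>();
    std::weak_ptr<Node> watch = root;
    root->child = leaf;
    leaf->parent = root;            // non-owning back reference
    root.reset();
    return watch.expired();
}
```

Nothing in `Node` hints that a derived class might close the loop, which is why this kind of cycle is easy to miss in review.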

...

...
But gc'd objects are are not supposed to touch other gc'd objects in their destructor (if you check the readme, I explicitly say that the order of finalization is random). Is there a realistic need to call another gc'd object from a destructor?

Saying your documentation states that GC objects shouldn't manipulate GC objects in their destructors is all well and good, but that won't stop people from doing it, and it won't stop their code from working when they test it and breaking later when it is used. If you could enforce such a thing with a compilation error using some template meta-programming trick that would be mildly interesting, but just saying it's not supported and letting everyone who won't bother to read the documentation find that out the hard way is not good design because it is error prone. Just because the problem can't be solved doesn't mean there isn't a real problem here.

I respectfully disagree with you. Solutions always come with trade-offs. For example, when using shared ptrs, one trade-off is referential cycles. For the GC, the trade-off is that destructors should not touch other gc'd objects. The problem can be solved partially. Since the GC knows where pointers and objects are, it can find the root objects in the object graph and start the finalization from those objects. If the boost community is interested, I can code it. The only case that is unsolvable is the case where there is a cycle, since in that case it is not possible to tell where the start of the graph is. In this case, the gc will have to make an arbitrary decision and start the finalization from a random graph node.
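The partial solution described here amounts to a topological ordering of the dead objects with an arbitrary tie-break for cycles. The following is a sketch under assumptions, not the library's code: `finalization_order` is a hypothetical function, dead objects are represented by indices, and `refs[i]` lists the dead objects that object `i` points to. An object is finalized before the objects it refers to whenever the graph allows it; when only cycles remain, the lowest-numbered object is chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// refs[i] lists the dead objects that dead object i points to.
// Returns an order in which a referrer is finalized before its referents
// wherever possible; cycles are broken at the lowest-numbered object.
// Quadratic for simplicity; a real collector would use a work list.
inline std::vector<std::size_t> finalization_order(
        const std::vector<std::vector<std::size_t>>& refs) {
    const std::size_t n = refs.size();
    std::vector<std::size_t> incoming(n, 0);
    for (const auto& targets : refs) {
        for (std::size_t t : targets) {
            assert(t < n);
            ++incoming[t];
        }
    }

    std::vector<bool> done(n, false);
    std::vector<std::size_t> order;
    while (order.size() < n) {
        std::size_t pick = n;
        for (std::size_t i = 0; i < n; ++i) {
            if (done[i]) continue;
            if (incoming[i] == 0) { pick = i; break; }  // a current root
            if (pick == n) pick = i;  // fallback: arbitrary member of a cycle
        }
        done[pick] = true;
        order.push_back(pick);
        for (std::size_t t : refs[pick])
            if (!done[t]) --incoming[t];  // pick no longer refers to t
    }
    return order;
}
```

For the cycle `0 -> 1 -> 0` with an extra root `2 -> 0`, this yields the order 2, 0, 1: the root goes first, and then object 0 is finalized while object 1 still points to it, which is exactly the case the thread calls unsolvable.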

Simonson, Lucanus J

9:49 p.m.

Achilleas Margaritis wrote:

...

The reason for using a GC is to be relieved from having to think about potential referential cycles. One nasty problem is cycles inserted via subclassing. These can go undetected very easily, especially if the subclass containing the cycle is introduced indirectly (this is a real-world issue that has happened at my company several times).

--snip--

...

The only case that is unsolvable is the case where there is a cycle, since in that case it is not possible to tell where is the start of the graph. In this case, the gc will have to take an arbitrary decision and start the finalization from a random graph node.

I may not understand all the issues as well as you clearly do, but it appears to me you are contradicting yourself. You state that GC frees you from having to think about reference cycles, but you don't currently allow such references, and if you did you wouldn't be able to resolve cycles anyway. I don't see how any problem is actually solved here. I am already free from having to think about reference cycles because I avoid the design patterns that lead to them. For code that already works that way I don't know that introducing GC into the mix to try to rescue it will make the soup taste any better. Man, I've made some bad soup that way, just keep adding good ingredients hoping it will make it taste less terrible, but all you get in the end is even more bad tasting soup. For code that is already broken, it is probably better to take something out of the software design than add something new (or a new overload of new, as the case may be.) Luke

Achilleas Margaritis

10:45 p.m.

Simonson, Lucanus J wrote:

...

Achilleas Margaritis wrote:

...
The reason for using a GC is to be relieved from having to think about potential referential cycles. One nasty problem is cycles inserted via subclassing. These can go undetected very easily, especially if the subclass containing the cycle is introduced indirectly (this is a real-world issue that has happened at my company several times).

--snip--

...
The only case that is unsolvable is the case where there is a cycle, since in that case it is not possible to tell where is the start of the graph. In this case, the gc will have to take an arbitrary decision and start the finalization from a random graph node.

I may not understand all the issues as well as you clearly do, but it appears to me you are contradicting yourself. You state that GC frees you from having to think about reference cycles, but you don't currently allow such references, and if you did you wouldn't be able to resolve cycles anyway. I don't see how any problem is actually solved here. I am already free from having to think about reference cycles because I avoid the design patterns that lead to them. For code that already works that way I don't know that introducing GC into the mix to try to rescue it will make the soup taste any better. Man, I've made some bad soup that way, just keep adding good ingredients hoping it will make it taste less terrible, but all you get in the end is even more bad tasting soup. For code that is already broken, it is probably better to take something out of the software design than add something new (or a new overload of new, as the case may be.)

The collector allows for cycles. What it does not currently allow is referencing gc'd objects in the destructor.

Jonas Persson

17 Apr 17 Apr

2:49 p.m.

Achilleas Margaritis skrev:

...

Can you please elaborate on the problem of destructors? I have read some papers that say that gc and destructors don't play well together, but I did not really agree with that. An example would be appreciated.

One example is that, by default, your gc does not handle cyclic dependencies correctly because destructors are run when collected. The following example gives an access violation on Windows:

----------------------------------------------
#include <iostream>
#include <string>

class Foo : public gc_object<Foo> {
public:
    gc_ptr<Foo> m_next;
    std::string m_name;

    Foo(std::string name) : m_name(name) {}

    // Calls into another gc'd object that may already have been finalized.
    ~Foo() { m_next->lastGoodbye(); }

    void lastGoodbye() { std::cout << "Bye " << m_name; }
};

void test_cycle() {
    gc_ptr<Foo> foo1 = new Foo("foo 1");
    foo1->m_next = new Foo("foo 2");
    foo1->m_next->m_next = foo1;  // foo 1 -> foo 2 -> foo 1

    foo1 = 0;       // drop the last root reference to the cycle
    gc::collect();  // both objects are finalized, in no particular order
}
----------------------------------------------

/ Jonas

Mathias Gaunard

18 Apr 18 Apr

11:16 a.m.

Jonas Persson wrote:

...

The following example gives an access violation on windows:

---------------------------------------------- class Foo : public gc_object<Foo> { public: gc_ptr<Foo> m_next; std::string m_name;

Foo(std::string name) : m_name(name) {}

~Foo() { m_next->lastGoodbye(); }

void lastGoodbye() { std::cout << "Bye " << m_name; } }; void test_cycle() { gc_ptr<Foo> foo1 = new Foo("foo 1"); foo1->m_next = new Foo("foo 2"); foo1->m_next->m_next = foo1;

foo1 = 0; gc::collect(); }

So that library overloads operator new, meaning all memory becomes garbage collected, unlike what I understood from what was answered to one of my other posts. I even wonder how gc_ptr<T>::gc_ptr(T*) can be a safe constructor at all. A fairly terrible design, that has a lot of shortcomings, some of which were raised when talking of Boehm GC (basically, bad integration). The funny thing is that the right way to do GC in C++ has been known for a long time, but for some reason people still want to do it some other way. C++/CLI, despite its evilness, does it right. Bastek's library (SGCL) seems to do it somewhat right too.

Achilleas Margaritis

20 Apr 20 Apr

5:07 p.m.

Mathias Gaunard wrote:

...

Jonas Persson wrote:

...
The following example gives an access violation on windows:

---------------------------------------------- class Foo : public gc_object<Foo> { public: gc_ptr<Foo> m_next; std::string m_name;

Foo(std::string name) : m_name(name) {}

~Foo() { m_next->lastGoodbye(); }

void lastGoodbye() { std::cout << "Bye " << m_name; } }; void test_cycle() { gc_ptr<Foo> foo1 = new Foo("foo 1"); foo1->m_next = new Foo("foo 2"); foo1->m_next->m_next = foo1;

foo1 = 0; gc::collect(); }

So that library overloads operator new, meaning all memory becomes garbage collected, unlike what I understood from what was answered to one of my other posts.

No, it does not overload the global operator new; it only overloads operator new for the classes you want to be garbage collected.

...

I even wonder how gc_ptr<T>::gc_ptr(T*) can be a safe constructor at all.

What do you mean exactly?

...

A fairly terrible design, that has a lot of shortcomings, some of which were raised when talking of Boehm GC (basically, bad integration).

What shortcomings do you specifically refer to?

Mathias Gaunard

21 Apr 21 Apr

4:58 p.m.

Achilleas Margaritis wrote:

...

...
I even wonder how gc_ptr<T>::gc_ptr(T*) can be a safe constructor at all.

What do you mean exactly?

What if I do something like Foo foo; gc_ptr<Foo> bar = &foo; ?

...

...
A fairly terrible design, that has a lot of shortcomings, some of which were raised when talking of Boehm GC (basically, bad integration).

What shortcomings do you specifically refer to?

I didn't understand you were overloading operator new for the class, I thought it was a global overloading.

jbosboom＠uci.edu

5:45 p.m.

...

What if I do something like

Foo foo; gc_ptr<Foo> bar = &foo;

?

I'd expect it to be the same as Foo* foo() { Foo foo; return &foo; } Don't expect a GC to save you from that one; you asked for stack duration when you declared the Foo object. --Jeffrey Bosboom

Achilleas Margaritis

8:55 p.m.

Mathias Gaunard wrote:

...

Achilleas Margaritis wrote:

...
...
I even wonder how gc_ptr<T>::gc_ptr(T*) can be a safe constructor at all.

What do you mean exactly?

What if I do something like

Foo foo; gc_ptr<Foo> bar = &foo;

?

It's ok, nothing will happen. The collector, as it is stated in the readme, recognizes garbage-collected blocks. If a gc ptr points to a non-gc object, then the ptr is simply ignored.

Achilleas Margaritis

20 Apr 20 Apr

5:03 p.m.

Jonas Persson wrote:

...

Achilleas Margaritis skrev:

...
Can you please elaborate on the problem of destructors? I have read some papers that say that gc and destructors don't play well together, but I did not really agree with that. An example would be appreciated.

One example is that, by default, your gc does not handle cyclic dependencies correctly because destructors are run when collected.

The following example gives an access violation on windows:

---------------------------------------------- class Foo : public gc_object<Foo> { public: gc_ptr<Foo> m_next; std::string m_name;

Foo(std::string name) : m_name(name) {}

~Foo() { m_next->lastGoodbye(); }

void lastGoodbye() { std::cout << "Bye " << m_name; } }; void test_cycle() { gc_ptr<Foo> foo1 = new Foo("foo 1"); foo1->m_next = new Foo("foo 2"); foo1->m_next->m_next = foo1;

foo1 = 0; gc::collect(); } ----------------------------------------------

But gc'd objects are not supposed to touch other gc'd objects in their destructor (if you check the readme, I explicitly say that the order of finalization is random). Is there a realistic need to call another gc'd object from a destructor?

Mathias Gaunard

21 Apr 21 Apr

12:06 a.m.

Achilleas Margaritis wrote:

...

But gc'd objects are not supposed to touch other gc'd objects in their destructor (if you check the readme, I explicitly say that the order of finalization is random). Is there a realistic need to call another gc'd object from a destructor?

Drop in replacement, or ability to use with or without GC.

Esben Mose Hansen

6:13 a.m.

On Monday 20 April 2009 19:03:59 Achilleas Margaritis wrote:

...

But gc'd objects are not supposed to touch other gc'd objects in their destructor (if you check the readme, I explicitly say that the order of finalization is random). Is there a realistic need to call another gc'd object from a destructor?

I think the tradeoffs are reasonable. However, would it be possible to have an assert if a gc_ptr is accessed (in a destructor) during garbage collection? That would improve the safety immensely, I think. -- Kind regards, Esben
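One way such an assert could look, as a rough single-threaded sketch. Everything in the `sketch` namespace is hypothetical and is not the API of the library under discussion: the collector would open a `SweepScope` around the phase in which it runs destructors, and the pointer's `operator->` would assert that no sweep is in progress.

```cpp
#include <cassert>

// Hypothetical names throughout; not the API of the library under discussion.
namespace sketch {

// True while the (imaginary) collector is running destructors.
// A real multi-threaded collector would need a thread-aware flag.
inline bool& sweeping() {
    static bool flag = false;
    return flag;
}

// RAII scope the collector would open around its sweep phase.
class SweepScope {
public:
    SweepScope() : previous_(sweeping()) { sweeping() = true; }
    ~SweepScope() { sweeping() = previous_; }

private:
    bool previous_;
};

template <class T>
class checked_ptr {
public:
    explicit checked_ptr(T* p = nullptr) : ptr_(p) {}

    // Fails loudly in debug builds if used while destructors are being run.
    T* operator->() const {
        assert(!sweeping() && "gc pointer dereferenced during collection");
        return ptr_;
    }

    T* get() const { return ptr_; }  // unchecked access

private:
    T* ptr_;
};

struct Node {
    int value;
};

}  // namespace sketch
```

This catches the direct case (a destructor dereferencing a gc pointer) only in builds where asserts are enabled, and it would also fire for perfectly safe accesses made by unrelated code during a sweep, so it is a debugging aid rather than a guarantee.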

Jonas Persson

9:25 a.m.

Achilleas Margaritis skrev:

...

But gc'd objects are not supposed to touch other gc'd objects in their destructor (if you check the readme, I explicitly say that the order of finalization is random). Is there a realistic need to call another gc'd object from a destructor?

This isn't about random finalization. It is about the fact that if I have a reference to a gc object, I want to be able to rely on it being alive until I release it. I don't want it to be removed from under my feet because I was chosen as the random victim in a circular dependency break-up. A gc object touching another gc'd object in its destructor may not be common, but the same problem appears even if that other gc object is accessed inside the destructor of any contained child object, or indeed in any function called by any destructor in any object in the dependency graph. This is not going to be easy to see. I think you would need to add some sort of guard against this usage, like an assert if any gc_ptr is accessed during garbage collection, as Esben proposed. But even then you have to be careful because that will not detect everything. For example, shared_ptr members are often used to make sure that the lifetime of objects is extended long enough and are never accessed directly. Dropping in gc_ptr in that case will bite you hard. From what I have seen so far, I think the only safe usage of gc is to require a design with no circular dependencies. These are all general issues with garbage collection, and not specifically about your implementation. I don't really want to discourage you from continuing your efforts to build a gc library, but to be useful in a C++ context all this has to be solved somehow. / Jonas

Gregory Peele ARA/CFD

1:33 p.m.

To chime in with a couple thoughts (I'm not expert on the topic though.) Doesn't Java deal with finalization issues in a sane way? If I recall, they guarantee that finalize() will be run at most once on an object after it becomes unreachable, but an object in a reference cycle isn't actually destroyed and collected until all objects in that cycle have been finalized (or more technically when there are no more active references to that object, including other objects pending finalization.) I have no clue how hard this would be to implement in C++ or even if it makes sense to try, especially as finalizers are intended to catch cases that are better addressed with RAII. Even in Java, they are discouraged in favor of try / finally clauses. Java also permits the finalizers to make the object active again, which I think is a rather pathological case. In favor of garbage collection in general, though - for some use cases, especially those that don't use non-memory resources, it makes a lot of sense. The Boehm guys claim that garbage collection can be a significant performance win for workloads involving large numbers of small objects, as compared to C++ new / delete - especially in multithreaded environments. It can also greatly speed up prototyping complex geometric or graph data structures, where reference counting (especially non-intrusively) is a huge performance / memory hit and playing the game of "who owns the pointer?" takes more time than actually developing the algorithm for your problem domain. Just my $0.02. Gregory Peele, Jr. Applied Research Associates, Inc. -----Original Message----- From: boost-bounces@lists.boost.org on behalf of Jonas Persson Sent: Tue 4/21/2009 5:25 AM To: boost@lists.boost.org Subject: Re: [boost] request for interest in a garbage collection library Achilleas Margaritis skrev:

...

But gc'd objects are not supposed to touch other gc'd objects in their destructor (if you check the readme, I explicitly say that the order of finalization is random). Is there a realistic need to call another gc'd object from a destructor?

joaquin＠tid.es

2:13 p.m.

Gregory Peele ARA/CFD escribió:

...

In favor of garbage collection in general, though - for some use cases, especially those that don't use non-memory resources, it makes a lot of sense. The Boehm guys claim that garbage collection can be a significant performance win for workloads involving large numbers of small objects, as compared to C++ new / delete - especially in multithreaded environments.

AFAICS, you can have an implementation of delete where "delete p" simply calls the destructor of the object pointed to by p and then passes p to an internal garbage collector that will reclaim the memory in due time. This way you have deterministic resource liberation *and* GC speed. My point behind this is: you can have memory management as fast as any GC can achieve and still retain RAII style. No need to force users to resort to GC for performance's sake. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo
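A minimal sketch of that scheme, with loud assumptions: `DeferredHeap`, `create`, `destroy` and `reclaim` are hypothetical names, the "internal garbage collector" is reduced to a list of blocks that is released in one batch, and only types with default alignment are handled. The destructor runs immediately, so non-memory resources are released deterministically; only the raw storage is deferred.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical heap for illustration; handles default-aligned types only.
class DeferredHeap {
public:
    ~DeferredHeap() { reclaim(); }

    template <class T, class... Args>
    T* create(Args&&... args) {
        void* raw = ::operator new(sizeof(T));
        try {
            return new (raw) T(std::forward<Args>(args)...);
        } catch (...) {
            ::operator delete(raw);  // constructor threw: give the storage back
            throw;
        }
    }

    // Runs the destructor now and queues the storage for a later batch.
    // `p` must be exactly the pointer that create<T>() returned.
    template <class T>
    void destroy(T* p) {
        assert(p != nullptr);
        p->~T();
        pending_.push_back(p);
    }

    // Releases all queued storage at once; returns the number of blocks.
    std::size_t reclaim() {
        const std::size_t count = pending_.size();
        for (void* raw : pending_) ::operator delete(raw);
        pending_.clear();
        return count;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<void*> pending_;
};

// Records when its destructor runs.
struct Probe {
    explicit Probe(bool* destroyed) : destroyed_(destroyed) {}
    ~Probe() { *destroyed_ = true; }
    bool* destroyed_;
};
```

Whether the batched release is actually faster than freeing each block immediately depends on the underlying allocator; the sketch shows the mechanism, not a measured benefit.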

Achilleas Margaritis

2:47 p.m.

...

AFAICS, you can have an implementation of delete where "delete p" simply calls the destructor of the object pointed to by p and then passes p to an internal garbage collector that will reclaim the memory in due time. This way you have deterministic resource liberation *and* GC speed.

joaquin＠tid.es

2:57 p.m.

Achilleas Margaritis escribió:

...

...
AFAICS, you can have an implementation of delete where "delete p" simply calls the destructor of the object pointed to by p and then passes p to an internal garbage collector that will reclaim the memory in due time. This way you have deterministic resource liberation *and* GC speed.

If you do that, what is the point of keeping the memory occupied by 'p' around? the object will be destroyed anyway and therefore it will not be of any use.

The point is that deferring memory release to the next GC round can be faster than releasing every little piece of memory immediately, at least in some scenarios: http://www.hpl.hp.com/personal/Hans_Boehm/gc/#details Take into account that free(x) is by no means free in terms of execution time (no pun intended) --some bookkeeping has to be done by the internal memory manager when freeing a block of memory. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

Edouard A.

3:10 p.m.

On Tue, 21 Apr 2009 16:57:33 +0200, joaquin@tid.es wrote:

...

The point is that deferring memory release to the next GC round can be faster than releasing every little piece of memory immediately, at least in some scenarios:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/#details

Take into account that free(x) is by no means free in terms of execution time (no pun intended) --some bookkeeping has to be done by the internal memory manager when freeing a block of memory.

A decent memory allocator avoids actually freeing memory when you call free/delete. Instead, the block will be marked as available and future memory allocations may use this block (if it fits). This operation is inexpensive. Depending on your memory constraints, actual deallocation may never even happen (see the STLPort default allocator). Memory pools can go further and be even more efficient because all blocks have the same size, making reuse very easy and efficient. Garbage collectors just displace the memory management problem. Large applications still need careful design. No silver bullet. I'm very doubtful about the benefits of a C++ garbage collector but I'd be very happy to be proven wrong. -- EA
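The fixed-size pool described above can be sketched as follows. `FixedPool` is a hypothetical class written for illustration (it is not Boost.Pool or the STLPort allocator): because every block has the same size, releasing a block is a push onto a free list and allocating is a pop, and no memory goes back to the system until the pool itself is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size block pool for illustration.
class FixedPool {
    struct FreeNode {
        FreeNode* next;
    };

public:
    FixedPool(std::size_t block_size, std::size_t block_count)
        : block_size_(round_up(block_size)),
          storage_(block_size_ * block_count),
          free_list_(nullptr) {
        // Thread every block onto the free list, lowest address on top.
        for (std::size_t i = block_count; i-- > 0;)
            release(storage_.data() + i * block_size_);
    }

    // Pops a block from the free list; nullptr when the pool is exhausted.
    void* allocate() {
        if (free_list_ == nullptr) return nullptr;
        FreeNode* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    // Pushes a block back; the block must have come from this pool.
    void release(void* block) {
        assert(block != nullptr);
        free_list_ = new (block) FreeNode{free_list_};
    }

private:
    // Blocks must be able to hold a FreeNode and stay suitably aligned.
    static std::size_t round_up(std::size_t n) {
        const std::size_t align = alignof(std::max_align_t);
        if (n < sizeof(FreeNode)) n = sizeof(FreeNode);
        return (n + align - 1) / align * align;
    }

    std::size_t block_size_;
    std::vector<unsigned char> storage_;
    FreeNode* free_list_;
};
```

The most recently released block is the first one handed out again, which also tends to be friendly to the cache.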

Achilleas Margaritis

8:58 p.m.

joaquin@tid.es wrote:

...

Achilleas Margaritis escribió:

...
...
AFAICS, you can have an implementation of delete where "delete p" simply calls the destructor of the object pointed to by p and then passes p to an internal garbage collector that will reclaim the memory in due time. This way you have deterministic resource liberation *and* GC speed.

If you do that, what is the point of keeping the memory occupied by 'p' around? the object will be destroyed anyway and therefore it will not be of any use.

The point is that deferring memory release to the next GC round can be faster than releasing every little piece of memory immediately, at least in some scenarios:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/#details

Take into account that free(x) is by no means free in terms of execution time (no pun intended) --some bookkeeping has to be done by the internal memory manager when freeing a block of memory.

This is valid for Boehm gc only, because the Boehm gc releases whole pages at a time. Using a boost pool is a very good solution and is quite fast, on par with Boehm gc.

Gregory Peele ARA/CFD

3:03 p.m.

I agree that the internal GC approach could potentially have the same performance benefits as a traditional GC for the workload I described, especially with some cleverness in lock-free algorithms to handle multithreading. I assume it would probably perform quite a bit better actually, since it doesn't have to solve the finalization problem. As mentioned, so would a well written pool allocator - these are just two different approaches to amortizing memory allocation and deallocation costs. Of course, an internal GC or any pool allocator techniques I've heard of don't help with the huge developer and maintainer cost of correctly implementing pointer ownership rules for complex cyclical data structures without paying the CPU and memory cost of thread-safe cycle-safe reference counting. I spent far more time solving memory leaks, dangling pointers, and invalid memory dereferences than I did implementing, testing, and correcting actual geometry logic for my Delaunay triangulation package, and my only used resource was memory so I didn't need deterministic finalization. Increasing developer productivity when used for the right task is where a traditional GC really shines, I think. Gregory Peele, Jr. Applied Research Associates, Inc. -----Original Message----- From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Achilleas Margaritis Sent: Tuesday, April 21, 2009 10:48 AM To: boost@lists.boost.org Subject: Re: [boost] request for interest in a garbage collection library

...

AFAICS, you can have an implementation of delete where "delete p" simply calls the destructor of the object pointed to by p and then passes p to an internal garbage collector that will reclaim the memory in due time. This way you have deterministic resource liberation *and* GC speed.

If you do that, what is the point of keeping the memory occupied by 'p' around? The object will be destroyed anyway and therefore it will not be of any use. By using pool allocators, allocation/deallocation can become as fast as when using a garbage collector.

Simonson, Lucanus J

4:44 p.m.

New subject: discussion of garbage collection in C++

Gregory Peele ARA/CFD wrote:

...

In favor of garbage collection in general, though - for some use cases, especially those that don't use non-memory resources, it makes a lot of sense. The Boehm guys claim that garbage collection can be a significant performance win for workloads involving large numbers of small objects, as compared to C++ new / delete - especially in multithreaded environments. It can also greatly speed up prototyping complex geometric or graph data structures, where reference counting (especially non-intrusively) is a huge performance / memory hit and playing the game of "who owns the pointer?" takes more time than actually developing the algorithm for your problem domain.

--snip discussion of performance--

...

Of course, an internal GC or any pool allocator techniques I've heard of don't help with the huge developer and maintainer cost of correctly implementing pointer ownership rules for complex cyclical data structures without paying the CPU and memory cost of thread-safe cycle-safe reference counting. I spent far more time solving memory leaks, dangling pointers, and invalid memory dereferences than I did implementing, testing, and correcting actual geometry logic for my Delaunay triangulation package, and my only used resource was memory so I didn't need deterministic finalization. Increasing developer productivity when used for the right task is where a traditional GC really shines, I think.

I prefer to have a graph object that owns all graph nodes and deallocates them all in its destructor than have "liberated" graph nodes that can be assigned between different graphs. Stl style is already providing the big productivity boost that GC claims. Since the graph and geometry data structures are small and many in number it is better (in terms of both space and time) to store them in vectors and copy them around than allocate them dynamically through any mechanism including garbage collection. I don't think GC offers any advantage over my current style of graph or geometry programming in terms of either productivity or performance of the code generated. Cyclical ownership is just bad design. All the objects in the cycle have the same "life" and an object that encapsulates that relationship and can be used to scope them as a unit is the obvious solution, not GC. If the argument for GC is because it makes bad design less bad, then it can't win over the argument for good design.

I never have problems with leaks in my graph or geometry algorithms because I never type new and delete except in the rarest of circumstances, and even then the ownership is clear and they are deallocated in the destructor of an object, making the code exception safe. GC solves a problem that does not exist for me. If other people still have this problem they should learn how to apply C++ in a way that doesn't lead to such problems, not rely on GC to let them implement sloppy design. C++ can be as productive to program in as Java if you use it well, and memory pooling with RAII will always outperform GC.

Luke
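The ownership style described in this message can be sketched roughly as follows. This is only an illustration and not code from the thread: `Graph`, `Node`, `add_node`, `add_edge`, and `make_cycle` are hypothetical names, and storing edges as indices into the owning vector is one assumed way to represent cycles without any node owning another.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type: neighbors are indices into the owning graph,
// so a node never owns (or points directly at) another node.
struct Node {
    int value;
    std::vector<std::size_t> neighbors;
};

// The graph owns every node by value. Destroying the graph releases all
// nodes at once, and user code never calls new or delete.
class Graph {
public:
    std::size_t add_node(int value) {
        nodes_.push_back(Node{value, {}});
        return nodes_.size() - 1;
    }

    void add_edge(std::size_t from, std::size_t to) {
        assert(from < nodes_.size() && to < nodes_.size());
        nodes_[from].neighbors.push_back(to);
    }

    const Node& node(std::size_t i) const { return nodes_.at(i); }
    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<Node> nodes_;
};

// Builds a three-node cycle a -> b -> c -> a and returns the graph by value.
inline Graph make_cycle() {
    Graph g;
    const std::size_t a = g.add_node(1);
    const std::size_t b = g.add_node(2);
    const std::size_t c = g.add_node(3);
    g.add_edge(a, b);
    g.add_edge(b, c);
    g.add_edge(c, a);
    return g;
}
```

Because the edges are indices rather than pointers, a `Graph` can be copied or moved as a unit and the cycle stays valid in the copy.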

Bastek

6:08 p.m.

New subject: discussion of garbage collection in C++

On 21 Kwi, 18:44, "Simonson, Lucanus J" <lucanus.j.simon...@intel.com> wrote:

...

I prefer to have a graph object that owns all graph nodes and deallocates them all in its destructor than have "liberated" graph nodes that can be assigned between different graphs. Stl style is already providing the big productivity boost that GC claims. Since the graph and geometry data structures are small and many in number it is better (in terms of both space and time) to store them in vectors and copy them around than allocate them dynamically through any mechanism including garbage collection. I don't think GC offers any advantage over my current style of graph or geometry programming in terms of either productivity or performance of the code generated. Cyclical ownership is just bad design. All the objects in the cycle have the same "life" and an object that encapsulates that relationship and can be used to scope them as a unit is the obvious solution, not GC. If the argument for GC is because it makes bad design less bad, then it can't win over the argument for good design.

I never have problems with leaks in my graph or geometry algorithms because I never type new and delete except in the rarest of circumstances, and even then the ownership is clear and they are deallocated in the destructor of an object, making the code exception safe. GC solves a problem that does not exist for me. If other people still have this problem they should learn how to apply C++ in a way that doesn't lead to such problems, not rely on GC to let them implement sloppy design. C++ can be as productive to program in as Java if you use it well, and memory pooling with RAII will always outperform GC.

struct User;
struct Group { std::vector<User*> users; };

struct User { Group* group; };

Taking into account that the data can be used in multiple modules, do you know when and where to release the memory for objects of this type?

The C++ language imposes restrictions on mapping database structures because it does not have a garbage collector.

Simonson, Lucanus J

6:31 p.m.

New subject: discussion of garbage collection in C++

Bastek wrote:

...

struct User; struct Group { std::vector<User*> users; };

struct User { Group* group; };

Taking into account that the data can be used in multiple modules, do you know when and where to release the memory for objects of this type?

The C++ language imposes restrictions on mapping database structures because it does not have a garbage collector.

Yes. The problem arises because the design is incomplete. Let me finish it for you:

struct DataBase { std::list<User> users; std::list<Group> groups; };

You release the memory when you release the data base and the code is exception safe and won't leak if you need to destroy the entire database due to a fatal error like screwing up the integrity of those pointers. You can also add and remove users and groups incrementally, of course. Not associating them with a database object is like associating them to a global object. That is bad design. This way you need to know which database you are working with by passing it into the functions that manipulate it (or make them member functions if you are old school), but not having a database object is not having an OO design. I never need to type new and delete to use this design. There is no question that writing C-style code in C++ is harder than writing Java, but that's not a fair comparison.

Regards, Luke
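A slightly fuller sketch of this design might look like the code below. The `name` members and the `add_group` / `add_user` helpers are assumptions added for illustration and are not part of the original post. `std::list` is used, as in the post, because inserting into a list does not invalidate pointers to existing elements.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

struct User;

struct Group {
    std::string name;
    std::vector<User*> users;  // non-owning: the DataBase owns the users
};

struct User {
    std::string name;
    Group* group;              // non-owning back-pointer
};

// The database owns all users and groups. Destroying it releases everything,
// including the User <-> Group cycle, without any new/delete in user code.
struct DataBase {
    std::list<User> users;
    std::list<Group> groups;

    DataBase() = default;
    // Copying would leave the internal pointers aimed at the source object,
    // so this sketch simply forbids it.
    DataBase(const DataBase&) = delete;
    DataBase& operator=(const DataBase&) = delete;

    // Hypothetical helper: adds a group and returns a stable reference.
    Group& add_group(const std::string& name) {
        groups.push_back(Group{name, {}});
        return groups.back();
    }

    // Hypothetical helper: adds a user and links it to a group in this database.
    User& add_user(const std::string& name, Group& g) {
        users.push_back(User{name, &g});
        g.users.push_back(&users.back());
        return users.back();
    }
};

// Creates a database with one group and two users, then lets it go out of scope.
inline std::size_t demo_group_size() {
    DataBase db;
    Group& admins = db.add_group("admins");
    db.add_user("alice", admins);
    db.add_user("bob", admins);
    assert(admins.users.front()->group == &admins);
    return admins.users.size();  // db releases all users and groups on return
}
```

Removing a single user would additionally require erasing its pointer from the group's `users` vector, which this sketch leaves out.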

Gregory Peele ARA/CFD

7:13 p.m.

New subject: discussion of garbage collection in C++

This approach obviously works, and is simple for toy cases like this. It becomes quite a bit less simple in real software. Yes, every software design should be capable of being broken up in such a way. But sometimes doing so is pure hell on your interface encapsulation, or comes with highly nontrivial costs, or maybe you'd rather use the GC to accomplish the same task in 1 month instead of 4 months so you can spend more of your time on stuff your customer actually cares about. I'm not going to dispute that it's always possible to design good software packages without a GC, and I think that you should always consider non-GC designs first (not that we have much of a choice in C++ at this point, which is the problem under discussion.) But especially for prototypes where proving algorithm feasibility is more important than clean design, or in the real world where we don't have unlimited time to develop and deliver our software using developers who aren't necessarily C++ programming gods, a GC could make life much easier by ensuring that we could rely on "good enough" automated memory management. If Boost developers feel that such a thing isn't appropriate for Boost's mission, that's fine. Boost doesn't have to be everything to everyone. However, to say that it wouldn't be useful or is somehow "wrong" is completely misleading, and it's very typical of the "I don't need it so therefore nobody should ever use it" attitude that is so common. Other languages don't use mandatory or opt-in GC strategies because their designers are stupid. They use them because it meets a need that RAII manual memory management doesn't. Of course, RAII manual memory management meets different needs that a GC doesn't. Use the best tool for the job... Gregory Peele, Jr. 
Applied Research Associates, Inc.

Edouard A.

7:50 p.m.

New subject: discussion of garbage collection in C++

Hi Gregory, GC is one approach to resource management. It works very well in many cases. As you seem to be versed in the area, you also know that it can go horribly wrong, resulting in terrible performance and massive memory usage. These cases must be debugged, take a lot of time, and can hit you very late in the project (read: when switching to production). GC will not excuse you from designing your project correctly and being careful with resource management. A GC is a program, and as such, has to be correctly used to function properly. There is no "magic". You still need to plan in advance, be rigorous and anticipate the bandwidth. RAII and smart pointers are a different approach to resource management. They work very well in many cases. Like a GC, they need to be used correctly to function properly.

...

This approach obviously works, and is simple for toy cases like this. It becomes quite a bit less simple in real software. Yes, every software design should be capable of being broken up in such a way. But sometimes doing so is pure hell on your interface encapsulation, or comes with highly nontrivial costs, or maybe you'd rather use the GC to accomplish the same task in 1 month instead of 4 months so you can spend more of your time on stuff your customer actually cares about.

I would like to be presented with a C++ project that takes 1 month with a GC and 4 months with RAII. That would be most surprising. Not that I have encountered every possible project in my life, but I'm extremely surprised. I've worked on many different projects and had resource management issues every time; with garbage-collected languages they took a different form, but they occurred nevertheless.

...

I'm not going to dispute that it's always possible to design good software packages without a GC, and I think that you should always consider non-GC designs first (not that we have much of a choice in C++ at this point, which is the problem under discussion.) But especially for prototypes where proving algorithm feasibility is more important than clean design, or in the real world where we don't have unlimited time to develop and deliver our software using developers who aren't necessarily C++ programming gods, a GC could make life much easier by ensuring that we could rely on "good enough" automated memory management.

C++ is not a silver bullet and maybe your solution needs to be implemented with a different tool. Kind regards. -- Edouard Alligand Partner Bureau 14 SARL - http://www.bureau14.fr/

Gregory Peele ARA/CFD

8:23 p.m.

New subject: discussion of garbage collection in C++

Edouard, I appreciate your response. You made the point that I was trying to make in a much clearer way - GC and RAII are both useful techniques that have their place and drawbacks. Many other responses in this thread seem to think that RAII solves everything without any drawback, and there is no room or usefulness for other techniques. The overall message I wanted to convey is that GC should be an option available to C++ programmers, and it's to our benefit to have a well-designed, tested, and reviewed shared library for it. It's definitely not a magic bullet, but when used in the correct way, it's a useful tool in a designer's toolbox. It can also make a technical lead's life a bit easier if his developers are primarily Java developers who are experienced with GC issues, but need C++ performance and shared libraries for CPU-bound numerical tasks. :-)

My only experience with C++ GC is Boehm 7.1, which I decided not to use because I could not get it to cooperate with Valgrind (and submitted a patch for other very basic issues I found that didn't exactly fill me with confidence), so the 1 month vs. 4 month comparison was purely hypothetical, based on much smaller scale experiences I've had with prototyping the code in Ruby vs. C++. I probably shouldn't have carelessly tossed that out there. I recently completed a task that took 4 months to design, implement, and test without a GC, the majority of which was spent designing and debugging memory ownership for a workload that is a classic example for GC advocates (millions of small objects mostly composed of pointers with lots of reference cycles). Perhaps I would have run into different memory management issues with a GC - as you pointed out, they come with plenty of traps of their own. And solving GC memory leaks isn't necessarily easier than manual memory leaks. I would have liked to have the choice, though.

Gregory Peele, Jr.
Applied Research Associates, Inc.

Achilleas Margaritis

9:05 p.m.

New subject: discussion of garbage collection in C++

I believe that as well. GC in C++ is just another tool to solve a specific set of problems. I do not think existing solutions that work well should be replaced with GC. But sometimes it is nice to have, especially when development time is short.

Simonson, Lucanus J

7:59 p.m.

New subject: discussion of garbage collection in C++

Gregory Peele ARA/CFD wrote:

...

I'm not going to dispute that it's always possible to design good software packages without a GC, and I think that you should always consider non-GC designs first (not that we have much of a choice in C++ at this point, which is the problem under discussion.) But especially for prototypes where proving algorithm feasibility is more important than clean design, or in the real world where we don't have unlimited time to develop and deliver our software using developers who aren't necessarily C++ programming gods, a GC could make life much easier by ensuring that we could rely on "good enough" automated memory management.

Bjarne's definition of C++ was "C++ is a library and systems programming language." I don't know if someone has filled in the blank with a better adjective yet. Currently we also use it for application level programming because it is pretty good for that too, but we also use heterogeneous language systems where python, lua or *shudder* tcl runtime environments run on top of a C++ foundation and call C++ library and system interfaces through bindings. It would be very nice, and I mentioned this to Bjarne when he was last at Intel, to have a C++ runtime environment complete with garbage collection so that we could use it for application programming, rapid prototyping and scripting. He wasn't too keen on the idea, but there are commercial C++ runtime environments out there and C# is just runtime C++ whose compatibility they broke on purpose. MS Office applications written in C++ running in interpreted mode with JIT compilation with only 15% runtime penalty was the proof point for C#. Why stop half way? Why not have an open source C++ runtime environment if the goal is applications programming and rapid prototyping in C++ with open source components? An open source GC library is a good building block for such a system, and of course we'd prefer to implement it in C++. Using the GC library to implement runtime environments for GC languages is one of the use cases where I think a GC library is a great thing to have in boost. I just don't want to see it misused. Prototypes all too often turn into products with little more effort put in than a good sales pitch.

Regards, Luke

Mathias Gaunard

23 Apr 23 Apr

12:20 a.m.

New subject: discussion of garbage collection in C++

Simonson, Lucanus J wrote:

Why stop half way? Why not have an open source C++ runtime environment if the goal is applications programming and rapid prototyping in C++ with open source components? An open source GC library is a good building block for such a system, and of course we'd prefer to implement it in C++.

Why build a runtime environment based on GC when you can make a better one that is not?

Raindog

25 Apr 25 Apr

4:23 a.m.

New subject: discussion of garbage collection in C++

Mathias Gaunard wrote:

...

Simonson, Lucanus J wrote:

Why stop half way? Why not have an open source C++ runtime environment if the goal is applications programming and rapid prototyping in C++ with open source components? An open source GC library is a good building block for such a system, and of course we'd prefer to implement it in C++.

Why build a runtime environment based on GC when you can make a better one that is not?

Because the majority of programmers have proven that they obviously prefer a GC environment over an RAII one. Given that a large number of C++ programmers don't even take advantage of RAII, I would say that a c++ runtime w/ GC and RAII would be very appealing to a larger number of people.

Cory Nelson

5:31 a.m.

New subject: discussion of garbage collection in C++

On Fri, Apr 24, 2009 at 9:23 PM, Raindog <raindog@macrohmasheen.com> wrote:

...

Because the majority of programmers have proven that they obviously prefer a GC environment over an RAII one. Given that a large number of C++ programmers don't even take advantage of RAII, I would say that a c++ runtime w/ GC and RAII would be very appealing to a larger number of people.

I think the popularity of other languages has nothing to do with GC or RAII, but because those languages give you libraries including everything but the kitchen sink and are thus usually quicker to get started with. Not to say that having optional GC would be bad. I personally don't know if I would ever want to use it, but I'm sure someone would find a good fit for it. -- Cory Nelson

Emil Dotchevski

6:09 a.m.

New subject: discussion of garbage collection in C++

On Fri, Apr 24, 2009 at 9:23 PM, Raindog <raindog@macrohmasheen.com> wrote:

...

Mathias Gaunard wrote:

...
Simonson, Lucanus J wrote:

Why stop half way? Why not have an open source C++ runtime environment if the goal is applications programming and rapid prototyping in C++ with open source components? An open source GC library is a good building block for such a system, and of course we'd prefer to implement it in C++.

Why build a runtime environment based on GC when you can make a better one that is not?

Because the majority of programmers have proven that they obviously prefer a GC environment over an RAII one.

So they can use a GC environment -- there are plenty of them to choose from. Somehow, C++ (and C) prevail and remain popular without GC. :) Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

Edward Diener

5:47 p.m.

New subject: discussion of garbage collection in C++

Raindog wrote:

...

Mathias Gaunard wrote:

...
Simonson, Lucanus J wrote:

Why stop half way? Why not have an open source C++ runtime environment if the goal is applications programming and rapid prototyping in C++ with open source components? An open source GC library is a good building block for such a system, and of course we'd prefer to implement it in C++.

Why build a runtime environment based on GC when you can make a better one that is not?

Because the majority of programmers have proven that they obviously prefer a GC environment over an RAII one. Given that a large number of C++ programmers don't even take advantage of RAII, I would say that a c++ runtime w/ GC and RAII would be very appealing to a larger number of people.

Whenever anyone writes "the majority of programmers..." one can discount the subsequent statement "the majority" of the time. .Net, Java, Python, Perl, Ruby, D et al automatically provide GC. Wonderful for non-deterministic memory management, poor to hopeless for any other form of resource management. C++ provides RAII resource management using smart pointers, with boost::shared_ptr<T> probably being used by quite a few C++ programmers worldwide. C++ is wonderful for all forms of resource management in general, but much more difficult for cross-referenced memory management, where boost::weak_ptr<T> offers one solution. Microsoft, Sun, and the designers of other languages decided that their programmers did not want to have to think about memory management at all, fair enough, and would live with half-baked or programmer hand-rolled solutions to other resource management issues. So GC, that emperor with a big hole covering his nakedness, is all the popular rage and everyone pretends that the hole does not exist or is adequately covered with a nice patch. RAII and GC in C++? Sure, as long as today GC is strictly optional. But I, who have run into resource management issues much, much more often when programming in non-C++ languages than I have run into cross-referencing memory issues when programming in C++, will stick with C++ smart pointers. Currently mixing C++ RAII and GC in C++ is a headache I do not want. Good luck to others in using them both, in other words in mixing deterministic and non-deterministic destruction and being happy that everything is used correctly ( RAII objects holding GC objects and vice versa, containers of RAII objects holding GC objects and vice versa ).
That is why I think that a real solution to using RAII and GC in C++ must come from the ability to tag objects at the class level, and only occasionally at the object level, as a non-memory resource holding object, and that the syntax for doing dynamic memory management in C++ must encompass both smart pointers and GC allocations of memory under the same mnemonics. In the ideal case a programmer should only have to create an object and then forget about having to manually delete it, and either RAII smart pointers kick in for deterministic destruction when non-memory resources have to be released or GC kicks in for non-deterministic destruction of objects when only memory is involved. That ideal case should exist for all languages, but Microsoft, Sun, and others want to pretend that GC is good enough. It isn't, and it has been one of the biggest frauds in computer programming in pretending that it is.
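As a rough illustration of the weak_ptr approach to cross-referenced objects mentioned in this message, the sketch below reuses the User/Group shape from earlier in the thread. It uses `std::shared_ptr` and `std::weak_ptr`, the standard-library counterparts of the Boost classes named above, and the function name `back_reference_expires` is made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Group;

struct User {
    // Back-reference only: a weak_ptr does not keep the Group alive,
    // so the User <-> Group cross-reference cannot form an owning cycle.
    std::weak_ptr<Group> group;
};

struct Group {
    // The group shares ownership of its users.
    std::vector<std::shared_ptr<User>> users;
};

// Returns true if the Group is destroyed as soon as its last shared_ptr
// goes away, even though a User still refers back to it.
inline bool back_reference_expires() {
    std::weak_ptr<Group> observer;
    std::shared_ptr<User> user = std::make_shared<User>();
    {
        std::shared_ptr<Group> group = std::make_shared<Group>();
        user->group = group;           // weak back-reference
        group->users.push_back(user);  // owning forward reference
        observer = group;
        assert(!observer.expired());
    }  // the only shared_ptr<Group> is gone: the Group is destroyed here
    return observer.expired() && user->group.lock() == nullptr;
}
```

If `User::group` were a `shared_ptr` instead, the two objects would keep each other alive after the scope ends, which is the leak that reference counting alone cannot resolve.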

Emil Dotchevski

7:27 p.m.

New subject: discussion of garbage collection in C++

On Sat, Apr 25, 2009 at 10:47 AM, Edward Diener <eldiener@tropicsoft.com> wrote:

...

That is why I think that a real solution to using RAII and GC in C++ must come from the ability to tag objects at the class level, and only occasionally at the object level

I agree with most of what you're saying, I myself am not a fan of GC however I do appreciate its benefits in some cases, when all of the resources being managed allow non-deterministic termination (memory being the most important such resource.) Yet, occasionally even memory needs to be managed deterministically. So I disagree with your assertion that classes should be tagged for GC, and not individual object instances. One of the best features of shared_ptr is the fact that it abstracts resource management at object instance level. I know this is wishful thinking, but the ideal solution as far as I'm concerned would be to implement GC as a custom (non-deterministic) allocation strategy, per-instance, as a custom shared_ptr allocator. Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode

Edward Diener

11:14 p.m.

New subject: discussion of garbage collection in C++

Emil Dotchevski wrote:

...

On Sat, Apr 25, 2009 at 10:47 AM, Edward Diener <eldiener@tropicsoft.com> wrote:

...
That is why I think that a real solution to using RAII and GC in C++ must come from the ability to tag objects at the class level, and only occasionally at the object level

I agree with most of what you're saying, I myself am not a fan of GC however I do appreciate its benefits in some cases, when all of the resources being managed allow non-deterministic termination (memory being the most important such resource.) Yet, occasionally even memory needs to be managed deterministically.

I agree with your last statement. See the end of my last paragraph below.

...

So I disagree with your assertion that classes should be tagged for GC, and not individual object instances.

In the ideal scheme classes which encompass non-memory resources should actually be tagged as non-GC, so RAII would automatically kick in for dynamically allocated objects of those classes; otherwise dynamic allocation uses GC. I also said that there are cases where individual object instances can be "tagged" and of course this must be allowed for any object. I just think that ideally tagging classes as RAII would cover the majority of the situations. Individual object instances, as an example, could occur when a container of RAII dynamically allocated objects is created and it is then important to "tag" the container itself as RAII, where otherwise it would be considered GC by default.

...

One of the best features of shared_ptr is the fact that it abstracts resource management at object instance level. I know this is wishful thinking, but the ideal solution as far as I'm concerned would be to implement GC as a custom (non-deterministic) allocation strategy, per-instance, as a custom shared_ptr allocator.

I don't think that it is easy to have the user of an object decide for every object whether it is RAII or GC, which one would have to do with your ideally proposed shared_ptr<> custom allocator scheme, unless of course shared_ptr were made smart enough to understand the "tag" of the type of object it is encompassing and use its custom allocator accordingly in a default situation. It is normally much easier for the class designer to know whether the class is RAII or not, and much easier for the user of an instantiated object of that class just to instantiate the object and then not worry whether its destruction is deterministic or not. The whole point of a system of dynamically allocated memory being a smart combination of deterministic RAII and non-deterministic GC is that as much as possible of the burden of determining RAII or GC should be taken off the user of objects, so the code is not littered with endless manual decisions. Certainly the designer of a class knows with 99% certitude whether an object of his class needs RAII or not. In the very few run-time cases where a class could encompass either RAII or GC objects, or both, ( STL containers are the proverbial example ) then the user of that class has to have his say. Similarly if I am allocating one million, let's say, instances of GC objects in a container, I should certainly have the ability of making that container RAII so I can release all that memory as soon as possible.
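One way to picture class-level tagging in today's C++ is a type trait that the class designer specializes. The sketch below is purely hypothetical: `holds_non_memory_resource`, `Reclaim`, `reclaim_policy`, `FileHandle`, and `Point` are invented names, and nothing here is an existing language or Boost facility. It only shows how a default of "collected" could be overridden per class.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical reclamation policies for dynamically allocated objects.
enum class Reclaim { Deterministic, Collected };

// Default tag: a class is assumed to hold only memory, so it may be collected.
template <typename T>
struct holds_non_memory_resource : std::false_type {};

// Example classes (assumptions for illustration).
struct FileHandle { int fd; };    // wraps an OS resource
struct Point { double x, y; };    // plain memory only

// The class designer tags FileHandle once, at the class level.
template <>
struct holds_non_memory_resource<FileHandle> : std::true_type {};

// Users of a class never choose: the policy follows from the class tag.
template <typename T>
constexpr Reclaim reclaim_policy() {
    return holds_non_memory_resource<T>::value ? Reclaim::Deterministic
                                               : Reclaim::Collected;
}

// The tags are resolved at compile time; they are checked at run time here
// only to keep the example easy to follow.
inline void demo_tags() {
    assert(reclaim_policy<FileHandle>() == Reclaim::Deterministic);
    assert(reclaim_policy<Point>() == Reclaim::Collected);
}
```

The per-object override discussed above (for example, forcing a large container to be released deterministically) is not modeled in this sketch.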

Emil Dotchevski

26 Apr 26 Apr

12:45 a.m.

New subject: discussion of garbage collection in C++

On Sat, Apr 25, 2009 at 4:14 PM, Edward Diener <eldiener@tropicsoft.com> wrote:

...

Emil Dotchevski wrote:

...
On Sat, Apr 25, 2009 at 10:47 AM, Edward Diener <eldiener@tropicsoft.com> wrote: One of the best features of shared_ptr is the fact that it abstracts resource management at object instance level. I know this is wishful thinking, but the ideal solution as far as I'm concerned would be to implement GC as a custom (non-deterministic) allocation strategy, per-instance, as a custom shared_ptr allocator.

I don't think that it is easy to have user of an object decide for every object whether it is RAII or GC, which one would have to do with your ideally proposed shared_ptr<> custom allocator scheme

No, the user wouldn't have a choice; the user would call a factory function:

shared_ptr<foo> create_foo( ...... );

That's the beauty of the per-instance allocation strategy supported by shared_ptr. Emil Dotchevski Reverge Studios, Inc. http://www.revergestudios.com/reblog/index.php?n=ReCode
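A minimal sketch of such a factory is shown below, assuming `std::shared_ptr` in place of `boost::shared_ptr` and a custom deleter rather than a custom allocator. It is not a garbage collector: `pending_blocks`, `reclaim`, and `deferred_deleter` are hypothetical names, and the "collector" is just a list of raw blocks freed later, standing in for whatever non-deterministic strategy the factory's author picks. The sketch is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <vector>

struct foo {
    int value;
    explicit foo(int v) : value(v) {}
};

// Hypothetical stand-in for a collector: raw blocks wait here until reclaim().
inline std::vector<void*>& pending_blocks() {
    static std::vector<void*> blocks;
    return blocks;
}

// Frees every pending block and returns how many were released.
inline std::size_t reclaim() {
    std::vector<void*>& blocks = pending_blocks();
    const std::size_t count = blocks.size();
    for (void* p : blocks) ::operator delete(p);
    blocks.clear();
    return count;
}

// Per-instance strategy: the destructor runs when the last shared_ptr goes
// away, but the memory itself is only released by a later reclaim().
struct deferred_deleter {
    void operator()(foo* p) const {
        p->~foo();
        pending_blocks().push_back(p);
    }
};

// Callers only see shared_ptr<foo>; the reclamation strategy is hidden here.
inline std::shared_ptr<foo> create_foo(int v) {
    void* raw = ::operator new(sizeof(foo));
    foo* obj = nullptr;
    try {
        obj = new (raw) foo(v);  // construct in the raw block
    } catch (...) {
        ::operator delete(raw);
        throw;
    }
    return std::shared_ptr<foo>(obj, deferred_deleter());
}
```

Because the deleter is stored inside each `shared_ptr` instance, another factory could return a `shared_ptr<foo>` that uses plain `delete`, and calling code would not need to know the difference.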

Raindog

6:57 a.m.

New subject: discussion of garbage collection in C++

...

Raindog wrote:

...
Mathias Gaunard wrote:

...
Simonson, Lucanus J wrote:

Why stop half way? Why not have an open source C++ runtime environment if the goal is applications programming and rapid prototyping in C++ with open source components? An open source GC library is a good building block for such a system, and of course we'd prefer to implement it in C++.

Why build a runtime environment based on GC when you can make a better one that is not?

Because the majority of programmers have proven that they obviously prefer a GC environment over an RAII one. Given that a large number of C++ programmers don't even take advantage of RAII, I would say that a c++ runtime w/ GC and RAII would be very appealing to a larger number of people.

Edward Diener wrote:

Whenever anyone writes "the majority of programmers..." one can discount the subsequent statement "the majority" of the time. Unless you are talking about something that is true.

.Net, Java, Python, Perl, Ruby, D et al automatically provide GC. 8 of the top 10 languages in use today, according to http://www.tiobe.com/content/paperinfo/tpci/index.html, provide GC. I think my statement about what the majority of programmers prefer still stands. I'm not trying to say however that GC is best/better than RAII etc, or that the languages with GC are somehow better than C++. I do however think that the "ez mode" programming that GC'd environments provided appeals to a larger number of programmers than does C++. C++ is an expert friendly language for many reasons and one would be naive to think that one's profession is filled with experts. Also, for people looking for a runtime environment for C++, they can look at Ch.

Andreas Masur

28 Apr 28 Apr

1:30 p.m.

New subject: discussion of garbage collection in C++

On Apr 26, 2009, at 2:57 AM, Raindog wrote:

...

...
.Net, Java, Python, Perl, Ruby, D et al automatically provide GC.

8 of the top 10 languages in use today, according to http://www.tiobe.com/content/paperinfo/tpci/index.html , provide GC. I think my statement about what the majority of programmers prefer still stands. I'm not trying to say however that GC is best/better than RAII etc, or that the languages with GC are somehow better than C++. I do however think that the "ez mode" programming that GC'd environments provided appeals to a larger number of programmers than does C++. C++ is an expert friendly language for many reasons and one would be naive to think that one's profession is filled with experts.

Well... while the table shows that the majority of languages currently in use are supported by a GC, I would not necessarily draw the same conclusion. I am a big believer in the 'Choose the right tool for the job at hand' idiom. And even though the availability of a GC is sometimes a criterion, in my experience it usually is not - at least not for these languages. Same may apply to the given 8 languages as well....the table does not indicate whether the languages are popular because of the GC. Furthermore, it does not outline how many programmers actually dislike the GC but have to deal with it since it comes with the language. In a sense these kinds of tables usually show the usage of languages based on the current business needs rather than based on a technical level. This is certainly only my opinion though.... Ciao, Andreas

Simonson, Lucanus J

3:59 p.m.

New subject: discussion of garbage collection in C++

...

On Apr 26, 2009, at 2:57 AM, Raindog wrote:

...
...
.Net, Java, Python, Perl, Ruby, D et al automatically provide GC.

8 of the top 10 languages in use today, according to http://www.tiobe.com/content/paperinfo/tpci/index.html , provide GC. I think my statement about what the majority of programmers prefer still stands. I'm not trying to say however that GC is best/better than RAII etc, or that the languages with GC are somehow better than C++. I do however think that the "ez mode" programming that GC'd environments provided appeals to a larger number of programmers than does C++. C++ is an expert friendly language for many reasons and one would be naive to think that one's profession is filled with experts.

Andreas Masur wrote:

...

Same may apply to the given 8 languages as well....the table does not indicate whether the languages are popular because of the GC. Furthermore, it does not outline how many programmers actually dislike the GC but have to deal with it since it comes with the language. In a sense these kinds of tables usually show the usage of languages based on the current business needs rather than based on a technical level.

This is certainly only my opinion though....

Andreas, I think you are giving the marketplace too much credit. The phenomenon clearly visible in the plot, where C++ development is replaced by Java development from mid 2004 through 2006, is due in my opinion to many university programs' bureaucratic decision to teach Java instead of C++. This leads to the situation where the majority of new programmers don't know how to use a non-GC language, as opposed to preferring GC. C++ is an expert friendly language, and the profession is not filled with experts, but that doesn't mean we should try to make it all things to all people. Experts need an expert friendly language more than non-experts need C++ to be spoon fed to them in a market for programming languages that is already crowded with toddler spoons. Regards, Luke

joaquin＠tid.es

22 Apr 22 Apr

6:36 a.m.

Gregory Peele ARA/CFD wrote:

...

I agree that the internal GC approach could potentially have the same performance benefits as a traditional GC for the workload I described, especially with some cleverness in lock-free algorithms to handle multithreading. I assume it would probably perform quite a bit better actually, since it doesn't have to solve the finalization problem. As mentioned, so would a well written pool allocator - these are just two different approaches to amortizing memory allocation and deallocation costs.

Of course, an internal GC or any pool allocator techniques I've heard of don't help with the huge developer and maintainer cost of correctly implementing pointer ownership rules for complex cyclical data structures without paying the CPU and memory cost of thread-safe cycle-safe reference counting. I spent far more time solving memory leaks, dangling pointers, and invalid memory dereferences than I did implementing, testing, and correcting actual geometry logic for my Delaunay triangulation package, and my only used resource was memory so I didn't need deterministic finalization. Increasing developer productivity when used for the right task is where a traditional GC really shines, I think.

Hi Gregory, I agree with you that the alleged benefits of GCs have to be sought in the realm of increased productivity --not that I like GCs, but that's another matter. I just wanted to point out that performance cannot possibly be presented as an advantage of garbage collection vs. RAII. Joaquín M López Muñoz Telefónica, Investigación y Desarrollo

Mathias Gaunard

21 Apr 21 Apr

4:54 p.m.

joaquin@tid.es wrote:

...

AFAICS, you can have an implementation of delete where "delete p" simply calls the destructor of the object pointed to by p and then passes p to an internal garbage collector that will reclaim the memory in due time. This way you have deterministic resource liberation *and* GC speed.

Actually, that would be faster. If you use mark & sweep, RAII gives you mark (which requires reachability analysis in normal GC scenarios) for free.
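The scheme discussed above can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the library in the vault: deferred_heap, create, destroy, collect and pending are hypothetical names, malloc/free stand in for a real pool, and over-aligned types and thread safety are ignored. The point is only that the destructor runs immediately while the memory is queued as known-dead and freed later in one batch, with no reachability analysis needed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>
#include <vector>

// Hypothetical heap that separates destruction (immediate) from
// memory reclamation (batched, done "in due time").
class deferred_heap {
public:
    deferred_heap() = default;
    deferred_heap(const deferred_heap&) = delete;
    deferred_heap& operator=(const deferred_heap&) = delete;
    ~deferred_heap() { collect(); }

    // Allocate raw storage and construct a T in it.
    template <class T, class... Args>
    T* create(Args&&... args) {
        void* raw = std::malloc(sizeof(T));
        if (!raw) throw std::bad_alloc();
        try {
            return new (raw) T(std::forward<Args>(args)...);
        } catch (...) {
            std::free(raw);
            throw;
        }
    }

    // Plays the role of "delete p": the destructor runs right away,
    // the memory is only recorded as dead.
    template <class T>
    void destroy(T* p) {
        if (!p) return;
        garbage_.push_back(p);  // queue first, so a failed push changes nothing
        p->~T();                // deterministic resource release
    }

    // Sweep: free every block already known to be dead.
    // Returns the number of blocks reclaimed.
    std::size_t collect() {
        const std::size_t n = garbage_.size();
        for (void* block : garbage_) std::free(block);
        garbage_.clear();
        return n;
    }

    // Number of destroyed objects whose memory is not yet reclaimed.
    std::size_t pending() const { return garbage_.size(); }

private:
    std::vector<void*> garbage_;
};
```

A caller would pair heap.create<T>(...) with heap.destroy(p) where it would otherwise write new and delete, and call collect() at a convenient moment, for example when the pending count crosses a threshold.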

64 comments

22 participants

participants (22)

Achilleas Margaritis
Andreas Masur
Bastek
Beman Dawes
Cory Nelson
David Abrahams
Edouard A.
Edward Diener
Emil Dotchevski
Esben Mose Hansen
Frank Mori Hess
Gregory Peele ARA/CFD
jbosboom＠uci.edu
joaquin＠tid.es
Jonas Persson
Mathias Gaunard
Patrick Mihelich
Raindog
Sid Sacek
Simonson, Lucanus J
Thorsten Ottosen
Vicente Botet Escriba