New subject: multimap alternatives. Was: Re: : Review of a safer memory management approach for C++

8 Jun 2010

      A few responses ...
...
4. Re: Review of a safer memory management approach for C++
      (Mathias Gaunard)
   5. Re: Review of a safer memory management approach for C++
      (Roland Bock)
   6. Re: Review of a safer memory management approach for C++
      (Fernando Cacciola)
...
------------------------------
Message: 4
Date: Mon, 07 Jun 2010 18:07:55 +0100
From: Mathias Gaunard <mathias.gaunard@ens-lyon.org>
To: boost@lists.boost.org
Subject: Re: [boost] Review of a safer memory management approach for
  C++
Message-ID: <huj91v$cem$1@dough.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Bartlett, Roscoe A wrote:
...
Tools like valgrind and purify are very helpful but are not nearly
sufficient as described in Section 3.2  (and other sections) in:
http://www.cs.sandia.gov/~rabartl/TeuchosMemoryManagementSAND.pdf
The limitations you give are simply that you expect valgrind to do more
than tell you about memory errors, but also to tell you about contract
violation of any library as well.
That's not something a generic tool can do.
[Bartlett, Roscoe A] 

Exactly my point.
...
Contracts are specified by
each library, and can be optionally checked in a special debug mode the
library may provide.
[Bartlett, Roscoe A] 

Exactly.  However, rather than doing this haphazardly why not have a consistent built-in approach in your own software to catch such mistakes automatically in a debug-mode build?  Enter the Teuchos MM classes and the idioms described in:

    http://www.cs.sandia.gov/~rabartl/TeuchosMemoryManagementSAND.pdf
...
All that valgrind can do (as far as I my usage goes) is tell you if you
access some unallocated memory (relatively to the default global
allocator) or if you read an uninitialized object.
[Bartlett, Roscoe A] 

Valgrind and purify are very useful but they will often only flag there is a problem until much after the original error occurred.  As an example, a few months ago I was using std::multimap for the first time in a GCC implementation.  The documentation I found for std::multimap on the web was not very detailed and did not really explain behavior in a few important cases (I have had a hard time finding decent standard C++ library documentation).   I ran the code and it behaved in strange ways, segfaulted, etc.  I turned on the checked STL implementation with -D_GLIBCXX_DEBUG but it did not complain about anything.  I ran valgrind on it is it complained there was a problem in a place in the code that made no sense at all.  I knew the only "unsafe" code that I had written (that did not exclusively use the Teuchos MM classes and idioms) was the std::multimap code.  After more experimentation I figured out that I had guess the behavior of std::multimap incorrectly.  Once I figured out what the behavior really was, the program ran fine.  Here was a case where the checked STL implementation was not catching a basic user error and valgrind was worthless.  I just wish that I would have saved the state of this code in a branch or something so that I could show this to other people.
...
------------------------------
Message: 5
Date: Mon, 07 Jun 2010 19:14:33 +0200
From: Roland Bock <rbock@eudoxos.de>
To: boost@lists.boost.org
Subject: Re: [boost] Review of a safer memory management approach for
  C++
Message-ID: <4C0D28F9.1060301@eudoxos.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Bartlett, Roscoe A wrote:
...
Come one, at least some people on this mail list must have had
similar nightmare experiences in trying to track down and diagnose hard
memory misuse errors in C++ that took days of effort to resolve (even
with the help of tools like valgrind and purify).
Hi,
yes, I had such experiences, but that's years in the past, when I did
not know shared pointers. With shared pointers only very few memory
issues ever occurred.
[Bartlett, Roscoe A] 

I have to admit, that w.r.t. single objects, after I started using smart reference-counted pointers, I experienced very few memory errors.  However, I was still having some errors and I was writing a lot of paranoid manual error checking code with all of the raw pointers that remained (and yes, if all you have is an RCP class, you will still need to use raw pointers in many cases).  After I developed Teuchos::Ptr, the raw pointers to single objects went away and I ripped out a bunch of manual error checking code (a process that I will likely never finish because of the large amount of code I have written over the years).

There are people in my domain that still today will refuse to use a smart pointer class and insist on manipulating raw memory, even in brand-new code.  The cycle of undefined behavior, segfault, etc. will continue ...
...
These days, our memory issues are of a different kind: Memory
fragmenation due to malloc's strategies in multithreading scenarios.
These are even worse, because formally, there is no problem in the
code.
Writing you own allocator and then suddenly not being able to use
valgrind anymore, that's the memory fun today.
[Bartlett, Roscoe A] 

Do people have experience with library allocators from MPI and TBB?  These are supposed to place memory more carefully but they mean that you can't use the allocator embedded in std::vector anymore.
...
------------------------------
Message: 6
Date: Mon, 07 Jun 2010 14:32:15 -0300
From: Fernando Cacciola <fernando.cacciola@gmail.com>
To: boost@lists.boost.org
Subject: Re: [boost] Review of a safer memory management approach for
  C++
Message-ID: <hujad9$ic7$1@dough.gmane.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Hi Bartlett,
Just for the record and since your statements below are about general
experiences with large scale C++ projects, let me put my own
experiences in
context: I had been architecting, designing and implementing large-
scaled
projects from the early 90's.  The largest of which accounts for 160K
of C++
code, which in fact I wrote almost entirely myslef.
Again this is just to put the comments based on my experience in
persective.
[Bartlett, Roscoe A] 

The real problem comes when you can't architect all of the code yourself with one consistent style and instead have to glue together code from lots of different sources that had very inconsistent ideas about design and managing memory.  Early experience with this type of work lead me to put in the extra_data hack in Teuchos::RCP.  It was not pretty but it worked to glue all kinds of software together effectively.
...
...
These types of experiences have
lead many C++ teams to code C++ with a (justified but unfortunate)
paranoia
about the use of memory in C++ that has all kinds of bad consequences
I think that those memory problems are funtamentally rooted at a design
issue.
[Bartlett, Roscoe A] 

The problems are rooted in the raw manipulation of memory with raw pointers and raw calls to new and delete and raw use of arrays as raw pointers.  In my opinion that this is the design problem.  Others will likely disagree.
...
In the mid 90's I used to developed best practices and utilities for
sane memory
management (and other sanity requirements), in the same spirit of the
paper you
presented (I did read it btw). I even implemented custom allocators,
based on
class-specific memory pools, and mandated that every object should obey
a strict
allocation-deallocation protocol.
Simply defining a new class on my system required the use of a macro-
based DSL,
something like "DEFINE_DERIVED_OBJECT(Foo,Base). Likewise, object
graphs had to
be very carefully spelled with my DSL, as in "INCLUDE_SUBOBJECT(Bar)",
and so on..
This forced everyone in the team, year after year, to learn a language
on top of
C++.
[Bartlett, Roscoe A] 

Note that the STL absolutely defined a new language in C++ as has been pointed out by Scott Meyers in Item 1 in "Effective C++ 3rd edition".  The question is if the new language provides enough benefit to justify having to learn it.  I believe that STL has been hugely worth it but it is only a container, algorithm, and data-structure library and does not solve the most fundamental problems with C++ coding; the usage of raw pointers which the STL alone did not eliminate the need for in many programs (others may disagree).
...
In the long end, however, I realized I was just overengineering the
problem way
too much, putting a big burden on the team and making it difficult for
newcomers.
[Bartlett, Roscoe A] 

That was my opinion also in the late 90s as I was also going down that road and I stopped around 1999 but I have since come to regret that view today for the reasons described in Section 3.2 of:

    http://www.cs.sandia.gov/~rabartl/TeuchosMemoryManagementSAND.pdf

As argued in Sections 5.8 and 6.2 of the above document, the Teuchos MM classes and the associated idioms create a fairly thin language above raw pointers that increase the self-documenting nature of the code in a way you can't do with a language like Java or Python.
...
When C++ evolved I found new, much simpler ways to solve the same
problems, and
in particular, memory problems: using smart pointers (even long before
boost::shared_ptr came alone). Once I started using smart pointers I
never
looked back, and I never again, ever, had to spend a single minute on a
memory leak.
[Bartlett, Roscoe A] 

Smart pointers solve leaks very well but the real problem are other invalid usages of memory that create undefined behavior.  In CSE we have to use lots of arrays and we can't mandate that all memory use std::vector or STL allocators to do all of the allocations.  The main holes that were filled with the Teuchos MM classes were better safer handling of arrays without mandating (not even at compile-time) how the memory is allocated or deallocated in a consistent system.
...
...
Come one, at least some people on this mail list must have had
similar
nightmare experiences in trying to track down and diagnose hard
memory misuse
errors in C++ that took days of effort to resolve (even with the help
of
tools like valgrind and purify).
Before I started using smart pointers, yes.
After that, no.. never again.
[Bartlett, Roscoe A] 

Yes but what currently exists in C++0x and boost for arrays is not sufficient to provide the guarantees.  Read the paper, look the classes, and make of you mind for yourself.  Also, what did you do in cases where you did not need machinery for persisting associations and could not afford the overhead of reference-counting classes?  Did you just use a raw pointer?  Did you use a raw C++ reference?  Are all of your objects using value semantics (deep or shallow copy) and just copied objects?  That was the hole the Teuchos::Ptr class was designed to fill and it still provides full referentially checking in a debug-mode build.
...
...
And again, tools like valgrind and purify will *never* catch semantic
misuse
of memory (i.e. allocating a big chunk of memory and then breaking it
up to
be used to construct different objects and array of objects).  The
Teuchos MM
classes will catch most semantic misuse of memory in a way that no
tool like
valgrind and purify every can (because they don't know the context of
your
program, they only know that any read/writes in a big block of memory
that
you allocated look okay).  I think this is a big deal in catching
hard-to-find defects that are not (technically speaking) memory
misuse
defects but are program defects none the less.
I totally fail to see why this design mistakes (wrong allocation
pattern) should
be detected by a framework? These are design issues and should be deal
with at
that stage. Surely any team can be trained not to make such mistakes.
[Bartlett, Roscoe A] 

Telling people to simply stop making mistakes is not a solution to every problem.  W. Edwards Deming stated that most mistakes that are made by people are due to a faulty process or faulty support tools.  If people keep making mistakes then we need to first look at the processes and tools and not just blame them.  We should be able to create the tools so that basic errors in memory usage are automatically detected, even in C++.  The Teuchos MM classes and idioms paired with a static analysis tool (if we could find and configure one) to help enforce the idioms would largely solve this problem in C++.

Otherwise, most people what currently write CSE software would be better off writing code in a language like C# but that is just not viable for many reasons.

: Review of a safer memory management approach for C++

Bartlett, Roscoe A

Simonson, Lucanus J

Thorsten Ottosen

Felipe Magno de Almeida

Belcourt, Kenneth

Felipe Magno de Almeida

Simonson, Lucanus J

tags

participants (5)