BUG: filesystem remove_all [PATCH]

Greetings, I have discovered a bug in the implementation of remove_all. remove_all works by reading through a directory and deleting entries along the way. If it comes across a subdirectory, it immediately recurses into that subdirectory. However, boost::filesystem uses readdir, which on some platforms (including OS X, apparently) uses a static data area that can be overwritten on each call to readdir. This messes up the original directory iteration, ending up with some entries not being deleted. A solution is to not recurse the directory immediately, but to make a list of them. Once you have gone through all of the entries in a directory, then you recurse into the subdirectories. A patch is below, although it could probably be improved. Cheers, Walter Landry wlandry@ucsd.edu --- /home/boo/arx/arx/src/boost/_arx/++cache/wlandry@ucsd.edu--arx/boost/1/32/,4/libs/filesystem/src/operations_posix_windows.cpp 2005-01-04 14:11:44.000000000 -0500 +++ /home/boo/arx/arx/src/boost/libs/filesystem/src/operations_posix_windows.cpp 2005-01-18 21:16:49.000000000 -0500 @@ -28,6 +28,7 @@ #include <boost/throw_exception.hpp> #include <boost/detail/workaround.hpp> #include <cstring> +#include <list> #ifdef BOOST_NO_STDC_NAMESPACE namespace std { using ::strcmp; using ::remove; using ::rename; } @@ -191,11 +192,23 @@ if ( !fs::symbolic_link_exists( ph ) // don't recurse symbolic links && fs::is_directory( ph ) ) { + std::list<fs::path> dir_list; for ( fs::directory_iterator itr( ph ); itr != end_itr; ++itr ) { - count += remove_all_aux( *itr ); + if(!fs::symbolic_link_exists( ph ) // don't recurse symbolic links + && fs::is_directory( ph ) ) + { + dir_list.push_back(*itr); + } + else + { + count += remove( *itr ); + } } + for(std::list<fs::path>::iterator i=dir_list.begin(); + i!=dir_list.end(); ++i) + count+=remove_all_aux(*i); } fs::remove( ph ); return count;

Walter Landry wrote:
Greetings,
I have discovered a bug in the implementation of remove_all. remove_all works by reading through a directory and deleting entries along the way. If it comes across a subdirectory, it immediately recurses into that subdirectory. However, boost::filesystem uses readdir, which on some platforms (including OS X, apparently) uses a static data area that can be overwritten on each call to readdir. This messes up the original directory iteration, ending up with some entries not being deleted.
A solution is to not recurse the directory immediately, but to make a list of them. Once you have gone through all of the entries in a directory, then you recurse into the subdirectories. A patch is below, although it could probably be improved.
This patch would only fix remove_all, but the general case remains broken. A filesystem iterator should not be affected by another filesystem iterator. By the way, POSIX says that readdir can only overwrite the result of readdir on the same directory stream, so OS X seems to be buggy in this regard.

"Peter Dimov" <pdimov@mmltd.net> wrote:
Walter Landry wrote:
Greetings,
I have discovered a bug in the implementation of remove_all. remove_all works by reading through a directory and deleting entries along the way. If it comes across a subdirectory, it immediately recurses into that subdirectory. However, boost::filesystem uses readdir, which on some platforms (including OS X, apparently) uses a static data area that can be overwritten on each call to readdir. This messes up the original directory iteration, ending up with some entries not being deleted.
A solution is to not recurse the directory immediately, but to make a list of them. Once you have gone through all of the entries in a directory, then you recurse into the subdirectories. A patch is below, although it could probably be improved.
This patch would only fix remove_all, but the general case remains broken. A filesystem iterator should not be affected by another filesystem iterator.
The only fix for that is readdir_r. There is a comment in the docs that readdir_r might not always be available. This is one of those cases where an autoconf macro would come in handy.
By the way, POSIX says that readdir can only overwrite the result of readdir on the same directory stream, so OS X seems to be buggy in this regard.
I agree. Cheers, Walter Landry wlandry@ucsd.edu

At 06:40 AM 1/19/2005, Peter Dimov wrote:
Walter Landry wrote:
Greetings,
I have discovered a bug in the implementation of remove_all. remove_all works by reading through a directory and deleting entries along the way. If it comes across a subdirectory, it immediately recurses into that subdirectory. However, boost::filesystem uses readdir, which on some platforms (including OS X, apparently) uses a static data area that can be overwritten on each call to readdir. This messes up the original directory iteration, ending up with some entries not being deleted.
A solution is to not recurse the directory immediately, but to make a list of them. Once you have gone through all of the entries in a directory, then you recurse into the subdirectories. A patch is below, although it could probably be improved.
This patch would only fix remove_all, but the general case remains broken. A filesystem iterator should not be affected by another filesystem iterator.
Right. I'll add a test case to try to detect that happening.
By the way, POSIX says that readdir can only overwrite the result of readdir on the same directory stream, so OS X seems to be buggy in this regard.
Strange. I wonder if most OS X developers use readdir_r. Otherwise I would think a serious non-compliance in readdir would have been fixed a long time ago. Thanks, --Beman

At 10:58 PM 1/18/2005, Walter Landry wrote:
Greetings,
I have discovered a bug in the implementation of remove_all. remove_all works by reading through a directory and deleting entries along the way. If it comes across a subdirectory, it immediately recurses into that subdirectory. However, boost::filesystem uses readdir, which on some platforms (including OS X, apparently) uses a static data area that can be overwritten on each call to readdir. This messes up the original directory iteration, ending up with some entries not being deleted.
A solution is to not recurse the directory immediately, but to make a list of them. Once you have gone through all of the entries in a directory, then you recurse into the subdirectories. A patch is below, although it could probably be improved.
Walter, Thanks for the report. As Peter pointed out, that isn't supposed to happen on a POSIX compliant system. Is it possible the bug is actually in the remove_all code or your code? I've had a "to do" item for a month or so to change readdir to readdir_r on systems which support it. My guess is that almost all modern POSIX operating systems support readdir_r, but there is a configuration macro so the code can adapt to what is available. I'm not sure my guesses about POSIX are worth anything, however, since I would have guessed every modern POSIX OS implements readdir correctly:-( Anyhow, readdir_r should solve this problem so I'll accelerate that fix. --Beman

Beman Dawes <bdawes@acm.org> wrote:
At 10:58 PM 1/18/2005, Walter Landry wrote:
Greetings,
I have discovered a bug in the implementation of remove_all. remove_all works by reading through a directory and deleting entries along the way. If it comes across a subdirectory, it immediately recurses into that subdirectory. However, boost::filesystem uses readdir, which on some platforms (including OS X, apparently) uses a static data area that can be overwritten on each call to readdir. This messes up the original directory iteration, ending up with some entries not being deleted.
A solution is to not recurse the directory immediately, but to make a list of them. Once you have gone through all of the entries in a directory, then you recurse into the subdirectories. A patch is below, although it could probably be improved.
Walter,
Thanks for the report. As Peter pointed out, that isn't supposed to happen on a POSIX compliant system. Is it possible the bug is actually in the remove_all code or your code?
I don't think so. The same code works fine on NetBSD, Cygwin, and Linux (various flavors). It only breaks on OS X. The particular case causing problems is a directory filled with about 560 subdirectories. It may be that this is a sufficiently rare case that no one has run into it. Also, as a point of interest, there is an implementation of remove_all in C at http://www.cs.utk.edu/~plank/plank/classes/cs360/360/oldtests/t1/2000/a2.htm... It takes the same approach I gave, but also notes that you should really call chmod on the directories before attempting to delete anything in them.
I've had a "to do" item for a month or so to change readdir to readdir_r on systems which support it. My guess is that almost all modern POSIX operating systems support readdir_r, but there is a configuration macro so the code can adapt to what is available. I'm not sure my guesses about POSIX are worth anything, however, since I would have guessed every modern POSIX OS implements readdir correctly:-( Anyhow, readdir_r should solve this problem so I'll accelerate that fix.
That would be great. Cheers, Walter Landry wlandry@ucsd.edu
participants (3)
-
Beman Dawes
-
Peter Dimov
-
Walter Landry