boost::filesystem::exists bug and patch

::exists is implemented using ::stat. In libc-2.3.2, and several other versions, the ::stat routine returns an error for files larger than 2 gigs. So exists() never returns true for very large files. Using ::access seems to fix the problem.

===================================================================
RCS file: /cvsroot/boost/boost/libs/filesystem/src/operations_posix_windows.cpp,v
retrieving revision 1.29
diff -c -r1.29 operations_posix_windows.cpp
*** operations_posix_windows.cpp 6 Feb 2004 01:27:06 -0000 1.29
--- operations_posix_windows.cpp 9 Feb 2004 06:19:28 -0000
***************
*** 302,309 ****
  BOOST_FILESYSTEM_DECL bool exists( const path & ph )
  {
  # ifdef BOOST_POSIX
!   struct stat path_stat;
!   return ::stat( ph.string().c_str(), &path_stat ) == 0;
  # else
    return ::GetFileAttributesA( ph.string().c_str() ) != 0xFFFFFFFF;
  # endif
--- 302,308 ----
  BOOST_FILESYSTEM_DECL bool exists( const path & ph )
  {
  # ifdef BOOST_POSIX
!   return ::access (ph.string ().c_str (), F_OK) == 0;
  # else
    return ::GetFileAttributesA( ph.string().c_str() ) != 0xFFFFFFFF;
  # endif
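For readers who want to try the difference outside Boost, here is a minimal sketch of the two approaches in the patch, plus a third variant. The helper names (exists_via_stat, exists_via_access, exists_tolerant) are hypothetical and are not part of Boost.Filesystem. The third variant assumes the failure shows up as EOVERFLOW, which is the errno POSIX specifies when a file's size does not fit in the stat structure; the thread itself does not say which errno was observed.

```cpp
#include <sys/stat.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <string>

// Hypothetical helper: mirrors the original implementation.
// On a 32-bit build without large file support this can report
// false for a file whose size does not fit in off_t.
bool exists_via_stat(const std::string& p) {
    struct stat path_stat;
    return ::stat(p.c_str(), &path_stat) == 0;
}

// Hypothetical helper: mirrors the patched implementation.
// access() never reads the size, so it is not affected by it.
bool exists_via_access(const std::string& p) {
    return ::access(p.c_str(), F_OK) == 0;
}

// Hypothetical helper: keeps stat() but treats EOVERFLOW as
// "the file is there, its size just does not fit in off_t".
bool exists_tolerant(const std::string& p) {
    struct stat path_stat;
    if (::stat(p.c_str(), &path_stat) == 0) return true;
    return errno == EOVERFLOW;
}
```

One design caveat: access() performs its permission checks with the real user ID rather than the effective one, so in a setuid program the two approaches may disagree for reasons unrelated to file size.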

At 01:35 AM 2/9/2004, Jonathan Ultis wrote:
Good grief! Are you saying that ::stat() cannot be used at all because it fails for all uses when file sizes are larger than 2 gigs?

That would be a disaster. The filesystem library uses stat in at least 6 other places. I would guess that stat() is one of the most heavily used POSIX functions dealing with files, and that software from here to Antarctica to Mars would fail regularly if stat() failed for large files. Hum... Didn't I just hear something about filesystems on Mars:-?

I don't mean to doubt your report, but it seems hard to believe that such a commonly used function could fail for large files. Is there more to the story?

--Beman

PS: Your suggestion to use access() seems quite reasonable and harmless for exists(), but I haven't looked at the other filesystem uses to see if it could be uniformly applied.

Good point. Since you asked, I did some more reading. My bug report should have included "on 32-bit Linux systems". And it's possible to get stat to work for large files by using the correct defines. There could be linking problems if you do, but it's not likely to be a big concern.

So, consider this a robustness patch. This will make exists () work on large files, even for goobers like me that don't have their compile set up to support large files.

If you were extremely interested in supporting all configurations, there are also certain defines that can be set which cause large file support to be offered through an alternate set of function names, which can be useful for maintaining binary compatibility with pre-compiled libraries. Using that define and the alternate names would be a safe way to make the filesystem library work on files with 64-bit sizes no matter what defines are used to compile. But it's probably not worth the hassle if no one has complained yet.

The Linux Gazette article at www.linuxgazette.com/issue67/tag/13.html was interesting, if you want to follow up on this. But it basically said to read the info pages on libc and look for LFS or large file support. Doing so was also interesting.

--Jonathan

Beman Dawes wrote:

At 12:01 AM 2/10/2004, Jonathan Ultis wrote:
Ah! These are the same techniques needed to correctly report file sizes for large files. It sounds as if it isn't safe to use stat() at all without applying them.

Since I take it that you have a large file to test on, could you please read my http://lists.boost.org/MailArchives/boost/msg59662.php and report the results you get? That would be helpful.

The issue isn't whether or not anyone has complained. Files that are OK today may grow to larger sizes in the future, and we don't want silent failures when the files cross size thresholds.

Seems like one action item here is to add an option to the Boost.Filesystem regression tests to generate and test against large file sizes.

Thanks! That information is essentially the same as reported here by a couple of people in the discussion about how to report the size of large files correctly, but it is good to have confirmation. It may take me a while to work out all the wrinkles, but I'll be actively working on Boost.Filesystem issues over the next several weeks.

--Beman

On Mon, 09 Feb 2004 21:18:08 -0500, Beman Dawes wrote
Why? As I recall the history here, it has only been fairly recent that many operating systems would support files larger than 2 gig. AFAIK Microsoft's FAT filesystems are all limited to 2 gig file size limits. So it might be the case that the majority of filesystems currently in use still have this limit....

Jeff

At 01:35 AM 2/9/2004, Jonathan Ultis wrote:
The best fix for this seems to be to enable 64-bit stat support. That fixes not just the original problem, but also fixes similar problems with other boost::filesystem functions.

Accordingly, the implementation in CVS has been changed to #define __USE_FILE_OFFSET64 for POSIX implementations. This should be harmless on 64-bit systems and 32-bit systems which don't support files larger than 2 gigs. But it kicks in 64-bit support on systems like Linux which do supply Large File Support (LFS).

It has been tested on Redhat Linux 8.0 with GCC 3.2 for files requiring 31, 32, and 33 bits to represent the size. The 32 and 33 bit tests failed before the fix, as expected, and succeeded after the fix. (Those tests already passed for Windows.)

As well as fixing problems with older Boost.Filesystem functions, these tests confirm that the new boost::filesystem::file_size() function works reliably for Linux and Windows.

I'd still like to get reports from AIX, Mac OS X, Solaris, etc., confirming that boost-root/libs/filesystem/example/file_size.cpp reliably reports sizes of files needing more than 31 bits to represent size.

Thanks for the report,

--Beman
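As a rough illustration of the kind of check described above, the sketch below enables large file support before including any system header and then classifies a file by the number of bits its size needs. It is not the Boost implementation: it uses _FILE_OFFSET_BITS, which as far as I know is the feature macro glibc documents for requesting a 64-bit off_t, whereas the fix in CVS names __USE_FILE_OFFSET64. The helpers bits_needed and stat_size are hypothetical names introduced only for this example.

```cpp
// Assumption: _FILE_OFFSET_BITS=64 is the glibc-documented way to ask
// for a 64-bit off_t; it must come before every system header.
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <sys/stat.h>
#include <cassert>
#include <cstdint>
#include <string>

// Fails at compile time if large file support did not take effect.
static_assert(sizeof(off_t) >= 8,
              "off_t is narrower than 64 bits; large file support is off");

// Hypothetical helper: bits needed to represent a size (0 for size 0).
int bits_needed(std::uint64_t size) {
    int bits = 0;
    while (size != 0) {
        ++bits;
        size >>= 1;
    }
    return bits;
}

// Hypothetical helper: returns true and stores the size if stat() succeeds.
bool stat_size(const std::string& path, std::uint64_t& size) {
    struct stat st;
    if (::stat(path.c_str(), &st) != 0) return false;
    size = static_cast<std::uint64_t>(st.st_size);
    return true;
}
```

With these helpers, a file of 2^31 - 1 bytes needs 31 bits, one of 2^31 bytes needs 32, and one of 2^32 bytes needs 33, which matches the three cases named in the test report.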
participants (3)
- Beman Dawes
- Jeff Garland
- Jonathan Ultis