[filesystem]: infinite-recursion with symlink

While playing around with boost::filesystem and the recursive_directory_iterator, I'm running into an infinite recursion caused by symlinks. Not following symlinks would be a solution, of course, but... So, is there a way to ascertain the path the symlink points to? And a second question: I didn't find anything about permission handling - are there any plans? In my eyes, this is something essential for a filesystem library. Thanks in advance

Nope, sadly.
You might be able to use the deprecated canonicalization/normalization functions.
Bad news, thanks anyway. The lack of permission support and the need to use deprecated functions for that problem lead me to drop boost::filesystem and use Qt instead. I wanted to avoid this dependency for that part of my code, but it fulfills my requirements.

variadic.template wrote:
Nope, sadly.
You might be able to use the deprecated canonicalization/normalization functions.
Bad news, thanks anyway. The lack of permission support and the need to use deprecated functions for that problem lead me to drop boost::filesystem and use Qt instead. I wanted to avoid this dependency for that part of my code, but it fulfills my requirements.
IIRC, Beman is developing a Filesystem V3. I'm not sure how close it is to release, or whether it will address your issue. You might try addressing him directly.
Jeff

On Sat, Nov 21, 2009 at 6:46 PM, variadic.template < variadic.template@googlemail.com> wrote:
While playing around with boost::filesystem and the recursive_directory_iterator, I'm running into an infinite recursion caused by symlinks. Not following symlinks would be a solution, of course,
That's the usual solution. Use the no_push() member to tell the recursive_directory_iterator not to recurse into a directory.
but... So, is there a way to ascertain the path where the symlink is aiming at?
Version 3, in the sandbox, has just such a feature.
And a second question: I didn't find anything about permission-handling - are there any plans? In my eyes, this is something essential for a filesystem-library.
Not at the moment. We haven't figured out how to abstract away the differences between POSIX and Windows approaches to permissions. Any ideas appreciated. --Beman
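
For readers who want something concrete, here is a minimal sketch of the "don't recurse into symlinks" approach. It is written against C++17 std::filesystem rather than the Boost.Filesystem version discussed in this thread (an assumption made so the sketch is self-contained). There, disable_recursion_pending() plays the role of no_push(), and read_symlink() returns the path a link points to. std::filesystem does not follow directory symlinks by default, so the sketch opts in with follow_directory_symlink to mirror the situation described above. The function names walk_without_following_symlinks and symlink_target are made up for illustration.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: list everything below `root`, but never descend
// into an entry that is itself a symlink, so a link cycle cannot loop forever.
std::vector<fs::path> walk_without_following_symlinks(const fs::path& root) {
    assert(fs::is_directory(root));  // precondition, checked in debug builds only

    std::vector<fs::path> seen;
    // Opt in to following directory symlinks so the explicit check below matters.
    fs::recursive_directory_iterator it(
        root, fs::directory_options::follow_directory_symlink);
    const fs::recursive_directory_iterator end;

    for (; it != end; ++it) {
        seen.push_back(it->path());
        // symlink_status() inspects the link itself, not its target.
        if (fs::is_symlink(it->symlink_status())) {
            it.disable_recursion_pending();  // counterpart of no_push()
        }
    }
    return seen;
}

// Hypothetical helper: return the path a symlink points to,
// or an empty path if `p` is not a symlink or cannot be read.
fs::path symlink_target(const fs::path& p) {
    std::error_code ec;
    if (!fs::is_symlink(fs::symlink_status(p, ec))) {
        return {};
    }
    fs::path target = fs::read_symlink(p, ec);
    return ec ? fs::path{} : target;
}
```

The link itself is still reported in the result; only the descent into it is suppressed.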

On Tue, Nov 24, 2009 at 7:35 AM, Beman Dawes <bdawes@acm.org> wrote:
And a second question: I didn't find anything about permission-handling - are there any plans? In my eyes, this is something essential for a filesystem-library.
Not at the moment. We haven't figured out how to abstract away the differences between POSIX and Windows approaches to permissions. Any ideas appreciated.
I'm not sure there really is a way, because the two methods are not functionally equivalent, and in fact are quite different, i.e. Windows ACLs can represent permissions that are not semantically representable in the Unix model. In the filesystem library I've developed in-house, my plan for implementing permissions is to just make a typedef called something like filesystem::permissions that resolves to opaque structures on both platforms, and then have APIs like filesystem::windows::allow_permission(), filesystem::windows::set_owner(), filesystem::posix::user_permissions(read | write | group), etc.

Is this Filesystem V3 going to support any of the following (and is there an estimated release date, even if it's highly speculative)?

a) timestamp operations
b) cross-platform create/open of files
c) Windows junctions and symlinks
d) Unix block/char devices, sockets, and pipes

Create/open is the biggest gaping hole in the current FS library in my opinion. You often just need a handle and want to customize the way in which it's opened. With my code you can do something like:

    filesystem::handle handle;      /* opaque structure, only understood by the filesystem api */
    filesystem::object_info info;   /* boost variant, internal type depends on type of filesystem object */

    filesystem::create_file(
        path,
        link_open_target,   /* follow symlinks */
        only_dir,           /* fail unless this is a directory handle */
        flags::async | flags::direct | flags::bypass_security,   /* open for async direct i/o and disable any kernel security checking */
        &handle,            /* have the function return the handle (this can be null if not interested) */
        &info               /* have the function return object info (this can be null if not interested) */
    );

handle then is an opaque object that can be used by other filesystem APIs like timestamp operations etc., and info is a boost::variant whose type depends on whether it's a directory, junction, symlink, pipe, etc.
I've solved all of the above problems in a cross-platform way in my own in-house API, but it might be difficult to integrate any of what I've done into an interface consistent with the current filesystem library. If you want any of this code, though, let me know.
Zach

On Tue, Nov 24, 2009 at 10:16 AM, Zachary Turner <divisortheory@gmail.com> wrote:
On Tue, Nov 24, 2009 at 7:35 AM, Beman Dawes <bdawes@acm.org> wrote:
And a second question: I didn't find anything about permission-handling - are there any plans? In my eyes, this is something essential for a filesystem-library.
Not at the moment. We haven't figured out how to abstract away the differences between POSIX and Windows approaches to permissions. Any ideas appreciated.
I'm not sure there really is a way, because the two methods are not functionally equivalent, and in fact are quite different, i.e. Windows ACLs can represent permissions that are not semantically representable in the Unix model.
Cygwin attempts to map the traditional POSIX permissions functionality onto Windows. With Cygwin 1.7, now in beta, they are also mapping ACLs onto Windows. While that approach may not supply all the features of Windows, it does seem to provide enough functionality to be useful.
In the filesystem library I've developed in-house, my plan for implementing permissions is to just make a typedef called something like filesystem::permissions that resolves to opaque structures on both platforms, and then have APIs like filesystem::windows::allow_permission(), filesystem::windows::set_owner(), filesystem::posix::user_permissions(read | write | group), etc.
Boost.Filesystem has always resisted doing that. There is still other functionality that has been requested that I'd like to work on first.
Is this Filesystem V3 going to support any of the following (and is there an estimated release date, even if it's highly speculative)?
The V3 library code is about ready for a beta. I'm working on docs now. I'm semi-aiming for a January 1st beta release.
a) timestamp operations
Which specific timestamp operations do you need? Last write time is already supported.
b) cross-platform create/open of files
Nothing planned, beyond the current <fstream> support.
c) windows junctions and symlinks
Supported in V3.
d) unix block/char devices, sockets, and pipes
Nothing planned in Boost.Filesystem. Several other Boost libraries already have at least some support for these.
Create/open is the biggest gaping hole in the current FS library in my opinion. You often just need a handle and want to customize the way in which it's opened. With my code you can do something like:

    filesystem::handle handle;      /* opaque structure, only understood by the filesystem api */
    filesystem::object_info info;   /* boost variant, internal type depends on type of filesystem object */

    filesystem::create_file(
        path,
        link_open_target,   /* follow symlinks */
        only_dir,           /* fail unless this is a directory handle */
        flags::async | flags::direct | flags::bypass_security,   /* open for async direct i/o and disable any kernel security checking */
        &handle,            /* have the function return the handle (this can be null if not interested) */
        &info               /* have the function return object info (this can be null if not interested) */
    );

handle then is an opaque object that can be used by other filesystem APIs like timestamp operations etc., and info is a boost::variant whose type depends on whether it's a directory, junction, symlink, pipe, etc.
Nothing like that planned at the moment.
I've solved all of the above problems in a cross-platform way in my own in-house API, but it might be difficult to integrate any of what I've done into an interface consistent with the current filesystem library. If you want any of this code, though, let me know.
What I'd be most interested in is motivation and use cases. I need to better understand the need before I start thinking about code. Thanks, --Beman

On Tue, Nov 24, 2009 at 4:03 PM, Beman Dawes <bdawes@acm.org> wrote:
On Tue, Nov 24, 2009 at 10:16 AM, Zachary Turner <divisortheory@gmail.com> wrote:
a) timestamp operations
Which specific timestamp operations do you need? Last write time is already supported.
All timestamps which are gettable / settable via system calls. On Windows this means create/access/modify, and on POSIX this means create/modify/change. One of the POSIX ones, I believe, is actually not settable.
b) cross-platform create/open of files
Nothing planned, beyond the current <fstream> support.
Unfortunate, as there is really quite a bit in common between the two operating systems' create/open methods that could be exploited. For example (here, <-> means either equivalent or more or less equivalent):

    O_DIRECT            <->  FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH
    O_NONBLOCK          <->  FILE_FLAG_OVERLAPPED
    O_CREAT | O_EXCL    <->  CREATE_NEW
    O_CREAT | O_TRUNC   <->  CREATE_ALWAYS

You can abstract these into an enumeration such as create_direct, create_async, etc., but also define all the platform-specific ones as well and allow them to be combined with the generic ones.

Other areas of Boost require handles to operate on. For example, boost::asio supports asynchronous operations but requires a handle that's been opened with the appropriate flags (FILE_FLAG_OVERLAPPED, for example). It actually doesn't support async I/O for POSIX filesystem handles, but I have my own extensions to boost::asio that do allow this and map to the POSIX aio_* family of APIs, and they require a handle that's been opened with O_NONBLOCK.

boost::iostreams already supports a cross-platform file descriptor / handle, but there is currently no cross-platform way to actually create such a handle. So I have the feeling that almost anyone using boost::iostreams::file_descriptor is using ifdefs all over their code to create the handles. Correct me if this is wrong.
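
The correspondences above could be folded into a small generic flag set along these lines. This is only a sketch of the POSIX half: the open_flag enumerators and to_posix_flags are hypothetical names, not part of Boost.Filesystem or of any library mentioned in this thread, and O_DIRECT is guarded because it is not available on every POSIX system. A Windows counterpart would translate the same enumerators into the matching CreateFile arguments.

```cpp
#include <cassert>
#include <fcntl.h>   // POSIX: O_CREAT, O_EXCL, O_TRUNC, O_NONBLOCK (and O_DIRECT where available)

// Hypothetical platform-neutral flags; the names are illustrative only.
enum open_flag : unsigned {
    create_new    = 1u << 0,  // create the file, fail if it already exists
    create_always = 1u << 1,  // create the file, truncating any existing one
    open_async    = 1u << 2,  // non-blocking / overlapped style I/O
    open_direct   = 1u << 3   // bypass the OS cache where supported
};

// Translate the generic flags into the flags argument for POSIX open().
int to_posix_flags(unsigned generic) {
    // The two creation modes are mutually exclusive.
    assert(!((generic & create_new) && (generic & create_always)));

    int flags = 0;
    if (generic & create_new)    flags |= O_CREAT | O_EXCL;
    if (generic & create_always) flags |= O_CREAT | O_TRUNC;
    if (generic & open_async)    flags |= O_NONBLOCK;
#ifdef O_DIRECT
    if (generic & open_direct)   flags |= O_DIRECT;
#endif
    return flags;
}
```

The access mode (O_RDONLY, O_WRONLY, O_RDWR) is left to the caller and would be OR-ed into the result.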
c) windows junctions and symlinks
Supported in V3.
What types of operations are supported for junctions and symlinks? Can I query the target to see what it points to, and is there an API that allows delete to selectively delete the target or the original item? Also, I forgot to mention hard links. If hard links are supported, can I query the link count or get the inode number? (Contrary to popular belief, all versions since Windows 2000 support hard links and the ability to get an inode # for a file.)
d) unix block/char devices, sockets, and pipes
Nothing planned in Boost.Filesystem. Several other Boost libraries already have at least some support for these.
If Boost.Filesystem supported them, I could create them using an interface consistent with how I create other types of filesystem objects, and also be able to use any timestamp functionality provided by Boost.Filesystem to query them.
Create/open is the biggest gaping hole in the current FS library in my opinion. You often just need a handle and want to customize the way in which it's opened. With my code you can do something like:

    filesystem::handle handle;      /* opaque structure, only understood by the filesystem api */
    filesystem::object_info info;   /* boost variant, internal type depends on type of filesystem object */

    filesystem::create_file(
        path,
        link_open_target,   /* follow symlinks */
        only_dir,           /* fail unless this is a directory handle */
        flags::async | flags::direct | flags::bypass_security,   /* open for async direct i/o and disable any kernel security checking */
        &handle,            /* have the function return the handle (this can be null if not interested) */
        &info               /* have the function return object info (this can be null if not interested) */
    );

handle then is an opaque object that can be used by other filesystem APIs like timestamp operations etc., and info is a boost::variant whose type depends on whether it's a directory, junction, symlink, pipe, etc.
Nothing like that planned at the moment.
I've solved all of the above problems in a cross-platform way in my own in-house API, but it might be difficult to integrate any of what I've done into an interface consistent with the current filesystem library. If you want any of this code, though, let me know.
What I'd be most interested in is motivation and use cases. I need to better understand the need before I start thinking about code.
Well, I work on high performance backup software for Linux/Windows. For backup I need raw access to the disk, which means I need fine control over the handle that I'm performing I/O on. In particular, it needs to be asynchronous and support unbuffered I/O, but there are various other cases where I use strange combinations of flags (for example FILE_FLAG_SEQUENTIAL_SCAN or FILE_FLAG_DELETE_ON_CLOSE on Windows). Some files, however, should not be backed up (this depends on some custom rules specified by the user), and for these I need to be able to recursively delete them. But maybe sometimes I want to follow links and sometimes I don't; again, it depends on some user parameters.

If performance were not such a high consideration, this would not be a problem. But since it is, I need to do everything possible to minimize the number of API calls and opens/closes on individual files. Consider for example a system with millions of small files (say 0 bytes, just for the sake of argument). In this case just opening the file is 100% of the work that needs to be done on this file, so I should try to open it as few times as possible. However, first I have to know that it's even a normal file and not some directory that I need to recurse into. So I check whether it's a file; if it is, I can open it and start reading from it. But issuing two calls via a path is going to be much slower than first getting a handle to the object and then querying the handle for the required information. Then, without even opening it again, I can use the same handle to actually read data from the file, saving costly operations. There are many examples of optimizations such as this, but ultimately it boils down to the fact that operations on handles are much faster than operations on paths, and handles can also be used to actually perform I/O.
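
As a rough POSIX-only illustration of the open-once pattern described above: open the path a single time, ask the descriptor what kind of object it is with fstat(), and read through the same descriptor. read_if_regular_file is a hypothetical helper name, and error handling is kept minimal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <fcntl.h>     // open
#include <sys/stat.h>  // fstat, S_ISREG
#include <unistd.h>    // read, close

// Hypothetical helper: open `path` once, check the object type through the
// descriptor, and read the contents through that same descriptor.
// Returns true and fills `out` only if `path` is a regular file.
bool read_if_regular_file(const std::string& path, std::string& out) {
    assert(!path.empty());  // precondition, checked in debug builds only

    const int fd = ::open(path.c_str(), O_RDONLY);
    if (fd < 0) {
        return false;
    }

    // Query the already-open descriptor instead of calling stat() on the path.
    struct stat st;
    if (::fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        ::close(fd);
        return false;
    }

    out.clear();
    char buf[4096];
    ssize_t n;
    while ((n = ::read(fd, buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<std::size_t>(n));
    }
    ::close(fd);
    return n == 0;  // 0 means end of file, negative means a read error
}
```

A path-based version would call stat() on the path and then open() it, resolving the path twice for every file.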
As for restore, I need to be able to set every possible aspect of a file that exists, including all timestamps and permissions, and I need to be able to restore any type of file, whether it be a socket, pipe, or Windows junction. Again, I need fine-grained control over the handle. For example, on Windows I need FILE_FLAG_BACKUP_SEMANTICS to disable ACL security checking. 90% of this can be abstracted into things that are common between each platform. For the parts that can't, I really like the model that Boost.Asio has employed, where it provides a windows and posix namespace and provides all the extra details there.
Zach
participants (5): Beman Dawes, Jeff Flinn, Scott McMurray, variadic.template, Zachary Turner