
"Chris Frey" <cdfrey@netdirect.ca> wrote in message news:20050602055255.GA32207@netdirect.ca...
Hi,
I recently had a need to do directory lookups in C++ and thought I'd take a look at boost::filesystem. I ran the sample_ls.cpp from the documentation, and it worked great.
The problem is that directory_iterator appears to return a pointer to a path object. This path object is then passed to functions such as is_directory() to find the type.
On systems that support type information in the directory entry itself, this structure limits what data can be returned in an iteration.
I would suggest something like (roughly):
    class path;
    class dirent;

    class directory_iterator {
        // ...
        const dirent * operator-> () const;
        // ...
    };

    class dirent {
        // some easy method to convert to path
        operator path ();

        // or perhaps a safer method would just be
        // to duplicate the path functions
        const std::string & leaf() const;
        // etc.

        // and then possible optimized versions
        bool is_directory() const;

        // ...
    };
The members of dirent would mirror the available functions that use the path object. If no optimization is possible, it just calls what the user would have had to call anyway. But if it is possible to optimize certain items (such as a struct dirent containing d_type), this would be used, possibly saving a call to stat().
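To make the idea concrete, here is a minimal sketch of such an entry type on a POSIX-style system. It is not Boost code and not a proposed implementation: the names dir_entry, kind, and list_directory are hypothetical, the path is a plain std::string rather than boost::filesystem::path, and it assumes a platform whose struct dirent carries d_type (tested here through the _DIRENT_HAVE_D_TYPE macro that glibc defines). Where no type hint is available, is_directory() falls back to the stat() call the user would have made anyway.

```cpp
#include <cassert>
#include <dirent.h>     // opendir, readdir, closedir, struct dirent
#include <sys/stat.h>   // stat, S_ISDIR
#include <string>
#include <utility>
#include <vector>

// Hypothetical entry type; named dir_entry to avoid clashing with struct dirent.
class dir_entry {
public:
    enum class kind { unknown, directory, other };

    dir_entry(std::string parent, std::string leaf, kind hint)
        : parent_(std::move(parent)), leaf_(std::move(leaf)), hint_(hint) {
        assert(!leaf_.empty());  // a directory entry always has a name
    }

    const std::string& leaf() const { return leaf_; }
    std::string path() const { return parent_ + "/" + leaf_; }

    // Uses the type hint captured during iteration when there is one;
    // otherwise performs the stat() call the caller would have needed.
    bool is_directory() const {
        if (hint_ != kind::unknown) return hint_ == kind::directory;
        struct stat st;
        return ::stat(path().c_str(), &st) == 0 && S_ISDIR(st.st_mode);
    }

private:
    std::string parent_;
    std::string leaf_;
    kind hint_;
};

// Reads one directory and records a type hint for each entry when available.
// Returns an empty vector if the directory cannot be opened.
std::vector<dir_entry> list_directory(const std::string& dir) {
    std::vector<dir_entry> out;
    DIR* d = ::opendir(dir.c_str());
    if (!d) return out;
    while (const struct dirent* e = ::readdir(d)) {
        std::string name = e->d_name;
        if (name == "." || name == "..") continue;
        dir_entry::kind hint = dir_entry::kind::unknown;
#ifdef _DIRENT_HAVE_D_TYPE
        // Symlinks stay "unknown" so that stat() can follow them later.
        if (e->d_type == DT_DIR)
            hint = dir_entry::kind::directory;
        else if (e->d_type != DT_UNKNOWN && e->d_type != DT_LNK)
            hint = dir_entry::kind::other;
#endif
        out.emplace_back(dir, name, hint);
    }
    ::closedir(d);
    return out;
}
```

A caller would loop over list_directory(".") and call is_directory() on each entry. On file systems that fill in d_type, no stat() call is made for regular files and directories. Note that a stored hint can go stale if the entry changes after the directory was read.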
Normally optimization should be left as implementation details... but in this case, I believe the class design limits what optimization is allowed. I would be pleased if I'm wrong on this.
Thanks for reading this far.
The question has come up in the past, although I don't recall anyone proposing exactly the solution you suggest. So the comments that follow are based on considerable thought, although that doesn't always mean much. Here is the analysis:

* That is a lot of additional interface complexity to support an optimization that applies to Windows but not POSIX. Some of the other schemes (which involved additional overloads to specific operations functions) had less visible impact on the interface.

* There have been no timings to indicate that the inefficiency of the current interface actually impacts production applications.

* AFAICS, there is nothing in the current interface which prevents caching of directory entries. Caching would probably aid more use cases than the proposed changes to interfaces. But both caching and user dirent storage introduce serious additional race conditions. Not a good thing. In fact, a showstopper unless cache management is introduced, further complicating the interface.

* There is another outstanding issue (lack of directory iterator filtering and/or globbing) that does in fact impact both timing and ease of use of real-world applications, and so is the highest priority for future work.

Those are enough concerns to make the current interface look pretty good, IMO.

Thanks for your interest,

--Beman