[filesystem] #6521 Directory listing using C++11 range-based for loops

Ticket #6521 requests adding:
    class directory {
        path p_;
    public:
        inline directory(path p) : p_(p) {}
        directory_iterator begin() { return directory_iterator(p_); }
        directory_iterator end() { return directory_iterator(); }
    };
so that a range-based for loop can be used:
    for (auto itr : directory(".")) { cout << itr.path() << endl; }
The above works as expected on GCC 4.6 and VC++ 11 beta.
Is that the best way for filesystem directory iteration to support range-based for? Should a class directory_tree also be provided for recursive iteration?
--Beman
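
For anyone who wants to try the ticket's idea without a Boost build, here is a minimal sketch of the same wrapper. It is written against C++17 std::filesystem rather than Boost.Filesystem, which is an assumption of this sketch and not what the ticket targets. The helper count_entries is hypothetical and exists only to exercise the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <utility>

namespace fs = std::filesystem;

// Same shape as the class in ticket #6521, but built on std::filesystem
// (an assumption of this sketch; the ticket is about Boost.Filesystem).
class directory {
    fs::path p_;
public:
    explicit directory(fs::path p) : p_(std::move(p)) {}
    // begin() and end() are all a range-based for loop needs.
    fs::directory_iterator begin() const { return fs::directory_iterator(p_); }
    fs::directory_iterator end() const { return fs::directory_iterator(); }
};

// Hypothetical helper: count the entries directly inside dir.
inline std::size_t count_entries(const fs::path& dir) {
    std::size_t n = 0;
    for (const auto& entry : directory(dir)) {
        (void)entry;  // entry is a fs::directory_entry
        ++n;
    }
    return n;
}
```

The temporary directory(dir) stays alive for the whole loop because range-based for binds the range expression to a reference first.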

On Thu, Apr 19, 2012 at 9:57 AM, Beman Dawes <bdawes@acm.org> wrote:
Ticket #6521 requests adding:
    class directory {
        path p_;
    public:
        inline directory(path p) : p_(p) {}
        directory_iterator begin() { return directory_iterator(p_); }
        directory_iterator end() { return directory_iterator(); }
    };
so that a range-based for loop can be used:
for (auto itr : directory(".")) { cout << itr.path() << endl; }
The above works as expected on GCC 4.6 and VC++ 11 beta.
Is that the best way for filesystem directory iteration to support range-based for?
Ticket #5896, Range directory iterators, suggests basing a solution on boost::iterator_range<bfs::directory_iterator>, and that's an obvious alternative to explore. --Beman

On 19-04-2012 17:29, Beman Dawes wrote:
Is that the best way for filesystem directory iteration to support range-based for?
Ticket #5896, Range directory iterators, suggests basing a solution on boost::iterator_range<bfs::directory_iterator>, and that's an obvious alternative to explore.
If we follow http://www.boost.org/doc/libs/1_49_0/libs/range/doc/html/range/reference/ran... then we could add
    boost::iterator_range<...> boost::directory_range( const boost::path& )
    boost::iterator_range<...> boost::recursive_directory_range( const boost::path& )
kind regards
-Thorsten
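
As a rough illustration of the two free functions proposed above, the sketch below returns a begin/end pair that a range-based for loop can consume. It assumes C++17 std::filesystem in place of Boost.Filesystem, and simple_range is a hand-written stand-in for boost::iterator_range, not the real class. names_in is a hypothetical usage helper.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Minimal stand-in for boost::iterator_range: just a begin/end pair.
template <class Iterator>
class simple_range {
    Iterator first_, last_;
public:
    simple_range(Iterator first, Iterator last)
        : first_(std::move(first)), last_(std::move(last)) {}
    Iterator begin() const { return first_; }
    Iterator end() const { return last_; }
};

// Entries directly inside p.
inline simple_range<fs::directory_iterator> directory_range(const fs::path& p) {
    return {fs::directory_iterator(p), fs::directory_iterator()};
}

// Entries inside p and all of its subdirectories.
inline simple_range<fs::recursive_directory_iterator>
recursive_directory_range(const fs::path& p) {
    return {fs::recursive_directory_iterator(p),
            fs::recursive_directory_iterator()};
}

// Hypothetical usage: collect the names found directly under p.
inline std::vector<std::string> names_in(const fs::path& p) {
    std::vector<std::string> names;
    for (const auto& entry : directory_range(p))
        names.push_back(entry.path().filename().string());
    return names;
}
```

Directory iterators are single-pass input iterators, so a range built this way can be walked only once.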

On 20-04-2012 10:12, Thorsten Ottosen wrote:
On 19-04-2012 17:29, Beman Dawes wrote:
then we could add
    boost::iterator_range<...> boost::directory_range( const boost::path& )
    boost::iterator_range<...> boost::recursive_directory_range( const boost::path& )
Then, if we wanted to make a really user-friendly interface, we could add
    boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::path& extension )
Support for multiple extensions or regular expressions would be cool, e.g.
    boost::directory_range( "some_path/", "(.txt|.doc|.jpg)" )
and support for the reverse situation
    boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::regex& toExclude )
Especially for recursive iteration, it is useful to skip entire directories.
Remark: such support is probably most naturally added to the underlying iterator classes by storing some boost::optional<> variables.
-Thorsten
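
To make the include/exclude idea concrete, here is a small sketch in which both patterns are optional, echoing the boost::optional<> remark. It assumes C++17 std::filesystem, std::regex and std::optional instead of their Boost counterparts, and it filters inside a plain function rather than inside the iterator. filtered_listing is a hypothetical name, not a Boost API.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <regex>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: list the entries of dir whose file name matches
// `include` (when given) and does not match `exclude` (when given).
inline std::vector<fs::path> filtered_listing(
    const fs::path& dir,
    const std::optional<std::regex>& include = std::nullopt,
    const std::optional<std::regex>& exclude = std::nullopt) {
    std::vector<fs::path> result;
    for (const auto& entry : fs::directory_iterator(dir)) {
        const std::string name = entry.path().filename().string();
        // regex_match requires the whole name to match the pattern.
        if (include && !std::regex_match(name, *include)) continue;
        if (exclude && std::regex_match(name, *exclude)) continue;
        result.push_back(entry.path());
    }
    return result;
}
```

Because std::regex_match tests the whole name, an extension pattern in this sketch would be written as ".*\.(txt|doc|jpg)" rather than "(.txt|.doc|.jpg)". Supplying both arguments gives the include-and-exclude combination in one call.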

On Fri, Apr 20, 2012 at 11:24 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
On 20-04-2012 10:12, Thorsten Ottosen wrote:
On 19-04-2012 17:29, Beman Dawes wrote:
then we could add
    boost::iterator_range<...> boost::directory_range( const boost::path& )
    boost::iterator_range<...> boost::recursive_directory_range( const boost::path& )
What's the advantage over the other suggestion?
Then, if we wanted to make a really user-friendly interface, we could add
boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::path& extension )
support for multiple extensions or regular expressions would be cool, e.g.
boost::directory_range( "some_path/", "(.txt|.doc|.jpg)" )
and support for the reverse situation
boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::regex& toExclude )
Especially for recursive iteration, it is useful to skip entire directories.
That doesn't appear to scale. What if I want to have both include and exclude params? Olaf

On Fri, Apr 20, 2012 at 6:45 AM, Olaf van der Spek <ml@vdspek.org> wrote:
On Fri, Apr 20, 2012 at 11:24 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
... boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::regex& toExclude )
Especially for recursive iteration, it is useful to skip entire directories.
That doesn't appear to scale. What if I want to have both include and exclude params?
IIUC, Thorsten's subsequent suggestion:
Remark: such support is probably most naturally added to the underlying iterator classes by storing some boost::optional<> variables.
was a way to address that. Regardless of how it is done, it would be highly desirable to be able to compose an arbitrary set of filters. Boost.Range already has the concept of a filtered range, so that's at least a model if not something that can be used directly. Thanks, --Beman

On Fri, Apr 20, 2012 at 5:24 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
On 20-04-2012 10:12, Thorsten Ottosen wrote:
On 19-04-2012 17:29, Beman Dawes wrote:
then we could add
    boost::iterator_range<...> boost::directory_range( const boost::path& )
    boost::iterator_range<...> boost::recursive_directory_range( const boost::path& )
Then, if we wanted to make a really user-friendly interface, we could add
boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::path& extension )
support for multiple extensions or regular expressions would be cool, e.g.
boost::directory_range( "some_path/", "(.txt|.doc|.jpg)" )
Cool! But:
and support for the reverse situation
boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::regex& toExclude )
is even better:-)
Especially for recursive iteration, it is useful to skip entire directories.
Mind-blowing! Your suggestion is getting very close to a solution for the general directory search problem I've wrestled with for years. What we would really like is to be able to apply a series of filters, some of which apply to directories, some to files. Examples are your include|exclude regex filters, applicable to directories|files|both. Another filter might be a file filter <=|>= a given size. Dates and permissions filters would also be useful. Perhaps user supplied filters. And how about filters applied to the contents of files?
Remark: such support is probably most naturally added to the underlying iterator classes by storing some boost::optional<> variables.
I'll need to think about that - it will take me a while to digest this. Thanks, --Beman

on Fri Apr 20 2012, Beman Dawes <bdawes-AT-acm.org> wrote:
On Fri, Apr 20, 2012 at 5:24 AM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
On 20-04-2012 10:12, Thorsten Ottosen wrote:
On 19-04-2012 17:29, Beman Dawes wrote:
then we could add
    boost::iterator_range<...> boost::directory_range( const boost::path& )
    boost::iterator_range<...> boost::recursive_directory_range( const boost::path& )
Then, if we wanted to make a really user-friendly interface, we could add
boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::path& extension )
support for multiple extensions or regular expressions would be cool, e.g.
boost::directory_range( "some_path/", "(.txt|.doc|.jpg)" )
Cool!
But:
and support for the reverse situation
boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::regex& toExclude )
is even better:-)
So, add Boost.Parameter and you get
    directory_range('some_path', _recursive=true, _include=string_or_regex, _exclude=string_or_regex)
    directory_range('some_path', _exclude=string_or_regex)
    directory_range('some_path', _recursive=false)
Especially for recursive iteration, it is useful to skip entire directories.
Mind-blowing! Your suggestion is getting very close to a solution for the general directory search problem I've wrestled with for years.
What we would really like is to be able to apply a series of filters, some of which apply to directories, some to files. Examples are your include|exclude regex filters, applicable to directories|files|both. Another filter might be a file filter <=|>= a given size. Dates and permissions filters would also be useful. Perhaps user supplied filters. And how about filters applied to the contents of files?
And then there's
    directory_range('some_path') | filtered(predicate1) | filtered(predicate2) ...
--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
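
The pipe syntax can be tried out with a very small hand-rolled adaptor. The sketch below assumes C++17 std::filesystem, and its filtered type is a simplified stand-in written for this example, not Boost.Range's boost::adaptors::filtered. It also filters eagerly over a vector, whereas a real range adaptor would be lazy.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

using entry_list = std::vector<fs::directory_entry>;
using entry_predicate = std::function<bool(const fs::directory_entry&)>;

// Hypothetical stand-in for a range adaptor: it just holds one predicate.
struct filtered {
    entry_predicate keep;
    explicit filtered(entry_predicate p) : keep(std::move(p)) {}
};

// Eagerly collect the entries of a directory (a real range would be lazy).
inline entry_list directory_range(const fs::path& p) {
    return entry_list(fs::directory_iterator(p), fs::directory_iterator());
}

// entries | filtered(pred) keeps only the entries for which pred is true.
inline entry_list operator|(entry_list entries, const filtered& f) {
    entry_list out;
    for (auto& e : entries)
        if (f.keep(e)) out.push_back(std::move(e));
    return out;
}
```

Chaining works because operator| is left-associative: each stage receives the already-filtered list from the stage before it.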

On Fri, Apr 20, 2012 at 9:27 AM, Dave Abrahams <dave@boostpro.com> wrote:
... So, add Boost.Parameter and you get
    directory_range('some_path', _recursive=true, _include=string_or_regex, _exclude=string_or_regex)
    directory_range('some_path', _exclude=string_or_regex)
    directory_range('some_path', _recursive=false)
That's nice, but I think we can do better...
And then there's
directory_range('some_path') | filtered(predicate1) | filtered(predicate2) ...
Yes, and that's the direction I'm planning to explore. I'm just now setting up to try out such suggestions. Thanks, --Beman

on Fri Apr 20 2012, Beman Dawes <bdawes-AT-acm.org> wrote:
On Fri, Apr 20, 2012 at 9:27 AM, Dave Abrahams <dave@boostpro.com> wrote:
... So, add Boost.Parameter and you get
    directory_range('some_path', _recursive=true, _include=string_or_regex, _exclude=string_or_regex)
    directory_range('some_path', _exclude=string_or_regex)
    directory_range('some_path', _recursive=false)
That's nice, but I think we can do better...
And then there's
directory_range('some_path') | filtered(predicate1) | filtered(predicate2) ...
Yes, and that's the direction I'm planning to explore.
I'm just now setting up to try out such suggestions.
You may find that post-hoc filtering with "|" is not going to allow you to avoid traversing into subdirectories that will ultimately be filtered out. On the other hand, maybe it's possible to do something with expression templates that saves you from that problem.
--
Dave Abrahams
BoostPro Computing
http://www.boostpro.com
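
The traversal cost mentioned here disappears only if the decision to skip a directory is made during the walk itself. As a sketch of that alternative, assuming C++17 std::filesystem rather than Boost.Filesystem, the loop below calls recursive_directory_iterator::disable_recursion_pending() so an excluded directory is never entered. walk_pruned and its skip_dir parameter are hypothetical names.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: walk root recursively, but never descend into (or
// report) a directory for which skip_dir returns true.
inline std::vector<fs::path> walk_pruned(
    const fs::path& root,
    const std::function<bool(const fs::directory_entry&)>& skip_dir) {
    std::vector<fs::path> visited;
    for (fs::recursive_directory_iterator it(root), end; it != end; ++it) {
        if (it->is_directory() && skip_dir(*it)) {
            // Prune: the next increment moves past this directory
            // instead of stepping into it.
            it.disable_recursion_pending();
            continue;
        }
        visited.push_back(it->path());
    }
    return visited;
}
```

A filter applied after the walk would produce the same list only if it also rejected everything underneath the excluded directory, and it would still pay for reading those subdirectories.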

On 20-04-2012 17:23, Dave Abrahams wrote:
on Fri Apr 20 2012, Beman Dawes <bdawes-AT-acm.org> wrote:
And then there's
directory_range('some_path') | filtered(predicate1) | filtered(predicate2) ...
Yes, and that's the direction I'm planning to explore.
I'm just now setting up to try out such suggestions.
You may find that post-hoc filtering with "|" is not going to allow you to avoid traversing into subdirectories that will ultimately be filtered out.
This would be problematic IMO.
On the other hand, maybe it's possible to do something with expression templates that saves you from that problem.
Seems extremely complicated. -Thorsten

On 20-04-2012 15:27, Dave Abrahams wrote:
What we would really like is to be able to apply a series of filters, some of which apply to directories, some to files. Examples are your include|exclude regex filters, applicable to directories|files|both. Another filter might be a file filter <=|>= a given size. Dates and permissions filters would also be useful. Perhaps user supplied filters. And how about filters applied to the contents of files?
And then there's
directory_range('some_path') | filtered(predicate1) | filtered(predicate2) ...
[Thorsten:] My original reason for embedding the regexes directly in the iterators was that it could be made to run somewhat more efficiently, because the iterators could peek at the low-level API and perhaps avoid string conversions. Filters applied afterwards do seem to scale much better, but I wonder how much slower it would get. This is not a personal problem for me, but it may be for some. -Thorsten

On Mon, Apr 23, 2012 at 5:26 PM, Thorsten Ottosen <thorsten.ottosen@dezide.com> wrote:
On 20-04-2012 15:27, Dave Abrahams wrote:
What we would really like is to be able to apply a series of filters, some of which apply to directories, some to files. Examples are your include|exclude regex filters, applicable to directories|files|both. Another filter might be a file filter <=|>= a given size. Dates and permissions filters would also be useful. Perhaps user supplied filters. And how about filters applied to the contents of files?
And then there's
directory_range('some_path') | filtered(predicate1) | filtered(predicate2) ...
[Thorsten:]
My original reason for embedding the regexes directly in the iterators was that it could be made to run somewhat more efficiently, because the iterators could peek at the low-level API and perhaps avoid string conversions. Filters applied afterwards do seem to scale much better, but I wonder how much slower it would get. This is not a personal problem for me, but it may be for some.
Can't you move the filtering to the second argument of directory_range (somehow) while keeping genericity? -- Olaf

On 20-04-2012 14:35, Beman Dawes wrote:
On Fri, Apr 20, 2012 at 5:24 AM, Thorsten Ottosen
boost::iterator_range<...> boost::directory_range( const boost::path&, const boost::regex& toExclude )
is even better:-)
Especially for recursive iteration, it is useful to skip entire directories.
Mind-blowing! Your suggestion is getting very close to a solution for the general directory search problem I've wrestled with for years.
What we would really like is to be able to apply a series of filters, some of which apply to directories, some to files. Examples are your include|exclude regex filters, applicable to directories|files|both. Another filter might be a file filter <=|>= a given size. Dates and permissions filters would also be useful. Perhaps user supplied filters. And how about filters applied to the contents of files?
I can see those would be useful.
Remark: such support is probably most naturally added to the underlying iterator classes by storing some boost::optional<> variables.
I'll need to think about that - it will take me a while to digest this.
Just one remark: The filesystem library may invent its own | syntax for filters, which need not be compatible with e.g. Boost.Range's adaptors. In a sense, we could just be allowed to write
    boost::directory_range( some_path,
        directory_filter( "..." ) | filename_filter( "..." ) | filestamp_filter( "..." ) |
        filesize_filter( "..." ) | custom_filter( ... ) );
Then all the metaprogramming logic can be done internally in boost.filesystem, detecting logical errors and rearranging the filters such that the most beneficial one is run first, etc.
just my two cents
-Thorsten
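
One way to picture a library-owned | syntax is to let the filters be combined first and handed to the traversal as a single bundle, so directory filters can prune the walk while file filters select what is reported. The sketch below is only an illustration under that reading. It assumes C++17 std::filesystem and std::regex, runtime predicates instead of metaprogramming, and no filter reordering. filter_set and the function bodies are hypothetical, and only three of the filter names from the message are sketched.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <functional>
#include <regex>
#include <utility>
#include <vector>

namespace fs = std::filesystem;
using entry_predicate = std::function<bool(const fs::directory_entry&)>;

// Hypothetical bundle: tests a directory must pass to be entered, and
// tests a non-directory entry must pass to be reported.
struct filter_set {
    std::vector<entry_predicate> dir_tests, file_tests;
};

// a | b applies the tests of both operands.
inline filter_set operator|(filter_set a, const filter_set& b) {
    a.dir_tests.insert(a.dir_tests.end(), b.dir_tests.begin(), b.dir_tests.end());
    a.file_tests.insert(a.file_tests.end(), b.file_tests.begin(), b.file_tests.end());
    return a;
}

inline bool name_matches(const fs::directory_entry& e, const std::regex& re) {
    return std::regex_match(e.path().filename().string(), re);
}

// Enter only directories whose name matches re.
inline filter_set directory_filter(std::regex re) {
    filter_set f;
    f.dir_tests.push_back([re](const fs::directory_entry& e) { return name_matches(e, re); });
    return f;
}

// Report only non-directory entries whose name matches re.
inline filter_set filename_filter(std::regex re) {
    filter_set f;
    f.file_tests.push_back([re](const fs::directory_entry& e) { return name_matches(e, re); });
    return f;
}

// User-supplied test on non-directory entries.
inline filter_set custom_filter(entry_predicate p) {
    filter_set f;
    f.file_tests.push_back(std::move(p));
    return f;
}

inline bool passes(const std::vector<entry_predicate>& tests, const fs::directory_entry& e) {
    for (const auto& t : tests)
        if (!t(e)) return false;
    return true;
}

// Recursive walk that prunes with dir_tests and selects with file_tests.
inline std::vector<fs::path> directory_range(const fs::path& root, const filter_set& filters) {
    std::vector<fs::path> out;
    for (fs::recursive_directory_iterator it(root), end; it != end; ++it) {
        if (it->is_directory()) {
            if (!passes(filters.dir_tests, *it)) it.disable_recursion_pending();
        } else if (passes(filters.file_tests, *it)) {
            out.push_back(it->path());
        }
    }
    return out;
}
```

Since the traversal sees the whole bundle up front, this form keeps the composable syntax and still avoids descending into excluded directories.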
participants (4): Beman Dawes, Dave Abrahams, Olaf van der Spek, Thorsten Ottosen