interest in a glob_iterator? (directory_iterator with regex)

Folks--
Is there general interest in a "globbing" iterator? If so I've got one I'm willing to re-package for submission. (Or at least discuss with folks for improvement)
If you're interested, read on:
glob_iterator aggregates a directory_iterator and regex to provide shell-style "*", "?", "{....}", "[....]" and "[^...]" wildcarding.
boost components used:
  filesystem::directory_iterator
  filesystem::path
  filter_iterator
  reg_expression
  c_regex_traits
Usage example:
  //...do something to all .cpp and .c files
  glob_iterator start( "*.{c,cpp}" );
  glob_iterator end;
  while( start != end ){
    std::string filename( start->leaf() );
    //...do something with/to filename
    ++start;
  }

O.K. I'll do it. But my first task is to get a discussion going on the developer's list. Or, we could kick start the process if I sent you all the 100+ lines of code for an initial review and/or sanity check. How's that sound?
--rich
On Wednesday, January 14, 2004, at 05:40 AM, Angus Leeming wrote:
Rich Johnson wrote:
Folks-- Is there general interest in a "globbing" iterator?
I'm interested.

Rich Johnson
O.K. I'll do it. But my first task is to get a discussion going on the developer's list.
I'd want to see it decomposed into the following components:
1. A function which translates glob patterns into regexes
2. A filter_iterator adaptor which uses a regex matching function to match the paths from a directory_iterator
--
Dave Abrahams
Boost Consulting
www.boost-consulting.com

"David Abrahams"
Rich Johnson writes:
O.K. I'll do it. But my first task is to get a discussion going on the developer's list.
I'd want to see it decomposed into the following components:
1. A function which translates glob patterns into regexes
2. A filter_iterator adaptor which uses a regex matching function to match the paths from a directory_iterator
And a more descriptive, less jargon-ish name.
-----------------
Jeff Flinn
Applied Dynamics, International

Jeff Flinn wrote:
And a more descriptive, less jargon-ish name.
Why? Rich is proposing something that would iterate over the files returned by the unix 'glob' function.
"*.abc" is a 'glob' just as "^.*\.abc$" is the equivalent 'regular expression'. Both are jargon and both are fine. It's just that you're used to the latter...
--
Angus

"Angus Leeming"
Jeff Flinn wrote:
And a more descriptive, less jargon-ish name.
Why? Rich is proposing something that would iterate over the files returned by the unix 'glob' function.
Other OSes have wildcard matching, with nary a mention of "glob". No doubt I'm in the minority, but I didn't have a clue what a "glob" was. My first impression was not to bother reading, as it looked like a joke - based on the definition of "glob" in the (Merriam-Webster) dictionary - "a small drop" or "a large rounded mass". I must be showing my age - "Rock - that's not music! Now Benny Goodman's another thing...".
"*.abc" is a 'glob' just as "^.*\.abc$" is the equivalent 'regular expression'. Both are jargon and both are fine. It's just that you're used to the latter...
Then you might want this link in the documentation: http://info.astrian.net/jargon/terms/g.html#glob
Jeff

Angus Leeming wrote:
Why? Rich is proposing something that would iterate over the files returned by the unix 'glob' function.
I'd never heard of a 'glob' either. But I have no problems with the name if it means something to someone.
Cheers
Russell

On Wednesday, January 14, 2004, at 10:32 PM, David Abrahams wrote:
Rich Johnson writes:
O.K. I'll do it. But my first task is to get a discussion going on the developer's list.
I'd want to see it decomposed into the following components:
This was the approach I took.
1. A function which translates glob patterns into regexes
This is the non-trivial part. I've opted to use a two-step process of:
- regex_traits to map shell-style meta chars to regex syntax wherever there's a 1-1 mapping.
- a minimal string transform to convert the glob pattern to a regex pattern corresponding to the above traits.
The specifics of this implementation are of course subject to debate.
2. A filter_iterator adaptor which uses a regex matching function to match the paths from a directory_iterator
This is the easy part--provide a predicate which encapsulates and invokes the regex produced above.
--rich
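For readers following along, here is a minimal sketch of the "glob pattern to regex pattern" step. It is not Rich's implementation (his code was not posted in this thread): it skips the custom regex_traits step and does the whole translation as a single string transform, and it assumes C++11 std::regex with the default ECMAScript grammar in place of boost's reg_expression. The names glob_to_regex and glob_matches are hypothetical. It does not handle "[!...]" or a "]" as the first member of a bracket expression.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: translate a shell-style glob into an ECMAScript
// regex string. Handles "*", "?", "{a,b}", "[...]" and "[^...]".
std::string glob_to_regex(const std::string& glob) {
    std::string out;
    std::size_t open_braces = 0;  // number of "{" groups still open
    for (std::size_t i = 0; i < glob.size(); ++i) {
        const char c = glob[i];
        if (c == '*') {
            out += ".*";                      // any run of characters
        } else if (c == '?') {
            out += '.';                       // exactly one character
        } else if (c == '{') {
            out += "(?:";                     // start of an alternation group
            ++open_braces;
        } else if (c == '}' && open_braces > 0) {
            out += ')';
            --open_braces;
        } else if (c == ',' && open_braces > 0) {
            out += '|';                       // separator inside {...}
        } else if (c == '[' && glob.find(']', i + 1) != std::string::npos) {
            // Copy the bracket expression through unchanged; "[abc]" and
            // "[^abc]" mean the same thing in both notations.
            const std::size_t close = glob.find(']', i + 1);
            out += glob.substr(i, close - i + 1);
            i = close;
        } else if (std::string(".\\+()^$|[]{}").find(c) != std::string::npos) {
            out += '\\';                      // escape regex metacharacters
            out += c;
        } else {
            out += c;
        }
    }
    out.append(open_braces, ')');             // close any unbalanced "{"
    return out;
}

// Hypothetical convenience wrapper: does the whole name match the glob?
bool glob_matches(const std::string& glob, const std::string& name) {
    return std::regex_match(name, std::regex(glob_to_regex(glob)));
}
```

With this sketch, glob_matches("*.{c,cpp}", "main.cpp") is true and glob_matches("*.{c,cpp}", "main.h") is false. Keeping the translation as a free function matches the decomposition Dave asked for, since the resulting regex can be handed to any predicate.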

1. A function which translates glob patterns into regexes
This is the non-trivial part. I've opted to use a two-step process of:
- regex_traits to map shell-style meta chars to regex syntax wherever there's a 1-1 mapping.
- a minimal string transform to convert the glob pattern to a regex pattern corresponding to the above traits.
The specifics of this implementation are of course subject to debate.
Can't we do the whole thing with a regex search and replace? This has the advantage that we don't need to instantiate a new basic_regex template instance (so less code bloat if the user is already using regex).
I've been playing with this, and have attached a simple dos_wildcard predicate that works this way - you can use this with boost::filter_iterator_adapter right now, and it should work for both portable and native file paths (excluding one or two corner cases, like ":" in ReiserFS file names; even this can probably be worked around). How does this compare to yours?
I admit I haven't tried with unix wildcards - although last time I looked at the std, I was surprised by how complex (and subtle) these are - let me know if you want me to look for a unix-wildcard to regex transform.
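The attachment mentioned above is not part of this archive, so the following is only a sketch of the search-and-replace idea, not John's dos_wildcard predicate. It assumes C++11 std::regex instead of boost::regex, treats only "*" and "?" as wildcards, and matches case-insensitively; the names dos_wildcard_to_regex and dos_wildcard_match are hypothetical. It does not reproduce the finer points of real DOS matching, such as "*.*" matching names without an extension.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: build a regex string from a DOS-style wildcard
// using nothing but regex search and replace.
std::string dos_wildcard_to_regex(const std::string& wildcard) {
    // Step 1: backslash-escape every regex metacharacter except '*' and '?'.
    static const std::regex meta(R"([.\\+^$|()\[\]{}])");
    std::string s = std::regex_replace(wildcard, meta, R"(\$&)");
    // Step 2: '*' matches any run of characters.
    static const std::regex star(R"(\*)");
    s = std::regex_replace(s, star, ".*");
    // Step 3: '?' matches exactly one character.
    static const std::regex qmark(R"(\?)");
    s = std::regex_replace(s, qmark, ".");
    return s;
}

// Hypothetical predicate-style wrapper: case-insensitive whole-name match.
bool dos_wildcard_match(const std::string& wildcard, const std::string& name) {
    const std::regex re(dos_wildcard_to_regex(wildcard),
                        std::regex::ECMAScript | std::regex::icase);
    return std::regex_match(name, re);
}
```

For example, dos_wildcard_to_regex("*.c??") yields the regex string .*\.c.. and dos_wildcard_match("*.c??", "MAIN.CPP") is true. Because the result is an ordinary regex string, no extra regex traits class has to be instantiated, which is the advantage claimed above.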
2. A filter_iterator adaptor which uses a regex matching function to match the paths from a directory_iterator
This is the easy part--provide a predicate which encapsulates and invokes the regex produced above.
Yep,
typedef filter_iterator_adapter

Sounds great to me!
Rich Johnson wrote:
Folks--
Is there general interest in a "globbing" iterator? If so I've got one I'm willing to re-package for submission. (Or at least discuss with folks for improvement)
If you're interested, read on:
glob_iterator aggregates a directory_iterator and regex to provide shell-style "*", "?", "{....}", "[....]" and "[^...]" wildcarding.
boost components used: filesystem::directory_iterator filesystem::path filter_iterator reg_expression c_regex_traits
Usage example:
  //...do something to all .cpp and .c files
  glob_iterator start( "*.{c,cpp}" );
  glob_iterator end;
  while( start != end ){
    std::string filename( start->leaf() );
    //...do something with/to filename
    ++start;
  }
--
D. Alan Stewart
Senior Software Developer
Layton Graphics, Inc.
155 Woolco Drive
Marietta, GA 30065
Voice: 770/973-4312
Fax: 800/367-8192
http://www.layton-graphics.com

On Tue, 13 Jan 2004 18:45:23 -0500, Rich Johnson wrote
Is there general interest in a "globbing" iterator? If so I've got one I'm willing to re-package for submission. (Or at least discuss with folks for improvement)
I'd like to see this as well. Jeff
participants (8)
- Alan Stewart
- Angus Leeming
- David Abrahams
- Jeff Flinn
- Jeff Garland
- John Maddock
- Rich Johnson
- Russell Hind