Hello everybody,
Is there a globbing facility in Boost? If not, are there any plans to add one?
Please note that although globbing matches strings that correspond to file system objects, the functionality is neither regular expressions nor file system related. Therefore, IMO, it does not belong in Boost.Regex, Boost.Xpressive or Boost.Filesystem.
Globbing is pattern matching using conventions that grew up with Unix but have been widely copied to other operating systems and environments. The rules for the patterns are very different from those of regular expressions and somewhat simpler.
I have written a directory iterator for my own use, and I currently employ Boost.Regex for file name matching on Unix, but I would very much prefer that it use globbing. The code is portable, via ifdefs, and in the Windows branch I am able to use true globbing thanks to a facility in the Win32 API.
On Fri, Aug 8, 2014 at 10:04 AM, Andrew Marlow wrote:
Globbing is pattern matching using conventions that grew up with unix but have been widely copied to other operating systems and environments. The rules for the patterns are very different to regular expressions and somewhat simpler.
For what it's worth, I believe you can mechanically translate glob patterns to regular expressions and use the regex engine of your choice.
[Please do not mail me a copy of your followup]
Nat Goodspeed
On Fri, Aug 8, 2014 at 10:04 AM, Andrew Marlow wrote:
Globbing is pattern matching using conventions that grew up with unix but have been widely copied to other operating systems and environments. The rules for the patterns are very different to regular expressions and somewhat simpler.
For what it's worth, I believe you can mechanically translate glob patterns to regular expressions and use the regex engine of your choice.
It would be handy to have a library that provided globbing as a simplifying facade around std::regex. -- "The Direct3D Graphics Pipeline" free book http://tinyurl.com/d3d-pipeline The Computer Graphics Museum http://computergraphicsmuseum.org The Terminals Wiki http://terminals.classiccmp.org Legalize Adulthood! (my blog) http://legalizeadulthood.wordpress.com
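For what it's worth, a minimal sketch of the facade idea above might look like the following. It assumes only `*` and `?` need translating and that the pattern is matched against a single file name, with no special handling of `/` or of leading dots. The names `glob_to_regex` and `glob_match` are made up for illustration; neither is a Boost or standard library function.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Boost or the standard library):
// translate a simple glob into an ECMAScript regex string.
// Only '*' and '?' are treated as wildcards; everything else is literal.
std::string glob_to_regex(const std::string& glob) {
    // Characters that are special in ECMAScript regexes and need escaping.
    static const std::string special = R"re(\^$.|+()[]{})re";
    std::string out;
    for (char c : glob) {
        if (c == '*') {
            out += ".*";   // any run of characters, possibly empty
        } else if (c == '?') {
            out += '.';    // exactly one character
        } else {
            if (special.find(c) != std::string::npos) out += '\\';
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: whole-string match of `name` against `glob`.
bool glob_match(const std::string& glob, const std::string& name) {
    return std::regex_match(name, std::regex(glob_to_regex(glob)));
}
```

Bracket sets such as [seq] and [!seq] are deliberately left out here; in this sketch they are escaped and matched literally, so supporting them would need an extra translation step.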
On 9/08/2014 15:28, Richard wrote:
For what it's worth, I believe you can mechanically translate glob patterns to regular expressions and use the regex engine of your choice.
It would be handy to have a library that provided globbing as a simplifying facade around std::regex.
+1 Ideally with support for both POSIX-style and Windows-style globs. (The former are more powerful but the latter have fewer reserved characters; both are useful at different times.) Where this does intersect with the filesystem, perhaps Boost.Filesystem could make use of it? I don't think that currently has any globbing support.
On 12 Aug 2014 at 19:23, Gavin Lambert wrote:
Ideally with support for both POSIX-style and Windows-style globs. (The former are more powerful but the latter have fewer reserved characters; both are useful at different times.)
Where this does intersect with the filesystem, perhaps Boost.Filesystem could make use of it? I don't think that currently has any globbing support.
You are correct on this. AFIO supports kernel-side globbing during directory enumeration on Windows, with a user-side emulation on POSIX (basically fnmatch()). In fact, kernel-side globbing is so quick that it is how AFIO stats a single file entry: it simply sets the glob to an exact match, and that is three orders of magnitude faster than the traditional way of opening a handle, reading metadata and closing the handle. I was not aware, however, that the NT kernel glob is any different from the Unix glob. It is of course entirely possible that the NT kernel glob is separate from the Windows glob, which no doubt is encumbered by DOS legacy.
Niall
-- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
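To illustrate what a user-side emulation built on fnmatch() could look like, here is a small sketch. It is not AFIO's actual code; the wrapper name `posix_glob_match` is hypothetical, and the header is POSIX-only, so this will not build with MSVC.

```cpp
#include <fnmatch.h>  // POSIX only
#include <string>

// Hypothetical wrapper around the POSIX fnmatch() call.
// Returns true when `name` matches `pattern`.
bool posix_glob_match(const std::string& pattern, const std::string& name) {
    // Flags are 0 here: no special treatment of '/' or of a leading '.'.
    // fnmatch() returns 0 on a match and FNM_NOMATCH otherwise.
    return fnmatch(pattern.c_str(), name.c_str(), 0) == 0;
}
```

A directory enumeration on POSIX would then call this on each entry name and skip the ones that do not match, which is the part Windows can do kernel side.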
On 12/08/2014 23:24, Niall Douglas wrote:
I was not aware however that the NT kernel glob is any different to Unix glob. It is of course totally possible that the NT kernel glob is totally separate to the Windows glob which no doubt is DOS legacy encumbered.
Perhaps I am conflating different things (or just mistaken), but as far as I am aware Windows supports only * and ? metacharacters, while POSIX supports those in addition to [] and {} sets and has a different interpretation of \.
On 13 Aug 2014 at 11:09, Gavin Lambert wrote:
On 12/08/2014 23:24, Niall Douglas wrote:
I was not aware however that the NT kernel glob is any different to Unix glob. It is of course totally possible that the NT kernel glob is totally separate to the Windows glob which no doubt is DOS legacy encumbered.
Perhaps I am conflating different things (or just mistaken), but as far as I am aware Windows supports only * and ? metacharacters, while POSIX supports those in addition to [] and {} sets and has a different interpretation of \.
It turns out that answering the question of which glob characters Windows supports for directory enumeration is not trivial. After tracing through the IFS driver sample code, I found the function FsRtlIsNameInExpression(): http://msdn.microsoft.com/en-us/library/windows/hardware/ff546850(v=vs.85).aspx
And it would appear you are correct: * and ? are the only two in common. I am aware of [seq] and [!seq] for fnmatch, but I am not aware of {} sets. I have added a note to AFIO about this; thanks for the catch.
Niall
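As a rough illustration of the restricted style discussed here, the sketch below matches a name against a pattern in which only * and ? are special. It is a simplification and does not claim to reproduce FsRtlIsNameInExpression(): it is case-sensitive and ignores any DOS-legacy rules. The function name `wildcard_match` is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: only '*' (any run of characters, possibly empty)
// and '?' (exactly one character) are special. Brackets and braces are
// ordinary characters. Case-sensitive.
bool wildcard_match(const std::string& pattern, const std::string& name) {
    std::size_t p = 0, n = 0;
    std::size_t star = std::string::npos;  // index of the last '*' seen
    std::size_t resume = 0;                // where in `name` to retry from
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;        // remember the star, try matching zero chars
            resume = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == name[n])) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            p = star + 1;      // backtrack: let the star absorb one more char
            n = ++resume;
        } else {
            return false;
        }
    }
    // Any trailing stars can match the empty string.
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}
```

Note that with this matcher a pattern such as [ab] only matches the literal name "[ab]", which is the practical difference from fnmatch-style globs.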
"Niall Douglas"
And it would appear you are correct, * and ? are the only two common. I am aware of [seq] and [!seq] for fnmatch, I am not aware of {} sets.
I always thought in Unix that globbing was done by the shell and not by the operating system. In particular, I thought {} globbing was a csh-ism and not an sh-ism. It's hard to tell these days now that bash has absorbed the useful features of all other shells.
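To make the distinction concrete: in shells that support it, {} handling is an expansion step that turns one pattern into several before any matching happens. A minimal sketch of that step, assuming no nested braces, could look like this. The name `expand_braces` is hypothetical, and unlike bash this sketch also expands a group that contains no comma.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: expand csh/bash-style brace groups, e.g.
// "a.{c,h}" becomes "a.c" and "a.h". Nested braces are not handled.
std::vector<std::string> expand_braces(const std::string& pattern) {
    const std::size_t open = pattern.find('{');
    const std::size_t close = (open == std::string::npos)
                                  ? std::string::npos
                                  : pattern.find('}', open);
    if (close == std::string::npos) return {pattern};  // nothing to expand

    const std::string prefix = pattern.substr(0, open);
    const std::string body = pattern.substr(open + 1, close - open - 1);
    const std::string suffix = pattern.substr(close + 1);

    std::vector<std::string> results;
    std::size_t start = 0;
    while (true) {
        const std::size_t comma = body.find(',', start);
        const std::string alt =
            (comma == std::string::npos) ? body.substr(start)
                                         : body.substr(start, comma - start);
        // Recurse so that later groups in the suffix are expanded too.
        for (const std::string& tail : expand_braces(suffix)) {
            results.push_back(prefix + alt + tail);
        }
        if (comma == std::string::npos) break;
        start = comma + 1;
    }
    return results;
}
```

Each resulting string can then be handed to an ordinary glob matcher that knows nothing about braces.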
On 14 Aug 2014 at 18:15, Richard wrote:
And it would appear you are correct, * and ? are the only two common. I am aware of [seq] and [!seq] for fnmatch, I am not aware of {} sets.
I always thought in Unix that globbing was done by the shell and not by the operating system. In particular, I thought {} globbing was a csh-ism and not an sh-ism. It's hard to tell these days now that bash has absorbed the useful features of all other shells.
On Windows, both the shell and the filing system driver do globbing. Specifically, when enumerating a directory you can pass a glob so that entries you don't care about are filtered out kernel side. This provides an *enormous* performance improvement for directories with many entries. Sadly the glob is very restrictive: just * and ?.
Niall
On 8 Aug 2014 at 15:04, Andrew Marlow wrote:
Is there a globbing facility in boost? If not, are there any plans to add one?
I see no point. There is a standard regex find and replace which will convert any glob to regex (search Stack Overflow). I wouldn't mind Boost.Regex gaining a glob-to-regex converter helper function, though.
Niall
I reckon this is the way to go: adding a helper function to Boost.Regex. I admit that globbing is OS-specific, but I would still like to see a function that globs like the POSIX shell (and compatibles).
On Aug 11, 2014 12:02 AM, "Niall Douglas"
On 8 Aug 2014 at 15:04, Andrew Marlow wrote:
Is there a globbing facility in boost? If not, are there any plans to add one?
I see no point. There is a standard regex find and replace which will convert any glob to regex (search stackoverflow). I don't mind regex gaining a glob to regex converter helper function though.
Niall
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
participants (5)
- Andrew Marlow
- Gavin Lambert
- legalize+jeeves@mail.xmission.com
- Nat Goodspeed
- Niall Douglas