
Some comments below from a random C++ developer who has written multiple cross-platform filesystem libraries for US military use. Take them for whatever you want. Hopefully they're useful. :-)

Beman Dawes wrote:
On Thu, Feb 18, 2010 at 10:35 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
"stem" means nothing to me. I followed the Reference link to the path class documentation and find no description of it there.
Added a direct link to the reference doc description. Added an example to the reference doc description.
Regrettably the terminology for this is remarkably nonstandard, but I've never heard of "stem" before and would not have had any idea what it returned. Typically I've seen this called "base_filename" or even just "filename". Where did "stem" come from? Is there a precedent I'm not aware of?
-----Original Message-----
From: boost-bounces@lists.boost.org [mailto:boost-bounces@lists.boost.org] On Behalf Of Stewart, Robert
Sent: Thursday, February 18, 2010 2:27 PM
To: boost@lists.boost.org
Subject: Re: [boost] [Filesystem V3] Filesystem Version 3 beta 1 available for download and comment
Peter Dimov wrote:
Beman Dawes wrote:
On Thu, Feb 18, 2010 at 10:35 AM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
You state that extension() returns the period to allow distinguishing between an empty extension and no extension. That seems wrong. Typical use cases for working with the extension will require stripping the period before proceeding, so you push extra work onto the client. Furthermore, I can't think of a case in which extension processing code would work differently when there is no extension and when the extension is empty. The extension is an empty string in both cases. Since you already provide has_extension() for distinguishing that there is one, extension() should return an empty string when nothing follows the period.
IIRC, that was Volodya's original design and I can't recall anyone ever complaining about it. True, we didn't have the has_extension() function, but still, I hate to break existing code. Does anyone else have a strong opinion?
It is the right design to retain the period, IMO, and most "get extension" functions do so, even on Windows, where there is no difference between "foo" and "foo." when actually used to refer to a file. See for example
http://msdn.microsoft.com/en-us/library/e737s6tf%28VS.100%29.aspx
That's an interesting precedent, but that strikes me as wrong, too!
On POSIX, it's even more important to retain the period, because "foo" and "foo." refer to different files.
I can see that creating "foo." from "foo" requires that one be able to set the extension as "." and that would require special case code. Perhaps the right solution is to prefix the argument with "." when omitted? That way, existing code, which provides the "." will continue to work, while code that has the extension, but no period, can work henceforth.
From what I've seen, either approach works, and neither tends to mean significantly more work on the user's part. Most use cases I've had are maintaining maps from extensions to some sort of class for processing files of that type, or recognizing certain types of files in a directory, and those work either way. So my implementations tended to return the extension with its leading dot from extension() (to disambiguate the crazy POSIX case) and to accept a new extension with or without a leading dot in change_extension. If no leading dot is provided, one is automatically prepended. Giving the empty string to change_extension removed the extension, but I also had a special method for that. This worked well in practice for me. I do agree that "has_extension" is useful to have for clarity and for the rare use cases where only the presence of the extension matters and its contents do not.
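To make that concrete, here is a minimal sketch of the behavior described above. It is an illustration only: extension_of, has_extension, change_extension and demo are hypothetical free functions working on a bare file name held in a std::string, not the Boost.Filesystem path API, and they deliberately ignore directory separators and dotfiles such as ".bashrc".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the extension of a bare file name,
// including the leading dot. "foo" gives "", "foo." gives ".".
std::string extension_of(const std::string& filename) {
    const std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos) return std::string();
    return filename.substr(dot);
}

// Hypothetical helper: true when the name contains a dot at all,
// so "foo." has an (empty) extension and "foo" has none.
bool has_extension(const std::string& filename) {
    return filename.rfind('.') != std::string::npos;
}

// Hypothetical helper: replaces the extension. The new extension may be
// given with or without its leading dot; an empty string removes it.
std::string change_extension(const std::string& filename,
                             const std::string& new_ext) {
    // substr(0, npos) keeps the whole name when there is no dot.
    const std::string stem = filename.substr(0, filename.rfind('.'));
    if (new_ext.empty()) return stem;
    if (new_ext[0] == '.') return stem + new_ext;
    return stem + "." + new_ext;
}

// Small self-check of the cases discussed in this thread.
void demo() {
    assert(extension_of("foo.txt") == ".txt");
    assert(extension_of("foo.") == ".");   // distinguishable from "foo"
    assert(extension_of("foo").empty());
    assert(change_extension("foo.txt", "bak") == "foo.bak");
    assert(change_extension("foo.txt", ".bak") == "foo.bak");
    assert(change_extension("foo.txt", "") == "foo");
}
```

With this shape, the POSIX "foo" versus "foo." case stays expressible (change_extension("foo", ".") gives "foo."), while callers who hold an extension without a dot don't have to add one themselves.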
Note that if there were ever a system that by convention used something other than dot for an extension separator, requiring a leading dot could be a problem. I'm not aware of any such systems though.

One wrinkle that I never was able to decide how to handle was multiple extensions, like ".tar.gz". Some use cases would want ".tar.gz", some would just want ".gz", and a few would even want just ".tar". Does this library provide any direct support for managing chains of extensions like that?

Hope that helped.

Gregory Peele, Jr.
Applied Research Associates, Inc.
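
P.S. To illustrate the multiple-extension question, here is a rough sketch of one way a chain could be exposed. The function extension_chain is hypothetical, not something I know the library to provide, and like any quick sketch it treats every dot in a bare file name as a separator (so it does not special-case dotfiles or directory parts).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits a bare file name into its chain of
// extensions, each with its leading dot, in left-to-right order.
// "archive.tar.gz" gives {".tar", ".gz"}; "foo" gives an empty chain.
std::vector<std::string> extension_chain(const std::string& filename) {
    std::vector<std::string> chain;
    std::string::size_type start = filename.find('.');
    while (start != std::string::npos) {
        const std::string::size_type next = filename.find('.', start + 1);
        if (next == std::string::npos) {
            chain.push_back(filename.substr(start));  // last piece
        } else {
            chain.push_back(filename.substr(start, next - start));
        }
        start = next;
    }
    return chain;
}

// Small self-check covering the three use cases mentioned above.
void demo_chain() {
    const std::vector<std::string> chain = extension_chain("archive.tar.gz");
    assert(chain.size() == 2);
    assert(chain.front() == ".tar");                // just the first extension
    assert(chain.back() == ".gz");                  // just the last extension
    assert(chain[0] + chain[1] == ".tar.gz");       // the whole chain
}
```

A caller who wants ".tar.gz" joins the whole chain, one who wants ".gz" takes the last element, and one who wants ".tar" takes the first.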