
David Abrahams wrote:
on Wed May 21 2008, Beman Dawes <bdawes-AT-acm.org> wrote:
David Abrahams wrote:
on Tue May 20 2008, Beman Dawes <bdawes-AT-acm.org> wrote:
David Abrahams wrote:
I was just reviewing the filesystem docs and came across "leaf()". I'm sure this isn't the first time I've seen it, but this time I picked up a little semantic dissonance. Normally we think of "leaf" in the context of a tree as being a thing with no children. An interior node like a directory that has files or other directories in it is usually not called a "leaf."
Right. And "leaf" never returns an interior node of a path.
What is an "interior node" of a path? Would you talk about "interior nodes" of a std::vector<string>?
I wonder if this is the best possible name?
The names used by the filesystem library were carefully chosen as a matched set. So you can't change a single name without making a corresponding change to the other names (like "branch") it is related to.
I understand that a change may upset the apple cart, but the fact that the names are interdependent doesn't mean we shouldn't consider different ones.
Sure. But given that the current names were widely discussed at the time of adoption, have been in use for quite a few years, and "basename" is already used by the library for a function with different semantics, change would be difficult.
Well, I did bring this up almost five years ago: http://lists.boost.org/Archives/boost/2003/08/50910.php
Is there a precedent we can draw on in some other language/library? In Python, it's os.path.basename(p). Perl, PHP, and the POSIX basename command seem to do something similar.
The filesystem names were chosen to be an improvement over the naming used by other libraries and/or languages, which always seemed to me to be misleading. For example, my intuition is that basename("foo.bar") should yield "foo", not "foo.bar".
I don't know why -- basename seems like one of those names that would be semantically void except for the precedent provided by other languages trying to do the same thing. On the face of it, it doesn't suggest anything about the extension part of the name one way or the other. In any case, the 2-argument version of basename does something like what you want in many of those other languages.
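To make the two readings of "basename" concrete, here is a minimal C++ sketch. It is not Boost.Filesystem code: basename_like is a hypothetical helper written for this illustration, it assumes '/' is the only separator, and it does no trailing-slash handling. With one argument it keeps the extension (so "foo.bar" stays "foo.bar", as with Python's os.path.basename); with a second argument it strips a matching suffix, roughly as the POSIX command does for `basename foo.bar .bar`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration; not part of Boost.Filesystem.
// Assumes '/' is the only separator and ignores trailing slashes.
std::string basename_like(const std::string& path, const std::string& suffix = "")
{
    // Keep everything after the last '/', or the whole string if there is none.
    const std::string::size_type slash = path.rfind('/');
    std::string name = (slash == std::string::npos) ? path : path.substr(slash + 1);

    // Two-argument form: drop the suffix if the name ends with it
    // and does not consist of the suffix alone.
    if (!suffix.empty() && name.size() > suffix.size() &&
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0)
    {
        name.erase(name.size() - suffix.size());
    }
    return name;
}

// Usage, mirroring the "foo.bar" example from the thread.
void basename_like_examples()
{
    assert(basename_like("foo.bar") == "foo.bar");      // extension kept
    assert(basename_like("foo.bar", ".bar") == "foo");  // suffix stripped
}
```

The one-argument result is what David describes as the existing precedent; the two-argument result is closer to what Beman's intuition expects.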
I'm certainly open to persuasion, but so far, a pathname doesn't seem to resemble a tree in any conceptually useful way, and there seems to be no compelling advantage to inventing our own terminology here.
If you have a better set of names, why don't you suggest them?
You asked for it. I'm going in the order given by http://www.ibm.com/developerworks/aix/library/au-boostfs/ because that's reasonably comprehensive and readable even though it looks like it may have some serious errors. Don't have time to pore through the full reference right now.
path members:
const std::string& string( ): OK
std::string root_directory( ): OK, but maybe should return boost::optional<path>. I wonder why we decay to std::string so eagerly.
std::string root_name( ): OK, but maybe should be called "root" and return boost::optional<path>
Logically, the root is made up of the root_name() and the root_directory(). If you change the name of root_directory() to root(), what do you call the combination of root_name() and root_directory()? As far as other aspects of the interface, like the return type, I want to revisit the whole design once C++0x stabilizes and there is a compiler available with more C++0x features to experiment with.
std::string leaf( ): should be basename(). back(), tail() and p.split()[1].string() are viable alternatives
One problem with basename is that it is already used by one of the convenience functions. Another is that I find it misleading. An alternative set I'd be comfortable with would be tail() for the current leaf() and head_path() for the current branch_path().
std::string branch_path( ): should be dirname()
That breaks the naming scheme; _path is uniformly used to signal that the return potentially contains a path rather than just a single name. It is also misleading in that the return is often a path rather than just a single directory name, and for "c:" on Windows isn't a directory name at all. One of my frustrations with similar libraries has always been their misleading function names. And that's really your point about leaf() and branch_path(); you find them misleading. That's a concern, and why I'm willing to consider renaming them. But not to a set of names that is misleading to a different set of people, me included.
bool empty( ): OK, although this seems like a really uninteresting question to ask.
iterator: OK
operator/: I've liked that one ever since I came up with it ;-)
Yeah, I like it too, although some people find it too cute. I had a complaint recently from a new user that the append functionality was not supported; turned out he never read the docs for operator/ because he assumed operator/ couldn't possibly be what he was looking for.
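For readers who have not used the library, the member names debated above can be illustrated with a toy type. The sketch below is an assumption-laden stand-in, not boost::filesystem::path: toy_path is hypothetical, it only understands '/' separators, and it has no notion of a root name such as "c:". It shows operator/ used for appending, and how a leaf()/branch_path() pair (or tail()/head_path(), in the alternative naming) splits the result.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for illustration only; this is NOT boost::filesystem::path.
// Assumes '/' separators and no root-name ("c:") handling.
struct toy_path
{
    std::string text;

    explicit toy_path(const std::string& s = "") : text(s) {}

    // Last element of the path (leaf() in the thread, or tail()).
    std::string leaf() const
    {
        const std::string::size_type slash = text.rfind('/');
        return slash == std::string::npos ? text : text.substr(slash + 1);
    }

    // Everything before the last element (branch_path(), or head_path()).
    // The result can itself hold several elements, hence the _path suffix.
    toy_path branch_path() const
    {
        const std::string::size_type slash = text.rfind('/');
        if (slash == std::string::npos) return toy_path();
        if (slash == 0) return toy_path("/");  // keep the root for "/name"
        return toy_path(text.substr(0, slash));
    }
};

// Append with operator/, inserting a separator only when one is needed.
toy_path operator/(const toy_path& lhs, const std::string& rhs)
{
    if (lhs.text.empty()) return toy_path(rhs);
    if (lhs.text[lhs.text.size() - 1] == '/') return toy_path(lhs.text + rhs);
    return toy_path(lhs.text + "/" + rhs);
}

// Usage: build a path by appending, then split it again.
void toy_path_examples()
{
    const toy_path p = toy_path("/usr") / "local" / "lib";
    assert(p.text == "/usr/local/lib");
    assert(p.leaf() == "lib");
    assert(p.branch_path().text == "/usr/local");  // a path, not one name
}
```

Note that branch_path() here returns "/usr/local", which is a multi-element path rather than a single directory name; that is the objection raised against calling it dirname().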
The rest of the names look OK to me except for is_regular, which should be is_file. That name seems overthought and "regular" has all kinds of other connotations.
I'm not particularly fond of is_regular either. The problem with is_file is that some people argue it should be true for directories. I could get talked into is_file, if others support that and will help in dealing with those who think of directories as files. Or maybe some other name could be found. is_regular_file()? Although longer, that seems clearer.
--Beman