
On Fri Jan 25 2013, Beman Dawes <bdawes-AT-acm.org> wrote:
On Thu, Jan 24, 2013 at 8:56 PM, Dave Abrahams <dave@boostpro.com> wrote:
I'm finding that boost::filesystem::path seems to be a strange mix of different beasts, unlike any entity we have in the STL. For example, when you construct it from a pair of iterators, they're expected to be iterators over characters, but when you iterate over the path itself, you are iterating over strings of some kind (**). Even though, once constructed, this thing acts sort of like a container, it supports none of the usual container mutators (e.g. push_back, pop_back, erase) or even queries (e.g. size()), making it incompatible with generic algorithms and adaptors.
It isn't really a container,
Well, why not? It does most things that containers do, but with different names. And to expose iterators but then not let me use those iterators to modify the path is... well, disappointing.
but it is convenient to supply iterators over the elements of the contained path. Should more container-like mutators be supplied?
It certainly would make it more useful. I could then employ, e.g. back_inserter. But I also have problems with the fact that it's constructed with a range of characters but its iterators traverse a range of paths. It should at least have a constructor that takes an iterator range over the iterator's value_type.
I'm neutral - they would occasionally be useful, but add more signatures to an already fat interface.
Maybe some of the other interfaces should be dropped then :-)
In particular, this comes up because I'm trying to find the greatest common prefix of two paths. I thought this would be easy; I'd just use std::mismatch. But even once I've found the mismatch I don't see any obvious way to chop off the non-matching parts of one of the paths. I end up having to resort to some really ugly code (or I just haven't figured out how to use this thing correctly).
Not particularly elegant, but this does work:
    path x("/foo/bar");
    path y("/foo/baar");
    // three-iterator mismatch: y must have at least as many elements as x
    auto result = std::mismatch(x.begin(), x.end(), y.begin());
    path prefix;
    for (auto itr = x.begin(); itr != result.first; ++itr)
        prefix /= *itr;
    std::cout << prefix << std::endl;
Nor is it particularly efficient. I am going to do this with every path that appears in boost's SVN dump, of which there are many. "Greatest common prefix" is not an unusual thing to want to do with paths. It should be both elegant and efficient.
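As a rough illustration of the operation under discussion, the sketch below wraps the mismatch-and-rebuild approach in a helper. It assumes C++17's std::filesystem::path (whose iterators also yield path elements) in place of boost::filesystem::path. The names path_from_elements and common_prefix are hypothetical and belong to neither library. It uses the four-iterator std::mismatch so that neither path has to be the longer one. It still rebuilds the prefix element by element, so it does not answer the efficiency complaint.

```cpp
#include <algorithm>
#include <filesystem>

namespace fs = std::filesystem;  // assumption: std::filesystem stands in for boost::filesystem

// Hypothetical helper: builds a path from a range of path elements,
// i.e. the "constructor over the iterator's value_type" asked for in the thread.
template <class Iter>
fs::path path_from_elements(Iter first, Iter last) {
    fs::path result;
    for (; first != last; ++first)
        result /= *first;  // appends one element, inserting a separator if needed
    return result;
}

// Hypothetical helper: longest run of leading elements shared by a and b.
fs::path common_prefix(const fs::path& a, const fs::path& b) {
    // Four-iterator overload (C++14) stops at the end of the shorter range.
    auto ends = std::mismatch(a.begin(), a.end(), b.begin(), b.end());
    return path_from_elements(a.begin(), ends.first);
}
```

With the two paths from the example above, common_prefix("/foo/bar", "/foo/baar") should give /foo on a POSIX system.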
Why should paths be so different from everything else? I think, if the design is actually right, some rationale is sorely needed.
Also,
* (**) the docs don't say what the value_type of path::iterator is. A string value? A range that becomes invalid when the path is destroyed? Ah!?! How surprising; inspecting the code shows it iterates over paths! A container whose element type is itself is very unusual!
It is a kludge to deal with the type of the contained string being implementation defined and not necessarily the type the user wants. In other words, a misuse of path to supply string interoperability. The returned type should ideally be a basic_string, with begin() and end() templatized on the string details, but I didn't think of that until recently.
It should ideally be a type that can be constructed without allocating storage and copying characters from the source path, like the recently-discussed string_ref.
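To make the non-allocating idea concrete, here is a minimal sketch using std::string_view, the C++17 standard type that plays the role string_ref was proposed for. The function split_elements is hypothetical. It assumes '/' is the only separator and ignores the root directory, so it is not a drop-in replacement for path iteration.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: returns views of the elements of p.
// No characters are copied; each view points into the storage behind p
// and becomes invalid when that storage goes away.
// (The vector itself still allocates room for the views.)
std::vector<std::string_view> split_elements(std::string_view p) {
    std::vector<std::string_view> out;
    std::size_t pos = 0;
    while (pos < p.size()) {
        std::size_t next = p.find('/', pos);
        if (next == std::string_view::npos)
            next = p.size();
        if (next > pos)  // skip empty pieces such as the one before a leading '/'
            out.push_back(p.substr(pos, next - pos));
        pos = next + 1;
    }
    return out;
}
```

For "/foo/bar" this yields two views, "foo" and "bar", both referring to the original characters.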
* the docs claim you can construct a path from "A C-array. The value type is required to be char, wchar_t, char16_t, or char32_t", but don't say how that array will be interpreted. From the wording I might have assumed it accepts a CharT(&)[N] and the length of the input is taken as N, but inspecting the code shows it expects a CharT* and interprets the source as null-terminated.
I'll make some doc changes per your comments above.
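The two readings of "a C-array" can be told apart with a small sketch. The functions from_extent and from_cstr are hypothetical and use std::string rather than path. They only show how the same array gives different lengths depending on whether its extent N or its null terminator is taken as the end.

```cpp
#include <cstddef>
#include <string>

// Hypothetical: takes the array by reference and uses its full extent N,
// so the terminating '\0' (and anything after it) is included.
template <std::size_t N>
std::string from_extent(const char (&a)[N]) {
    return std::string(a, N);
}

// Hypothetical: takes a pointer and stops at the first '\0',
// which is the behaviour the thread reports for path's constructor.
std::string from_cstr(const char* s) {
    return std::string(s);
}
```

For the literal "foo", which has type const char[4], from_extent gives a string of length 4 while from_cstr gives length 3.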
--Beman
-- Dave Abrahams BoostPro Computing Software Development Training http://www.boostpro.com Clang/LLVM/EDG Compilers C++ Boost