
On Thu, Jan 24, 2013 at 8:56 PM, Dave Abrahams <dave@boostpro.com> wrote:
I'm finding that boost::filesystem::path seems to be a strange mix of different beasts, unlike any entity we have in the STL. For example, when you construct it from a pair of iterators, they're expected to be iterators over characters, but when you iterate over the path itself, you are iterating over strings of some kind (**). Even though, once constructed, this thing acts sort of like a container, it supports none of the usual container mutators (e.g. push_back, pop_back, erase) or even queries (e.g. size()), making it incompatible with generic algorithms and adaptors.
It isn't really a container, but it is convenient to supply iterators over the elements of the contained path. Should more container-like mutators be supplied? I'm neutral - they would occasionally be useful, but add more signatures to an already fat interface.
In particular, this comes up because I'm trying to find the greatest common prefix of two paths. I thought this would be easy; I'd just use std::mismatch. But even once I've found the mismatch I don't see any obvious way to chop off the non-matching parts of one of the paths. I end up having to resort to some really ugly code (or I just haven't figured out how to use this thing correctly).
Not particularly elegant, but this does work:

  path x("/foo/bar");
  path y("/foo/baar");
  auto result = std::mismatch(x.begin(), x.end(), y.begin());
  path prefix;
  for (auto itr = x.begin(); itr != result.first; ++itr)
    prefix /= *itr;
  std::cout << prefix << std::endl;
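The loop above can be wrapped into a reusable function. What follows is only a minimal sketch under stated assumptions: it uses std::filesystem (the C++17 descendant of boost::filesystem) rather than Boost itself, the name common_prefix is hypothetical, and it swaps in the four-iterator overload of std::mismatch (C++14), because the three-iterator form requires the second range to be at least as long as the first.

```cpp
#include <algorithm>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper, not part of Boost or the standard library:
// returns the longest leading run of path elements shared by a and b.
fs::path common_prefix(const fs::path& a, const fs::path& b)
{
    // The four-iterator overload stops at the end of the shorter path,
    // so b may have fewer elements than a.
    auto diff = std::mismatch(a.begin(), a.end(), b.begin(), b.end());

    // path has no erase() or pop_back(), so rebuild the prefix
    // element by element, as in the snippet above.
    fs::path prefix;
    for (auto it = a.begin(); it != diff.first; ++it)
        prefix /= *it;
    return prefix;
}
```

With this sketch, common_prefix("/foo/bar", "/foo/baar") compares equal to path("/foo"), and two paths with no shared leading element give an empty path.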
Why should paths be so different from everything else? I think, if the design is actually right, some rationale is sorely needed.
Also,
* (**) the docs don't say what the value_type of path::iterator is. A string value? A range that becomes invalid when the path is destroyed? Ah!?! How surprising; inspecting the code shows it iterates over paths! A container whose element type is itself is very unusual!
It is a kludge to deal with the type of the contained string being implementation defined and not necessarily the type the user wants. In other words, a misuse of path to supply string interoperability. The returned type should ideally be a basic_string, with begin() and end() templatized on the string details, but I didn't think of that until recently.
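To make the element type concrete, here is a small sketch. It assumes std::filesystem (C++17) instead of boost::filesystem, on the understanding that the standard kept the same design in which the iterator's value_type is path; the helper name elements is hypothetical.

```cpp
#include <filesystem>
#include <iterator>
#include <string>
#include <type_traits>
#include <vector>

namespace fs = std::filesystem;

// Each element produced by iterating a path is itself a path.
static_assert(
    std::is_same<std::iterator_traits<fs::path::iterator>::value_type,
                 fs::path>::value,
    "path::iterator yields path objects");

// Hypothetical helper: copy the elements out as plain strings, which is
// what callers who want string interoperability end up doing by hand.
std::vector<std::string> elements(const fs::path& p)
{
    std::vector<std::string> out;
    for (const fs::path& e : p)
        out.push_back(e.string());  // explicit conversion on every element
    return out;
}
```

On a POSIX system, elements("/foo/bar") holds the three strings "/", "foo" and "bar": the root directory is reported as an element of its own.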
* the docs claim you can construct a path from "A C-array. The value type is required to be char, wchar_t, char16_t, or char32_t", but don't say how that array will be interpreted. From the wording I might have assumed it accepts a CharT(&)[N] and the length of the input is taken as N, but inspecting the code shows it expects a CharT* and interprets the source as null-terminated.
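The difference between the two readings can be checked with a short sketch. It assumes std::filesystem (C++17), whose constructor also treats a character array as a null-terminated sequence after array-to-pointer decay; the helper name stored_length is hypothetical.

```cpp
#include <cstddef>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: takes an array by reference so that N is known,
// then reports how many characters the constructed path actually stored.
template <std::size_t N>
std::size_t stored_length(const char (&buf)[N])
{
    // The array decays to const char* here; the extent N plays no part
    // and reading stops at the first '\0'.
    fs::path p(buf);
    return p.native().size();
}
```

For the literal "foo\0bar" the array extent is 8, yet stored_length returns 3, since only the characters before the first null character are read.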
I'll make some doc changes per your comments above. --Beman