
On Fri, Jan 25, 2013 at 4:30 PM, Beman Dawes <bdawes@acm.org> wrote:
On Thu, Jan 24, 2013 at 8:56 PM, Dave Abrahams <dave@boostpro.com> wrote:
I'm finding that boost::filesystem::path seems to be a strange mix of different beasts, unlike any entity we have in the STL. For example, when you construct it from a pair of iterators, they're expected to be iterators over characters, but when you iterate over the path itself, you are iterating over strings of some kind (**). Even though, once constructed, this thing acts sort of like a container, it supports none of the usual container mutators (e.g. push_back, pop_back, erase) or even queries (e.g. size()), making it incompatible with generic algorithms and adaptors.
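To make the asymmetry concrete, here is a minimal sketch. It assumes C++17 std::filesystem (which descends from Boost.Filesystem) rather than Boost itself, so that it stays self-contained. The helper names path_from_chars and element_count are hypothetical, made up for illustration.

```cpp
#include <cstddef>
#include <filesystem>
#include <iterator>
#include <string>
#include <type_traits>

namespace fs = std::filesystem;  // assumption: std::filesystem stands in for boost::filesystem

// Construction consumes a pair of iterators over *characters*.
inline fs::path path_from_chars(const std::string& text) {
    return fs::path(text.begin(), text.end());
}

// Iteration over the path itself yields whole *elements*, not characters.
inline std::size_t element_count(const fs::path& p) {
    return static_cast<std::size_t>(std::distance(p.begin(), p.end()));
}

// Each element produced by the iterator is itself a path.
static_assert(std::is_same_v<
    std::iterator_traits<fs::path::iterator>::value_type, fs::path>);
```

So an 11-character string such as "foo/bar/baz" goes in as characters and comes back out as three elements.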
It isn't really a container, but it is convenient to supply iterators over the elements of the contained path. Should more container-like mutators be supplied? I'm neutral - they would occasionally be useful, but add more signatures to an already fat interface.
Perhaps a path could have an interface analogous to std::vector<std::string>, even if the implementation is optimised somewhat to keep the commonly accessed string representation as the underlying storage. Perhaps random access would be a bit daft, but it does seem reasonable to converge the interface. Additionally, it might help to define a new Concept that is a subset of Container, to assist with Dave's goal of maximising reuse within other algorithms.
In particular, this comes up because I'm trying to find the greatest common prefix of two paths. I thought this would be easy; I'd just use std::mismatch. But even once I've found the mismatch I don't see any obvious way to chop off the non-matching parts of one of the paths. I end up having to resort to some really ugly code (or I just haven't figured out how to use this thing correctly).
I wonder if this is *really* what you want! I suspect that you probably want to determine the common effective prefix of the paths after canonicalisation. For illustration: I suspect that the result you want from fn("/usr/sbin/../bin/test1.txt", "/usr/bin/test2.txt") is "/usr/bin" rather than "/usr". The inclusion or exclusion of links is less obvious. My experience is that, for the most part, I simply want the absolute canonical representation to be considered. Not particularly elegant, but this does work:
path x("/foo/bar");
path y("/foo/baar");
// Find the first element at which the two paths differ.
auto result = std::mismatch(x.begin(), x.end(), y.begin());
// Rebuild the shared leading elements of x.
path prefix;
for (auto itr = x.begin(); itr != result.first; ++itr) prefix /= *itr;
std::cout << prefix << std::endl;
I think this code doesn't "work" because it meets the stated requirements exactly! I think the requirements are normally greater than those we first think of when looking at the problem.
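For what it is worth, a sketch of the "normalise first, then compare" variant might look like the following. It assumes C++17 std::filesystem rather than Boost, and uses lexically_normal(), which collapses ".." purely textually and does not resolve links. The function name common_prefix is hypothetical. It also uses the four-iterator std::mismatch, so the second path may be shorter than the first.

```cpp
#include <algorithm>
#include <filesystem>

namespace fs = std::filesystem;  // assumption: std::filesystem stands in for boost::filesystem

// Hypothetical helper: longest run of leading elements shared by two paths,
// compared after lexical normalisation (no filesystem access, links ignored).
inline fs::path common_prefix(const fs::path& a, const fs::path& b) {
    const fs::path x = a.lexically_normal();
    const fs::path y = b.lexically_normal();

    // Four-iterator overload: stops at the end of whichever path is shorter.
    const auto diverge = std::mismatch(x.begin(), x.end(), y.begin(), y.end());

    // Rebuild the matching leading elements of x.
    fs::path prefix;
    for (auto it = x.begin(); it != diverge.first; ++it) {
        prefix /= *it;
    }
    return prefix;
}
```

With this, the two inputs from the illustration above give "/usr/bin". If links matter, something that consults the filesystem (such as std::filesystem::canonical) would be needed in place of lexically_normal(), and that changes the failure modes.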
Why should paths be so different from everything else? I think, if the design is actually right, some rationale is sorely needed.
Also,
* (**) the docs don't say what the value_type of path::iterator is. A string value? A range that becomes invalid when the path is destroyed? Ah!?! How surprising; inspecting the code shows it iterates over paths! A container whose element type is itself is very unusual!
It is a kludge to deal with the type of the contained string being implementation defined and not necessarily the type the user wants. In other words, a misuse of path to supply string interoperability. The returned type should ideally be a basic_string, with begin() and end() templatized on the string details, but I didn't think of that until recently.
* the docs claim you can construct a path from "A C-array. The value type is required to be char, wchar_t, char16_t, or char32_t", but don't say how that array will be interpreted. From the wording I might have assumed it accepts a CharT(&)[N] and the length of the input is taken as N, but inspecting the code shows it expects a CharT* and interprets the source as null-terminated.
I'll make some doc changes per your comments above.
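A small sketch of the array-versus-pointer point, assuming C++17 std::filesystem (not Boost itself, whose behaviour is described above from reading the code). The function name length_from_buffer is hypothetical.

```cpp
#include <cstddef>
#include <filesystem>

namespace fs = std::filesystem;  // assumption: std::filesystem stands in for boost::filesystem

// Hypothetical helper: length of the path built from a fixed-size char buffer.
inline std::size_t length_from_buffer() {
    char buffer[16] = "ab";    // 16-element array; unused elements are '\0'
    const fs::path p(buffer);  // array decays to char*, read up to the first '\0'
    return p.string().size();  // counts characters before the terminator, not the array extent
}
```

The array extent of 16 plays no part here; only the characters before the first null end up in the path.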
The addition of a make_relative_path function has been discussed, and code to provide the relative path from the canonical formats has been submitted previously; see: http://stackoverflow.com/questions/10167382/boostfilesystem-get-relative-pat... It looks to be a very valuable feature, even if the implementation requires adjustment.
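As a rough illustration of what such a function computes, here is a sketch that assumes C++17 std::filesystem and its lexically_relative() member, not the submitted code. The name relative_to is hypothetical, and the computation is purely textual, so links are not resolved.

```cpp
#include <filesystem>

namespace fs = std::filesystem;  // assumption: std::filesystem stands in for boost::filesystem

// Hypothetical helper: express `target` relative to `base`, after lexical
// normalisation of both. No filesystem access is performed.
inline fs::path relative_to(const fs::path& target, const fs::path& base) {
    return target.lexically_normal().lexically_relative(base.lexically_normal());
}
```

For example, "/a/b/c" relative to "/a/d" comes out as "../b/c".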
--Beman
With all that stated, I have found the recent versions of Boost.Filesystem to support my use-cases elegantly and without issue. Indeed, it frequently offers superior solutions that are much better considered and thought through than those provided by many scripting languages. Obviously, while most of my communication has been about what I would like to see done differently, I am a grateful user of this library. Thank you for your hard work.
Regards,
Neil Groves