Re: [boost] [Boost-commit] svn:boost r62919 - in trunk: boost/filesystem/v2 boost/filesystem/v3 libs/filesystem/v2/test libs/filesystem/v3/test

On 13 June 2010 10:33, <bdawes@acm.org> wrote:
Author: bemandawes Date: 2010-06-13 13:33:39 EDT (Sun, 13 Jun 2010) New Revision: 62919 URL: http://svn.boost.org/trac/boost/changeset/62919
[...]
@@ -620,7 +620,7 @@ basic_path< String, Traits > & ph ) { String str; - is >> str; + std::getline(is, str); // See ticket 3863 ph = str; return is; }
I'm not convinced this is the right approach here, since cout << mypath << ' ' << myint; doesn't roundtrip with cin >> mypath >> myint; the way I expect operator<< and operator>> to work. Perhaps the portable pathname grammar could be extended to allow backslash-escaping of spaces and backslashes? That or remove the stream operators entirely and force people to choose which way they want to print the path and deal with the result as a fully-general string.

On Sun, Jun 13, 2010 at 1:47 PM, Scott McMurray <me22.ca+boost@gmail.com> wrote:
On 13 June 2010 10:33, <bdawes@acm.org> wrote:
Author: bemandawes Date: 2010-06-13 13:33:39 EDT (Sun, 13 Jun 2010) New Revision: 62919 URL: http://svn.boost.org/trac/boost/changeset/62919
[...]
@@ -620,7 +620,7 @@ basic_path< String, Traits > & ph ) { String str; - is >> str; + std::getline(is, str); // See ticket 3863 ph = str; return is; }
I'm not convinced this is the right approach here, since
cout << mypath << ' ' << myint;
doesn't roundtrip with
cin >> mypath >> myint;
the way I expect operator<< and operator>> to work.
I agree with you that there is a problem here.
Perhaps the portable pathname grammar could be extended to allow backslash-escaping of spaces and backslashes? That or remove the stream operators entirely and force people to choose which way they want to print the path and deal with the result as a fully-general string.
In thinking about your suggestion, and several similar approaches, I realized the problem isn't limited to filesystem::path. It pops up every time someone wants to stream string values. It should be pretty simple to write two function templates: For output: template<String> String delimit(const String& s); Returns: s wrapped in double-quote characters. Any double-quote characters in s are escaped. A similar function for input. I'll think this in more detail later today, but in the meantime perhaps someone else could flesh out this idea. Thanks, --Beman

On 06/14/2010 03:12 PM, Beman Dawes wrote:
On Sun, Jun 13, 2010 at 1:47 PM, Scott McMurray<me22.ca+boost@gmail.com> wrote:
On 13 June 2010 10:33,<bdawes@acm.org> wrote:
Author: bemandawes Date: 2010-06-13 13:33:39 EDT (Sun, 13 Jun 2010) New Revision: 62919 URL: http://svn.boost.org/trac/boost/changeset/62919
[...]
@@ -620,7 +620,7 @@ basic_path< String, Traits> & ph ) { String str; - is>> str; + std::getline(is, str); // See ticket 3863 ph = str; return is; }
I'm not convinced this is the right approach here, since
cout<< mypath<< ' '<< myint;
doesn't roundtrip with
cin>> mypath>> myint;
the way I expect operator<< and operator>> to work.
I agree with you that there is a problem here.
Perhaps the portable pathname grammar could be extended to allow backslash-escaping of spaces and backslashes? That or remove the stream operators entirely and force people to choose which way they want to print the path and deal with the result as a fully-general string.
In thinking about your suggestion, and several similar approaches, I realized the problem isn't limited to filesystem::path. It pops up every time someone wants to stream string values.
It should be pretty simple to write two function templates:
For output:
template<String> String delimit(const String& s);
Returns: s wrapped in double-quote characters. Any double-quote characters in s are escaped.
A similar function for input.
I'll think this in more detail later today, but in the meantime perhaps someone else could flesh out this idea.
+1 for removing streaming operators for path. For diagnostic purposes it should be enough to have the path::string< CharT > method. For serialization purposes, I think, it's better to have a separate header that enables support for Boost.Serialization. The escape/unescape functions (why "delimit"?) you mentioned could be useful, but only as helpers for other APIs (such as the OS API). I think, convenience.hpp is the right place for them. Just my 2 cents.

On Mon, 2010-06-14 at 16:15 +0400, Andrey Semashev wrote: [snip discussion of removal of stream operators from path]
The escape/unescape functions (why "delimit"?) you mentioned could be useful, but only as helpers for other APIs (such as the OS API). I think, convenience.hpp is the right place for them.
Are these functions specific to Boost.Filesystem? A mechanism for quoting strings (whether it is double-quoting, back-slashing, or URL-style) would seem to be generally useful, and not just specific to the filesystem library. Phil -- Phil Richards, <news@derived-software.ltd.uk>

Phil Richards wrote:
On Mon, 2010-06-14 at 16:15 +0400, Andrey Semashev wrote:
[snip discussion of removal of stream operators from path]
The escape/unescape functions (why "delimit"?) you mentioned could be useful, but only as helpers for other APIs (such as the OS API). I think, convenience.hpp is the right place for them.
Are these functions specific to Boost.Filesystem? A mechanism for quoting strings (whether it is double-quoting, back-slashing, or URL-style) would seem to be generally useful, and not just specific to the filesystem library.
+1 String.Algos? _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

On 06/14/2010 04:42 PM, Phil Richards wrote:
On Mon, 2010-06-14 at 16:15 +0400, Andrey Semashev wrote: [snip discussion of removal of stream operators from path]
The escape/unescape functions (why "delimit"?) you mentioned could be useful, but only as helpers for other APIs (such as the OS API). I think, convenience.hpp is the right place for them.
Are these functions specific to Boost.Filesystem? A mechanism for quoting strings (whether it is double-quoting, back-slashing, or URL-style) would seem to be generally useful, and not just specific to the filesystem library.
I think, path quoting rules can be OS-specific, so it seems fit to reside in Boost.Filesystem.

Andrey Semashev wrote:
On 06/14/2010 04:42 PM, Phil Richards wrote:
Are these functions specific to Boost.Filesystem? A mechanism for quoting strings (whether it is double-quoting, back-slashing, or URL-style) would seem to be generally useful, and not just specific to the filesystem library.
I think, path quoting rules can be OS-specific, so it seems fit to reside in Boost.Filesystem.
While that may be true, they could be built atop a more general offering. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

On 6/14/2010 9:14 AM, Stewart, Robert wrote:
Andrey Semashev wrote:
On 06/14/2010 04:42 PM, Phil Richards wrote:
Are these functions specific to Boost.Filesystem? A mechanism for quoting strings (whether it is double-quoting, back-slashing, or URL-style) would seem to be generally useful, and not just specific to the filesystem library.
I think, path quoting rules can be OS-specific, so it seems fit to reside in Boost.Filesystem.
While that may be true, they could be built atop a more general offering.
Don't wait for a perfect mousetrap. A merely good one will do for now. My $.02, -- Eric Niebler BoostPro Computing http://www.boostpro.com

Eric Niebler wrote:
On 6/14/2010 9:14 AM, Stewart, Robert wrote:
Andrey Semashev wrote:
I think, path quoting rules can be OS-specific, so it seems fit to reside in Boost.Filesystem.
While that may be true, they could be built atop a more general offering.
Don't wait for a perfect mousetrap. A merely good one will do for now.
Oh, sure. I didn't mean that Boost.Filesystem should wait for something available elsewhere, but that the more general mechanism would be very useful. When a good, generalized mechanism is available, the pathname quoting logic should be replaced with an invocation of it, though. _____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com IMPORTANT: The information contained in this email and/or its attachments is confidential. If you are not the intended recipient, please notify the sender immediately by reply and immediately delete this message and all its attachments. Any review, use, reproduction, disclosure or dissemination of this message or any attachment by an unintended recipient is strictly prohibited. Neither this message nor any attachment is intended as or should be construed as an offer, solicitation or recommendation to buy or sell any security or other financial instrument. Neither the sender, his or her employer nor any of their respective affiliates makes any warranties as to the completeness or accuracy of any of the information contained herein or that this message or any of its attachments is free of viruses.

On Mon, 2010-06-14 at 09:26 -0400, Eric Niebler wrote:
On 6/14/2010 9:14 AM, Stewart, Robert wrote:
Andrey Semashev wrote:
On 06/14/2010 04:42 PM, Phil Richards wrote:
Are these functions specific to Boost.Filesystem? A mechanism for quoting strings (whether it is double-quoting, back-slashing, or URL-style) would seem to be generally useful, and not just specific to the filesystem library.
I think, path quoting rules can be OS-specific, so it seems fit to reside in Boost.Filesystem. While that may be true, they could be built atop a more general offering. Don't wait for a perfect mousetrap. A merely good one will do for now.
If the three suggested quoting routines are provided (by whatever mechanism: clever, dumb; generic, or otherwise) as non-filesystem specific conversions, then filesystem can use them. The advantage of them not being in boost::filesystem is that other people will be able to find and use them too (for non-filesystem specific things). I have a feeling that there are a number of useful general-purpose utility things hidden away inside specific libraries (sometimes as detail). It would be shame to add to that tally. Phil -- Phil Richards, <news@derived-software.ltd.uk>

On Mon, Jun 14, 2010 at 11:56 AM, Phil Richards <news@derived-software.ltd.uk> wrote:
If the three suggested quoting routines are provided (by whatever mechanism: clever, dumb; generic, or otherwise) as non-filesystem specific conversions, then filesystem can use them.
The advantage of them not being in boost::filesystem is that other people will be able to find and use them too (for non-filesystem specific things).
I have a feeling that there are a number of useful general-purpose utility things hidden away inside specific libraries (sometimes as detail). It would be shame to add to that tally.
Agreed. Once I've templatized my current prototype, I'll put together some brief docs to see if folks are interested.

On Mon, Jun 14, 2010 at 8:15 AM, Andrey Semashev <andrey.semashev@gmail.com> wrote:
On 06/14/2010 03:12 PM, Beman Dawes wrote:
...
+1 for removing streaming operators for path. For diagnostic purposes it should be enough to have the path::string< CharT > method. For serialization purposes, I think, it's better to have a separate header that enables support for Boost.Serialization.
The escape/unescape functions (why "delimit"?) you mentioned could be useful, but only as helpers for other APIs (such as the OS API). I think, convenience.hpp is the right place for them.
Actually, what I'd like to propose is that path continue to have streaming operations, but that the output format be double quote delimited, with an escape character such as backslash or ampersand to escape double quotes or itself. The input format accepted would be (1) the same as the output format, thus ensuring roundtrips, or (2) if the first character is not a double quote then the first whitespace will terminate as in the past. An alternate for (2) would be to throw an exception, but that seems a bit draconian to me. Furthermore, I see the need to output and then roundtrip strings back as input, regardless of embedded whitespace, to be a common need, so would provide a general facility so a user could write: std::stringstream ss; const std::string expected("foo bar"); std::string actual; ss << delimit_string(expected); ss >> delimit_string(actual); assert(expected == actual); std::cout << delimit_string(expected); // will output "foo bar", including the double quotes I see this need as so commonplace that I wouldn't be surprised if it already exists somewhere in boost. If not, I'll put it in namespace detail for now, but it might go in the string library at some point. The actual prototype code I've written for delimit_string() takes optional arguments for the escape character and the delimiter. It needs to be templated to work with various string types. --Beman

On 14 June 2010 15:10, Beman Dawes <bdawes@acm.org> wrote:
Furthermore, I see the need to output and then roundtrip strings back as input, regardless of embedded whitespace, to be a common need
I think the best kind of << or >> would be ones that prevent "shell script injection attacks". Is there some usable lowest common denominator between the escapes used by various shells? Platform-specific output and a permissive parser might work too. Remembering the Requirement to "Be able to write portable script-style filesystem operations in modern C++" <http://www.boost.org/doc/libs/1_43_0/libs/filesystem/doc/design.htm#Requirements>, there should be some way to stringstream together something that can be safely passed to std::system. Imagine something like this: path p = "-f x"; stringstream ss; ss << delete_program << " " << p; system(ss.str().c_str()); It'd be great if it became system("rm './-f x'"); or similar, avoiding all kinds of pitfalls in the process.

On Mon, Jun 14, 2010 at 10:24 PM, Scott McMurray <me22.ca+boost@gmail.com> wrote:
On 14 June 2010 15:10, Beman Dawes <bdawes@acm.org> wrote:
Furthermore, I see the need to output and then roundtrip strings back as input, regardless of embedded whitespace, to be a common need
I think the best kind of << or >> would be ones that prevent "shell script injection attacks". Is there some usable lowest common denominator between the escapes used by various shells? Platform-specific output and a permissive parser might work too.
Remembering the Requirement to "Be able to write portable script-style filesystem operations in modern C++" <http://www.boost.org/doc/libs/1_43_0/libs/filesystem/doc/design.htm#Requirements>, there should be some way to stringstream together something that can be safely passed to std::system.
Imagine something like this:
path p = "-f x"; stringstream ss; ss << delete_program << " " << p; system(ss.str().c_str());
It'd be great if it became system("rm './-f x'"); or similar, avoiding all kinds of pitfalls in the process.
Using double quotes as delimiters works well for legitimate uses in both Windows and bash shells, is readable by humans, and avoids the need for escapes in the vast majority of all real-world path names. In your example, the call to system would become system("rm \"-f x\""), and that will be harmless for both Windows and bash shells. Because of its familiarity as an escape character, I think backslash makes a good choice as the default escape for delimited strings in general. But because backslashes appear routinely in Windows paths, I think filesystem::path should probably use a different escape character. I'm using ampersand right now, but am not wedded to it and am not even sure why it popped into my mind. Percent is another possibility. --Beman

On 15 June 2010 08:09, Beman Dawes <bdawes@acm.org> wrote:
Using double quotes as delimiters works well for legitimate uses in both Windows and bash shells, is readable by humans, and avoids the need for escapes in the vast majority of all real-world path names.
Unless someone accidentally makes a filename with a $ in it, or in bash one one with a ! in it, or in windows one with two %s in it.
In your example, the call to system would become system("rm \"-f x\""), and that will be harmless for both Windows and bash shells.
Except that in linux, the rm command sees the - and gives an argument parsing error. That's by no means the fault of the filesystem library, but it's be great if it were handled gracefully, since we have a type system and know that the coder isn't trying to specify flags. (I suppose theoretically there could be a problem in windows too, with a linux-thinking script running something like "del /test.txt" and getting "Invalid switch - "test.txt".")
participants (6)
-
Andrey Semashev
-
Beman Dawes
-
Eric Niebler
-
Phil Richards
-
Scott McMurray
-
Stewart, Robert