boost.filesystem - rationale for copy_file

What is the rationale for having copy_file fail if the target file exists? This is not how any cp/copy in any shell I know of works, actually quite surprising. And with boost.filesystem there is no way to avoid this, short of rewriting the function to allow it. The other way around would have been easy: fs::exists(target) || fs::copy_file(source, target) (One option is to delete the target before doing the copy, but then it is impossible to keep the same inode as before, and f.x. for backup files this is something that you want.) -- Lgb

At Tuesday 2005-01-25 14:28, you wrote:
What is the rationale for having copy_file fail if the target file exists?
This is not how any cp/copy in any shell I know of works, actually quite surprising.
in windows (and every OS I've been on the writing team) the shell will at least _ask_ for permission to overwrite.
And with boost.filesystem there is no way to avoid this, short of rewriting the function to allow it. The other way around would have been easy: fs::exists(target) || fs::copy_file(source, target)
(One option is to delete the target before doing the copy, but then it is impossible to keep the same inode as before, and f.x. for backup files this is something that you want.)
-- Lgb
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Victor A. Wagner Jr. http://rudbek.com The five most dangerous words in the English language: "There oughta be a law"

From: "Victor A. Wagner Jr." <vawjr@rudbek.com>
At Tuesday 2005-01-25 14:28, you wrote:
What is the rationale for having copy_file fail if the target file exists?
This is not how any cp/copy in any shell I know of works, actually quite surprising.
in windows (and every OS I've been on the writing team) the shell will at least _ask_ for permission to overwrite.
*nix cp has a -i option to query you, but the default behavior is to overwrite quietly. The shells do nothing to change that. -- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;

On Thu, 27 Jan 2005 12:39:34 -0700 "Victor A. Wagner Jr." <vawjr@rudbek.com> wrote:
At Tuesday 2005-01-25 14:28, you wrote:
What is the rationale for having copy_file fail if the target file exists?
This is not how any cp/copy in any shell I know of works, actually quite surprising.
in windows (and every OS I've been on the writing team) the shell will at least _ask_ for permission to overwrite.
Hmmm. I don't do windows, so I am no help there. However, I am somewhat familiar with the unix environment, and the "cp" command under unix will copy over the existing file, without an error, warning, or prompt. If you want that behavior, you have to pass it a command line argument (usually -i). Some people may alias cp to get that behavior in their interactive shells (e.g., alias cp="cp -i"), but it is definitely not the default, and I agree that copy_file() call should not force the file to be gone before doing to copy.

Somewhere on Shadow Earth, at Thu, Jan 27, 2005 at 04:40:06PM -0500, Jody Hagins wrote: <snip>
Some people may alias cp to get that behavior in their interactive shells (e.g., alias cp="cp -i"), but it is definitely not the default, and I agree that copy_file() call should not force the file to be gone before doing to copy.
And if IIRC, for the last few years, by default, various RedHat distros had "alias cp='cp -i'" as a default in one of the shell startup files, so it is easy to believe that many folks might not realise that that was special behaviour that they had had requested on their behalf, rather than standard shell behaviour. -- Timothy Knox <mailto:tdk@thelbane.com> The one thing I've learned about freedom of expression is that you really ought to keep that sort of thing to yourself. -- Scott Adams, _I'm Not Anti-Business, I'm Anti-Idiot_

At Thursday 2005-01-27 14:40, you wrote:
On Thu, 27 Jan 2005 12:39:34 -0700 "Victor A. Wagner Jr." <vawjr@rudbek.com> wrote:
At Tuesday 2005-01-25 14:28, you wrote:
What is the rationale for having copy_file fail if the target file exists?
This is not how any cp/copy in any shell I know of works, actually quite surprising.
in windows (and every OS I've been on the writing team) the shell will at least _ask_ for permission to overwrite.
Hmmm. I don't do windows, so I am no help there. However, I am somewhat familiar with the unix environment, and the "cp" command under unix will copy over the existing file, without an error, warning, or prompt. If you want that behavior, you have to pass it a command line argument (usually -i). Some people may alias cp to get that behavior in their interactive shells (e.g., alias cp="cp -i"), but it is definitely not the default, and I agree that copy_file() call should not force the file to be gone before doing to copy.
well, I could comment on the wisdom (or lack thereof) in the way *nix chose to handle this (but I won't comment further than to say *nix was trying to solve a different problem). but I was working on building systems for embedded work and their development. We made sure you didn't accidently get rid of something.
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Victor A. Wagner Jr. http://rudbek.com The five most dangerous words in the English language: "There oughta be a law"

"Victor A. Wagner Jr." <vawjr@rudbek.com> writes: | well, I could comment on the wisdom (or lack thereof) in the way *nix | chose to handle this (but I won't comment further than to say *nix was | trying to solve a different problem). | but I was working on building systems for embedded work and their | development. | We made sure you didn't accidently get rid of something. No you didn't :-) I am sure you read every pop-up or question asked by programs you use all the time. But I'll bet that most people just click "Ok" or press 'y'. Anyway the problem with copy_file is that the requirement on having target non-existant is not possible to turn off. -- Lgb

On Fri, Jan 28, 2005 at 10:17:54AM +0100, Lars Gullik Bj?nnes wrote:
Anyway the problem with copy_file is that the requirement on having target non-existant is not possible to turn off.
Sounds like there should be a copy_file() overload (or a default argument) with a "no clobber" flag saying whether or not to overwrite. This would be analoguous to cp(1)'s -f switch and sh(1)'s >| redirection operator, which give you the option of clobbering an existing file even when in safe-by-default mode (i.e. using "cp -i" or "set -o noclobber") Looking at the implementation of copy_file() I'm wondering why it isn't done like this (this provides noclobber support too): std::fstream out; if (noclobber) { out.open(to_file_ph.string().c_str(), std::ios::in); if (out.is_open()) { // file exists boost::throw_exception( filesystem_error( "boost::filesystem::copy_file", from_file_ph, to_file_ph, fs::detail::system_error_code() ) ); } } std::ifstream in(from_file_ph.string().c_str()); out.open(to_file_ph.string().c_str(), std::ios::out); out << in.rdbuf(); It may not be blindingly fast on all systems (though there's no reason it can't be, with a good filebuf) but it's a lot simpler than a loop with read(2)/write(2) and file descriptors that have to be closed manually, and would work on all platforms. jon -- "In times like these, it helps to recall that there have always been times like these." - Paul Harvey

Jonathan Wakely <cow@compsoc.man.ac.uk> writes: | Looking at the implementation of copy_file() I'm wondering why it isn't | done like this (this provides noclobber support too):
| std::fstream out; | if (noclobber) | { | out.open(to_file_ph.string().c_str(), std::ios::in); wouldn't a exists work equally well? (a bit less heavyweight perhaps) | if (out.is_open()) | { | // file exists | boost::throw_exception( filesystem_error( | "boost::filesystem::copy_file", | from_file_ph, to_file_ph, fs::detail::system_error_code() ) ); | } | } | std::ifstream in(from_file_ph.string().c_str()); | out.open(to_file_ph.string().c_str(), std::ios::out); Add std::ios::trunc here and we are in violent agreement. | out << in.rdbuf();
| It may not be blindingly fast on all systems (though there's no reason | it can't be, with a good filebuf) Might be that the code could work directly on filebuf as well. -- Lgb

On Fri, Jan 28, 2005 at 10:58:24AM +0000, Jonathan Wakely wrote:
Looking at the implementation of copy_file() I'm wondering why it isn't done like this (this provides noclobber support too):
std::fstream out; if (noclobber) { out.open(to_file_ph.string().c_str(), std::ios::in); if (out.is_open()) { // file exists boost::throw_exception( filesystem_error( "boost::filesystem::copy_file", from_file_ph, to_file_ph, fs::detail::system_error_code() ) ); } } std::ifstream in(from_file_ph.string().c_str()); out.open(to_file_ph.string().c_str(), std::ios::out); out << in.rdbuf();
Hmm, this has a race-condition, which is a good reason to stick with open(2) and O_CREAT|O_EXCL and read(2)/write(2). However, in that case there's a comment about the implementation: // TODO: Ask POSIX experts if this is the best way to copy a file Now, I'm no POSIX expert, but I believe the loop used: ssize_t sz; while ( (sz = ::read( infile, buf.get(), buf_sz )) > 0 && (sz = ::write( outfile, buf.get(), sz )) > 0 ) {} can result in lost data if write(2) only does a partial write, which might happen if the write is interrupted by a signal. I think it should loop on the writes too, until write() has returned sz in total. jon -- Message terminated with signal 11, SIGFAULT

Jonathan Wakely <cow@compsoc.man.ac.uk> writes: | On Fri, Jan 28, 2005 at 10:58:24AM +0000, Jonathan Wakely wrote:
Looking at the implementation of copy_file() I'm wondering why it isn't done like this (this provides noclobber support too):
std::fstream out; if (noclobber) { out.open(to_file_ph.string().c_str(), std::ios::in); if (out.is_open()) { // file exists boost::throw_exception( filesystem_error( "boost::filesystem::copy_file", from_file_ph, to_file_ph, fs::detail::system_error_code() ) ); } } std::ifstream in(from_file_ph.string().c_str()); out.open(to_file_ph.string().c_str(), std::ios::out); out << in.rdbuf();
| Hmm, this has a race-condition, In that the target can come into existance after checked for? lots and lots of races like that possible when working with filesystems. | which is a good reason to stick with | open(2) and O_CREAT|O_EXCL and read(2)/write(2).
| However, in that case there's a comment about the implementation:
| // TODO: Ask POSIX experts if this is the best way to copy a file
| Now, I'm no POSIX expert, but I believe the loop used: Well, you can expect the implementators of your stdlib to be experts on the system it is implemented for. And that speaks in favour of writing a C++ solution. Pity that there is no ios::nocreate in the standard. And in the case where noclobber is false there is no race. | ssize_t sz; | while ( (sz = ::read( infile, buf.get(), buf_sz )) > 0 | && (sz = ::write( outfile, buf.get(), sz )) > 0 ) {} exactly the kindo thing that has been solved of the implementators for the out << in.rdbuf(); case. -- Lgb

At 09:55 AM 1/28/2005, Lars Gullik Bjønnes wrote:
Jonathan Wakely <cow@compsoc.man.ac.uk> writes: ... | ssize_t sz; | while ( (sz = ::read( infile, buf.get(), buf_sz )) > 0 | && (sz = ::write( outfile, buf.get(), sz )) > 0 ) {}
exactly the kindo thing that has been solved of the implementators for the out << in.rdbuf(); case.
It is really up to the implementor. I'm happy with the current code, modulo any needed bug fixes. It seems too heavy-weight to use iostreams for such a simple problem. --Beman

Beman Dawes <bdawes@acm.org> writes: | At 09:55 AM 1/28/2005, Lars Gullik Bjønnes wrote: | >Jonathan Wakely <cow@compsoc.man.ac.uk> writes: | >... | >| ssize_t sz; | >| while ( (sz = ::read( infile, buf.get(), buf_sz )) > 0 | >| && (sz = ::write( outfile, buf.get(), sz )) > 0 ) {} | > | >exactly the kindo thing that has been solved of the implementators for | >the | > out << in.rdbuf(); | >case.
| It is really up to the implementor. Exactly and by exploiting the implementors knowledge you don't need comments like "// TODO: Ask POSIX experts if this is the best way to copy a file" | I'm happy with the current code, | modulo any needed bug fixes. It seems too heavy-weight to use | iostreams for such a simple problem. If you look at a strace you will see that iostreams is not really heavy-weight for such a simple problem... and if it is it is a lousy iostreams implementation. But as long as I am allowed to overwrite the target and have it use O_TRUNC then I am happy. -- Lgb

At 07:30 AM 1/28/2005, Jonathan Wakely wrote:
However, in that case there's a comment about the implementation:
// TODO: Ask POSIX experts if this is the best way to copy a file
Now, I'm no POSIX expert, but I believe the loop used:
ssize_t sz; while ( (sz = ::read( infile, buf.get(), buf_sz )) > 0 && (sz = ::write( outfile, buf.get(), sz )) > 0 ) {}
can result in lost data if write(2) only does a partial write, which might happen if the write is interrupted by a signal. I think it should loop on the writes too, until write() has returned sz in total.
Hum... Reading the POSIX spec, it looks like you are right. ::write() will also only do a partial write if the process runs out of disk space, for example. Thanks, --Beman

At 04:28 PM 1/25/2005, Lars Gullik Bjønnes wrote:
What is the rationale for having copy_file fail if the target file exists?
Better safe than sorry. I guess that design approach comes from my experience that in filesystem manipulations early detection of errors pays big dividends, even though you have to write more pedantic code.
This is not how any cp/copy in any shell I know of works, actually quite surprising.
The filesystem library isn't a shell. Even if it was, I think the modern trend is in favor of more rather than less security.
And with boost.filesystem there is no way to avoid this, short of rewriting the function to allow it. The other way around would have been easy: fs::exists(target) || fs::copy_file(source, target)
(One option is to delete the target before doing the copy, but then it is impossible to keep the same inode as before, and f.x. for backup files this is something that you want.)
Boost filesystem can't make guarantees about inodes; that isn't something that would be portable across various operating systems. Early in the development of the filesystem library, we tried providing a lot of shell-like options for various operations functions. That really didn't work; it is hard to choose option feature-sets that satisfy a majority of users. I'm not totally opposed to modifying the copy_file function in one way or another, but I'd want to hear stronger arguments that "shells do it a certain way." --Beman

Beman Dawes <bdawes@acm.org> writes: | At 04:28 PM 1/25/2005, Lars Gullik Bjønnes wrote: | > | >What is the rationale for having copy_file fail if the target file | >exists?
| Better safe than sorry. I guess that design approach comes from my | experience that in filesystem manipulations early detection of errors | pays big dividends, even though you have to write more pedantic | code. Sure let the default be that target is not overwritten, but removing the option... | >This is not how any cp/copy in any shell I know of works, actually | >quite surprising.
| The filesystem library isn't a shell. Even if it was, I think the | modern trend is in favor of more rather than less security. In that sense security is in the eye of the beholder... at least I do not find a system that ask nagging questions more secure that one that doesn't. Having the possiblity of ressurecting a file that was accidentally deleted otoh... | >And with boost.filesystem there is no way to avoid this, short of | >rewriting the function to allow it. The other way around would have | >been easy: fs::exists(target) || fs::copy_file(source, target) | > | >(One option is to delete the target before doing the copy, but then it | >is impossible to keep the same inode as before, and f.x. for backup | >files this is something that you want.)
| Boost filesystem can't make guarantees about inodes; that isn't | something that would be portable across various operating systems. So therefore why not make it impossible with boost.filesystem? and by forcing a remove(<file>) && copy_file(<target>, <file) it opens up for the race mentioned in another comment in this thread. | Early in the development of the filesystem library, we tried providing | a lot of shell-like options for various operations functions. That | really didn't work; it is hard to choose option feature-sets that | satisfy a majority of users.
| I'm not totally opposed to modifying the copy_file function in one way | or another, but I'd want to hear stronger arguments that "shells do it | a certain way." The argument is not "shells do it that way". But "cp" does it that way. "cp" is not a shell it is a program specialized for copying files. Also that by not allowing overwrite of existing files it opens up for a race. -- Lgb

"Lars Gullik Bjønnes" wrote:
The argument is not "shells do it that way". But "cp" does it that way. "cp" is not a shell it is a program specialized for copying files. Also that by not allowing overwrite of existing files it opens up for a race.
Not only a race. Delete/recreate loses all metainformation attached to the original file - inode, attributes, permissions, ACLs, alternate forks/streams; it is also possible that the program doesn't have permission to delete+recreate, but does have permission to overwrite this specific file. Of course one can use fopen and friends instead. ;-)

larsbj@gullik.net (Lars Gullik Bjønnes) writes: | The argument is not "shells do it that way". But "cp" does it that | way. "cp" is not a shell it is a program specialized for copying | files. Also that by not allowing overwrite of existing files it opens | up for a race. btw. There already is a race in the copy_file code: between that stat of the source file and the opening of the source file. This can be avoided by using fstat instead of stat. Feeble attempt at solving that prob and makeing clobbering optional: void copy_file(string const & source, string const & target, bool noclobber = true) { int const infile = ::open(source.c_str(), O_RDONLY); if (infile == -1) { throw string("boost::filesystem::copy_file"); } struct stat source_stat; int const ret = ::fstat(infile, &source_stat); if (ret == -1) { ::close(infile); throw string("boost::filesystem::copy_file"); } int const flags = O_WRONLY | O_CREAT | (noclobber ? O_EXCL : O_TRUNC); int const outfile = ::open(target.c_str(), flags, source_stat.st_mode); if (outfile == -1) { ::close(infile); throw string("boost::filesystem::copy_file"); } std::size_t const buf_sz = 32768; char buf[buf_sz]; ssize_t in = -1; ssize_t out = -1; while (true) { in = ::read(infile, buf, buf_sz); if (in == -1) { break; } else if (in == 0) { break; } else { out = ::write(outfile, buf, in); if (out == -1) { break; } } } ::close(infile); ::close(outfile); if (in == -1 || out == -1) throw string("boost::filesystem::copy_file"); } -- Lgb
participants (8)
-
Beman Dawes
-
Jody Hagins
-
Jonathan Wakely
-
larsbj@gullik.net
-
Peter Dimov
-
Rob Stewart
-
Timothy Knox
-
Victor A. Wagner Jr.