Shouldn't boost::filesystem be BOOST_POSIX on cygwin?

At several places in the code we find something like:

    // BOOST_POSIX or BOOST_WINDOWS specify which API to use.
    # if !defined( BOOST_WINDOWS ) && !defined( BOOST_POSIX )
    #   if defined(_WIN32) || defined(__WIN32__) || defined(WIN32) || defined(__CYGWIN__)
    #     define BOOST_WINDOWS
    #   else
    #     define BOOST_POSIX
    #   endif
    # endif

(As in, for example, boost/libs/filesystem/src/path_posix_windows.cpp.)

This sets BOOST_WINDOWS explicitly, as default, for cygwin - which is a POSIX operating system with a single root. This breaks a lot of code - if not all - of boost::filesystem. For example, the following code:

    #include <iostream>
    #include <boost/filesystem/path.hpp>
    #include <boost/filesystem/operations.hpp>

    int main()
    {
      boost::filesystem::path p = "/usr";
      if (!boost::filesystem::exists(p))
        std::cerr << p.string() << " doesn't exist according to boost::filesystem" << std::endl;
    }

indeed causes boost::filesystem::exists to return 'false' on cygwin while "/usr" definitely exists.

The reason for that is obvious and looks like a clear bug. The code of boost::filesystem::exists is as follows:

    BOOST_FILESYSTEM_DECL bool exists( const path & ph )
    {
    # ifdef BOOST_POSIX
      struct stat path_stat;
      if(::stat( ph.string().c_str(), &path_stat ) != 0)
      {
        if((errno == ENOENT) || (errno == ENOTDIR))
          return false; // stat failed because the path does not exist
        // for any other error we assume the file does exist and fall through,
        // this may not be the best policy though... (JM 20040330)
      }
      return true;
    # else
      if(::GetFileAttributesA( ph.string().c_str() ) == 0xFFFFFFFF)
      {
        UINT err = ::GetLastError();
        if((err == ERROR_FILE_NOT_FOUND) || (err == ERROR_PATH_NOT_FOUND) || (err == ERROR_INVALID_NAME))
          return false; // GetFileAttributes failed because the path does not exist
        // for any other error we assume the file does exist and fall through,
        // this may not be the best policy though... (JM 20040330)
        return true;
      }
      return true;
    # endif
    }

Therefore, with BOOST_POSIX undefined, "/usr" is passed directly to ::GetFileAttributesA() and that can obviously not work.

My question is therefore: shouldn't BOOST_POSIX be forced to be defined on cygwin?

-- Carlo Wood <carlo@alinoe.com>

PS Please note that, looking at http://www.meta-comm.com/engineering/boost-regression/developer/summary.html nobody is running the testsuite on cygwin! How broken on cygwin is boost really?
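
For readers who want to try the POSIX branch quoted above in isolation, here is a minimal sketch. The function name posix_exists is hypothetical and the code uses only std::string and stat(); it is not the Boost implementation, just the same error policy outside the library.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/stat.h>

// Hypothetical stand-alone helper mirroring the POSIX branch quoted above.
// Returns false only when stat() reports that the path is missing; any
// other stat() failure is treated as "exists", as in the quoted code.
bool posix_exists(const std::string& p)
{
    struct stat path_stat;
    if (::stat(p.c_str(), &path_stat) != 0)
    {
        if (errno == ENOENT || errno == ENOTDIR)
            return false;   // the path does not exist
    }
    return true;
}
```

On a POSIX system, assert(posix_exists("/")) should hold, while a made-up path such as "/no/such/dir/for/this/sketch" should be reported as missing.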

I brought this up a while ago. I believe Beman stated as a rationale for this behavior that he and other boost developers and users like to use Windows native paths in Cygwin, using Cygwin as an environment for Windows software development. You can define BOOST_POSIX when compiling to force POSIX path handling. Arguably, POSIX should be the default on Cygwin, since it is designed to be a POSIX emulation environment, and many users surely use it as such. -- Jeremy Maitin-Shepard

Jeremy Maitin-Shepard <jbms@attbi.com> writes:
I brought this up a while ago. I believe Beman stated as a rationale for this behavior that he and other boost developers and users like to use Windows native paths in Cygwin, using Cygwin as an environment for Windows software development.
I think that's a bad rationale. In point of fact most cygwin tools can handle either path format, but the native representation is indeed a POSIX one.
You can define BOOST_POSIX when compiling to force POSIX path handling. Arguably, POSIX should be the default on Cygwin, since it is designed to be a POSIX emulation environment, and many users surely use it as such.
Yeah. But IMO on cygwin boost::filesystem should handle both formats. -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

At Wednesday 2004-08-18 05:32, you wrote:
Jeremy Maitin-Shepard <jbms@attbi.com> writes:
I brought this up a while ago. I believe Beman stated as a rationale for this behavior that he and other boost developers and users like to use Windows native paths in Cygwin, using Cygwin as an environment for Windows software development.
I think that's a bad rationale. In point of fact most cygwin tools can handle either path format, but the native representation is indeed a POSIX one.
You can define BOOST_POSIX when compiling to force POSIX path handling. Arguably, POSIX should be the default on Cygwin, since it is designed to be a POSIX emulation environment, and many users surely use it as such.
Yeah. But IMO on cygwin boost::filesystem should handle both formats.
Am I correctly inferring that it should check "both" filesystems for existence, etc.? Should paths with '\' in them check ONLY the native Windows filesystem, and those with '/' only the POSIX one? What if a path has both? I see worms here, maybe only one or two, but perhaps an entire can.
-- Dave Abrahams Boost Consulting http://www.boost-consulting.com
Victor A. Wagner Jr. http://rudbek.com The five most dangerous words in the English language: "There oughta be a law"

On Wed, Aug 18, 2004 at 08:32:32AM -0400, David Abrahams wrote:
Yeah. But IMO on cygwin boost::filesystem should handle both formats.
I agree. I am currently working around the problem by implementing support for this in my own code (I don't want my code to break when boost suddenly changes its default - or when someone ELSE links my software with a boost that has been compiled with BOOST_POSIX :/).

My approach is this: when "/foo/bar" (which has a root_directory) is not complete (according to boost::filesystem::is_complete), then apparently we have a multi-root system and I prepend the cygwin installation path to it. (Only on cygwin, that is; actually I prepend something that I called a 'root_base', which is empty on GNU/Linux and "C:/cygwin" (for example) on cygwin. On windows this would be the current drive name ("C:" or "D:" etc).) If it *is* complete, then it is not touched, because then we apparently have a single-root system.

We observe that this correction is only needed when

    !fs::is_complete(ph) && ph.has_root_directory()

because when the path is complete it already works, and when it isn't and there is also no root directory, the path is relative and gets expanded correctly (namely, current_path() *does* include the 'root_base' on cygwin with BOOST_WINDOWS).

Having defined "root_base" above, the only thing that is needed is to add this test for the parameters of certain operations, like 'exists'. Thus, the boost::filesystem::exists function would be changed to prepend the root_base when the path passed to it is not complete but _does_ have a root_directory.

What do you think?

-- Carlo Wood <carlo@alinoe.com>
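
The root_base rule described above can be sketched on plain strings. Everything here is an assumption made for illustration: the helper names are hypothetical, paths are taken to be in the portable "/" form, and "complete on a multi-root system" is modelled simply as "has a drive prefix and a root directory". It does not call boost::filesystem.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// All names below are hypothetical; they only illustrate the rule
// "prepend root_base when the path has a root directory but is not complete".

// True when the path starts with a drive prefix such as "C:".
bool has_drive(const std::string& p)
{
    return p.size() >= 2
        && std::isalpha(static_cast<unsigned char>(p[0]))
        && p[1] == ':';
}

// True when the path has a root directory: "/x" or "C:/x".
bool has_root_directory(const std::string& p)
{
    const std::size_t i = has_drive(p) ? 2 : 0;
    return p.size() > i && p[i] == '/';
}

// Simplified model: on a multi-root system a path is complete only when it
// has both a drive prefix and a root directory.
bool is_complete_multi_root(const std::string& p)
{
    return has_drive(p) && has_root_directory(p);
}

// root_base: "" on GNU/Linux, e.g. "C:/cygwin" on cygwin, "C:" on windows.
std::string apply_root_base(const std::string& p, const std::string& root_base)
{
    if (!is_complete_multi_root(p) && has_root_directory(p))
        return root_base + p;   // "/usr" -> "C:/cygwin/usr"
    return p;                   // complete or relative: left untouched
}
```

With an empty root_base the function is the identity, which matches the single-root case; with "C:/cygwin" only rooted paths without a drive prefix are rewritten.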

David Abrahams <dave@boost-consulting.com> writes:
Jeremy Maitin-Shepard <jbms@attbi.com> writes:
I brought this up a while ago. I believe Beman stated as a rationale for this behavior that he and other boost developers and users like to use Windows native paths in Cygwin, using Cygwin as an environment for Windows software development.
I think that's a bad rationale. In point of fact most cygwin tools can handle either path format, but the native representation is indeed a POSIX one.
How does it handle both path formats simultaneously? Does / mean the Cygwin root, while \ means the current drive/volume root? In order to support that, there would probably need to be separate special path handling for Cygwin, so that / and \ can be treated separately. Then there is still the problem of representation in the boost.filesystem "portable" format.
[snip]
-- Jeremy Maitin-Shepard

On Wed, Aug 18, 2004 at 12:56:57PM -0400, Jeremy Maitin-Shepard wrote:
How does it handle both path formats simultaneously? Does / mean the Cygwin root, while \ means the current drive/volume root? In order to support that, there would probably need to be separate special path handling for Cygwin, so that / and \ can be treated separately. Then there is still the problem of representation in the boost.filesystem "portable" format.
No, no, no. As I explained in my previous post, what he meant (I assume) is that it should handle both types of canonical paths. That is, it should understand "/usr" AND "c:/cygwin/usr". As I pointed out, this is possible because "/usr" is not complete (but does have a root directory), while the other is. This is not ambiguous. Thus, on cygwin, boost::filesystem should treat "/usr" as "c:/cygwin/usr" when compiled with BOOST_WINDOWS (and treat "c:/cygwin/usr" as itself). When it would be compiled with BOOST_POSIX then "c:/cygwin/usr" would be an illegal path name and "/usr" would be handled internally as "/usr" (by calling the POSIX system calls of cygwin). -- Carlo Wood <carlo@alinoe.com>

Carlo Wood <carlo@alinoe.com> writes:
Thus, on cygwin, boost::filesystem should treat "/usr" as "c:/cygwin/usr" when compiled with BOOST_WINDOWS (and treat "c:/cygwin/usr" as itself).
I hope you're not considering "C:\cygwin" to be the fixed installation point of the Cygwin suite. On my two computers, one sits at "C:\Program Files\cygwin", "D:\Program Files\cygwin" on the other. Also, most Cygwin programs are fairly agnostic about which way the slashes go, so long as the backslashes are escaped properly. -- Steven E. Harris

Carlo Wood <carlo@alinoe.com> writes:
On Wed, Aug 18, 2004 at 12:56:57PM -0400, Jeremy Maitin-Shepard wrote:
How does it handle both path formats simultaneously? Does / mean the Cygwin root, while \ means the current drive/volume root? In order to support that, there would probably need to be separate special path handling for Cygwin, so that / and \ can be treated separately. Then there is still the problem of representation in the boost.filesystem "portable" format.
No, no, no. As I explained in my previous post, what he meant (I assume) is that it should handle both types of canonical paths. That is, it should understand "/usr" AND "c:/cygwin/usr". As I pointed out, this is possible because "/usr" is not complete (but does have a root directory), while the other is. This is not ambiguous.
The standard Windows path handling code in boost.filesystem treats the ("portable" format) path /usr as the Windows path "\usr", namely the usr subdirectory of the root directory of the current drive/volume.
Thus, on cygwin, boost::filesystem should treat "/usr" as "c:/cygwin/usr" when compiled with BOOST_WINDOWS (and treat "c:/cygwin/usr" as itself). When it would be compiled with BOOST_POSIX then "c:/cygwin/usr" would be an illegal path name and "/usr" would be handled internally as "/usr" (by calling the POSIX system calls of cygwin).
If it followed these semantics, there would be no reason to define BOOST_POSIX when compiling Boost for cygwin, since the only effect would be that certain paths become illegal, no new paths become legal, and all legal paths refer to the same files.

Consider, however, these alternate semantics for boost filesystem path handling on Cygwin:

When compiled with BOOST_WINDOWS:
  "/" refers to the root directory of the current drive/volume
  "letter:/" refers to the <letter> drive/volume root directory

When compiled with BOOST_POSIX:
  "/" refers to the Cygwin root directory
  "letter:/" refers to the <letter> drive/volume root directory

-- Jeremy Maitin-Shepard

On Wed, Aug 18, 2004 at 02:09:56PM -0400, Jeremy Maitin-Shepard wrote:
The standard Windows path handling code in boost.filesystem treats the ("portable" format) path /usr as the Windows path "\usr", namely the usr subdirectory of the root directory of the current drive/volume.
Sure. That is what I said. On windows, "/usr" would in effect be expanded to "letter:/usr" where <letter> is the (current) drive/volume root directory. That isn't different from what it does now. Only on cygwin the behaviour would change, and "/usr" would refer to "C:/cygwin/usr" rather than "C:/usr". If someone really wanted to specify a path that is a WINDOWS path (rather non-portable) then he has to make sure that path is complete before using it. For example, he could use "C:/Program Files" and would just access that directory, while using "/Program Files" would not work (anymore). I don't consider that a problem however. People who want to specify windows paths on cygwin might as well use complete paths - it is more important that one can ALSO use POSIX paths (like "/usr"). [...]
If it followed these semantics, there would be no reason to define BOOST_POSIX when compiling Boost for cygwin, since the only effect would be that certain paths become illegal, no new paths become legal, and all legal paths refer to the same files.
Not entirely true - the result of 'current_path()' would still change from "C:/cygwin" to "/".
Consider, however, these alternate semantics for boost filesystem path handling on Cygwin:
When compiled with BOOST_WINDOWS:
"/" refers to the root directory of the current drive/volume
"letter:/" refers to the <letter> drive/volume root directory
When compiled with BOOST_POSIX:
"/" refers to the Cygwin root directory
"letter:/" refers to the <letter> drive/volume root directory
My problem with this is that the behaviour of boost::filesystem then depends on how it was compiled; that means at least that no shared library (or dll) may be produced: it should always be a static library. Otherwise a program can work on one machine and fail on the next because the filesystem lib uses different semantics.

Moreover, relying on "/" to refer to the root directory of the current drive/volume on a multi-root system SHOULD be deprecated, because it is 'kinky', to use the term used in the documentation of boost::filesystem. A programmer should always use either complete paths or paths relative to the initial directory (the working directory at program start). Using "/" is not safe.

My proposal would be to have this behaviour, whether compiled with BOOST_WINDOWS or BOOST_POSIX:

"/" refers to the root directory on single-root systems. If the system is not a single-root system then it should throw. However, if the system is cygwin, then it should be the cygwin root.

"letter:/" refers to the <letter> drive/volume root directory on multi-root systems. If the system is not a multi-root system then it should throw *). On cygwin this would be the windows path.

*) When being used for an operation; not when handling a 'path'.

-- Carlo Wood <carlo@alinoe.com>
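
One way to read the proposal above is as a small decision table. The sketch below models only that table on plain strings; the enum, the function names, and the "C:/cygwin" installation directory are all hypothetical, and returning a Windows-form location for the cygwin case is a modelling choice rather than anything the proposal specifies.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; this models only the decision table.
enum class System { single_root, multi_root, cygwin };

// True for paths of the form "letter:/...".
bool starts_with_drive_root(const std::string& p)
{
    return p.size() >= 3
        && std::isalpha(static_cast<unsigned char>(p[0]))
        && p[1] == ':' && p[2] == '/';
}

// Returns the location an operation such as exists() would act on, or throws.
// cygwin_root is an assumed installation directory, e.g. "C:/cygwin".
std::string resolve_for_operation(const std::string& p, System sys,
                                  const std::string& cygwin_root)
{
    if (starts_with_drive_root(p))
    {
        if (sys == System::single_root)
            throw std::invalid_argument("drive path on a single-root system: " + p);
        return p;                     // multi-root and cygwin: the windows path
    }
    if (!p.empty() && p[0] == '/')
    {
        if (sys == System::multi_root)
            throw std::invalid_argument("rooted path without a drive: " + p);
        if (sys == System::cygwin)
            return cygwin_root + p;   // "/" means the cygwin root
        return p;                     // single-root system
    }
    return p;                         // relative paths are outside the proposal
}

// Small helper so the throwing cases can be checked with a plain bool.
bool rejects(const std::string& p, System sys)
{
    try { resolve_for_operation(p, sys, "C:/cygwin"); }
    catch (const std::invalid_argument&) { return true; }
    return false;
}
```

Note that the check happens when a path is used for an operation, not when the path is constructed, which matches the footnote in the proposal.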

David Abrahams wrote:
Yeah. But IMO on cygwin boost::filesystem should handle both formats.
I don't think attempting this is a good idea. If boost::filesystem on Cygwin is documented as "accepts both Cygwin and Win32 paths," I can construct a filesystem where at least one valid Win32 or Cygwin path will not be correctly accepted, and thus make a liar of the documentation. In particular, the meaning of /dir is different between Cygwin and Win32. It is valid in both. However, I do not think it is necessary to accept Win32 paths in Cygwin mode, as Cygwin file primitives are able to deal with "c:\\dir" correctly to begin with. In other words, Cygwin has intrinsic support for both types itself, without any special support from Boost. Aaron W. LaFramboise

On Wed, Aug 18, 2004 at 01:13:51PM -0500, Aaron W. LaFramboise wrote:
However, I do not think it is necessary to accept Win32 paths in Cygwin mode, as Cygwin file primitives are able to deal with "c:\\dir" correctly to begin with. In other words, Cygwin has intrinsic support for both types itself, without any special support from Boost.
In that case it should DEFINITELY be forced to be compiled with BOOST_POSIX. *Certainly* this should be the default. Hopefully this can be changed before the next release? -- Carlo Wood <carlo@alinoe.com>

At 08:32 AM 8/18/2004, David Abrahams wrote:
Jeremy Maitin-Shepard <jbms@attbi.com> writes:
I brought this up a while ago. I believe Beman stated as a rationale for this behavior that he and other boost developers and users like to use Windows native paths in Cygwin, using Cygwin as an environment for Windows software development.
I think that's a bad rationale. In point of fact most cygwin tools can handle either path format, but the native representation is indeed a POSIX one.
Boost filesystem, like cygwin, has to support both. There are significant numbers of cygwin users for both the Windows and POSIX environments.
You can define BOOST_POSIX when compiling to force POSIX path handling. Arguably, POSIX should be the default on Cygwin, since it is designed to be a POSIX emulation environment, and many users surely use it as such.
Yeah. But IMO on cygwin boost::filesystem should handle both formats.
It does. The docs were updated after the last release to make that clearer. --Beman

Beman Dawes <bdawes@acm.org> writes:
At 08:32 AM 8/18/2004, David Abrahams wrote:
Jeremy Maitin-Shepard <jbms@attbi.com> writes:
I brought this up a while ago. I believe Beman stated as a rationale for this behavior that he and other boost developers and users like to use Windows native paths in Cygwin, using Cygwin as an environment for Windows software development.
I think that's a bad rationale. In point of fact most cygwin tools can handle either path format, but the native representation is indeed a POSIX one.
Boost filesystem, like cygwin, has to support both. There are significant numbers of cygwin users for both the Windows and POSIX environments.
You can define BOOST_POSIX when compiling to force POSIX path handling. Arguably, POSIX should be the default on Cygwin, since it is designed to be a POSIX emulation environment, and many users surely use it as such.
Yeah. But IMO on cygwin boost::filesystem should handle both formats.
It does. The docs were updated after the last release to make that clearer.
That's not what I meant. It should support both formats in a single build, just like most cygwin tools do.

dave@penguin ~
$ ls c:/boost/index.htm
c:/boost/index.htm

dave@penguin ~
$ ls c:\\boost\\index.htm
c:\boost\index.htm

dave@penguin ~
$ ls /cygdrive/c/boost/index.htm
/cygdrive/c/boost/index.htm

About the only cygwin tool I've found that doesn't recognize regular windows paths is scp, and that's because the colon is already used to separate host from path:

scp index.htm shell.sf.net:/home/groups/b/bo/boost/htdocs

-- Dave Abrahams Boost Consulting http://www.boost-consulting.com
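
To make "both formats in a single build" concrete, here is a sketch that maps the three spellings from the session above onto one "letter:/..." form. The function name to_drive_form is hypothetical, the "/cygdrive" prefix is assumed to be the one in use (as in the session), and this is plain string handling, not how cygwin or boost::filesystem actually convert paths.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: maps "c:\boost", "c:/boost" and "/cygdrive/c/boost"
// to the single form "c:/boost". Assumes the "/cygdrive" mount prefix.
std::string to_drive_form(std::string p)
{
    std::replace(p.begin(), p.end(), '\\', '/');   // c:\boost -> c:/boost

    const std::string prefix = "/cygdrive/";
    const std::size_t n = prefix.size();
    if (p.compare(0, n, prefix) == 0
        && p.size() > n
        && std::isalpha(static_cast<unsigned char>(p[n]))
        && (p.size() == n + 1 || p[n + 1] == '/'))
    {
        // "/cygdrive/c/boost" -> "c:/boost"; "/cygdrive/c" -> "c:/"
        std::string rest = p.substr(n + 1);
        if (rest.empty())
            rest = "/";
        return std::string(1, p[n]) + ":" + rest;
    }
    return p;   // anything else, e.g. "/usr", is left as it is
}
```

As with the bash session, the C++ literal "c:\\boost\\index.htm" contains single backslashes once the compiler has processed the escapes, so to_drive_form sees c:\boost\index.htm.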

At 05:15 PM 8/22/2004, David Abrahams wrote:
Beman Dawes <bdawes@acm.org> writes:
Yeah. But IMO on cygwin boost::filesystem should handle both formats.
It does. The docs were updated after the last release to make that clearer.
That's not what I meant. It should support both formats in a single build, just like most cygwin tools do.
There are two issues: behavior of certain functions and path formats. The function behavior (dealing with what is a relative path) has to be selected at compile time unless we added runtime option arguments to select behavior, and every time we have looked at that it became unworkable. I guess we could provide additional functions with different names, but we already have quite a number of path decomposition and query functions. I'd like to see a compelling use case that the current path member functions can't handle. For formats, we do provide runtime selection. I think "native" already provides what is being asked for here.
dave@penguin ~ $ ls c:/boost/index.htm c:/boost/index.htm
That should work.
dave@penguin ~ $ ls c:\\boost\\index.htm c:\boost\index.htm
The double backslashes are not valid at those locations in a Windows native path. And yes, I did a test to verify that, using an XP machine. I guess we could provide a cygwin extension that allowed them, but my intuition is that would be a bad idea. Cygwin may fix the bug in the future.
dave@penguin ~ $ ls /cygdrive/c/boost/index.htm /cygdrive/c/boost/index.htm
Should work, AFAIK. --Beman

Beman Dawes <bdawes@acm.org> writes:
At 05:15 PM 8/22/2004, David Abrahams wrote:
Beman Dawes <bdawes@acm.org> writes:
Yeah. But IMO on cygwin boost::filesystem should handle both formats.
It does. The docs were updated after the last release to make that clearer.
That's not what I meant. It should support both formats in a single build, just like most cygwin tools do.
There are two issues: behavior of certain functions and path formats. The function behavior (dealing with what is a relative path) has to be selected at compile time unless we added runtime option arguments to select behavior, and every time we have looked at that it became unworkable. I guess we could provide additional functions with different names, but we already have quite a number of path decomposition and query functions. I'd like to see a compelling use case that the current path member functions can't handle.
For formats, we do provide runtime selection. I think "native" already provides what is being asked for here.
dave@penguin ~ $ ls c:/boost/index.htm c:/boost/index.htm
That should work.
dave@penguin ~ $ ls c:\\boost\\index.htm c:\boost\index.htm
The double backslashes are not valid at those locations in a Windows native path.
Those are single backslashes once the bash shell gets done with them and they become arguments to "ls". -- Dave Abrahams Boost Consulting http://www.boost-consulting.com

At 09:13 PM 8/22/2004, David Abrahams wrote:
Beman Dawes <bdawes@acm.org> writes:
At 05:15 PM 8/22/2004, David Abrahams wrote:
dave@penguin ~ $ ls c:\\boost\\index.htm c:\boost\index.htm
The double backslashes are not valid at those locations in a Windows native path.
Those are single backslashes once the bash shell gets done with them and they become arguments to "ls".
OK, so that case will work too. But I was wrong about the current code always working. It will work if BOOST_WINDOWS is defined, but not for BOOST_POSIX.

To make that a bit clearer, the current path implementation has code like:

    #if defined(BOOST_WINDOWS)
    // accept native Windows format paths
    ...
    #endif

The #if needs to be changed to:

    #if defined(BOOST_WINDOWS) || defined(__CYGWIN__)

Thanks,

--Beman

On Mon, Aug 23, 2004 at 08:13:39AM -0400, Beman Dawes wrote:
#if defined(BOOST_WINDOWS)
// accept native Windows format paths
...
#endif
The #if needs to be changed to:
#if defined(BOOST_WINDOWS) || defined(__CYGWIN__)
Just in one or a few places you mean? Otherwise you might as well just define BOOST_POSIX. -- Carlo Wood <carlo@alinoe.com>

At 11:20 AM 8/23/2004, Carlo Wood wrote:
On Mon, Aug 23, 2004 at 08:13:39AM -0400, Beman Dawes wrote:
#if defined(BOOST_WINDOWS)
// accept native Windows format paths
...
#endif
The #if needs to be changed to:
#if defined(BOOST_WINDOWS) || defined(__CYGWIN__)
Just in one or a few places you mean? Otherwise you might as well just define BOOST_POSIX.
The code given was for exposition. The actual code may vary, and might just force BOOST_WINDOWS. But there are a bunch of cases, and each will have to be looked at to make sure it should be included. --Beman

and that can obviously not work.
My question is therefore: shouldn't BOOST_POSIX be forced to be defined on cygwin?
I believe that should be the default, yes; I don't really want to make what is in effect a breaking change without Beman's permission though. However, you can work around the problem by defining BOOST_POSIX (on the command line or in boost/config/user.hpp) and rebuilding. John.
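
The override mechanism mentioned above can be sketched with stand-in macros. DEMO_POSIX and DEMO_WINDOWS are hypothetical names used so that the snippet does not clash with Boost's own macros, and making __CYGWIN__ select POSIX is the default being argued for in this thread, not what the quoted Boost code does.

```cpp
#include <cassert>
#include <string>

// DEMO_POSIX / DEMO_WINDOWS are hypothetical stand-ins for BOOST_POSIX /
// BOOST_WINDOWS. An explicit definition (for example -DDEMO_WINDOWS on the
// compiler command line) wins; otherwise __CYGWIN__ selects POSIX.
#if !defined(DEMO_WINDOWS) && !defined(DEMO_POSIX)
#  if defined(__CYGWIN__)
#    define DEMO_POSIX
#  elif defined(_WIN32) || defined(__WIN32__) || defined(WIN32)
#    define DEMO_WINDOWS
#  else
#    define DEMO_POSIX
#  endif
#endif

#if defined(DEMO_WINDOWS) && defined(DEMO_POSIX)
#  error "define at most one of DEMO_WINDOWS and DEMO_POSIX"
#endif

// Reports which API this translation unit was compiled for.
std::string selected_api()
{
#if defined(DEMO_POSIX)
    return "posix";
#else
    return "windows";
#endif
}
```

Because the choice is baked in when the library's object files are compiled, defining the macro only in a user program would not change a library that was already built the other way.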

At 11:28 AM 8/17/2004, Carlo Wood wrote:
At several places in the code we find something like:
    // BOOST_POSIX or BOOST_WINDOWS specify which API to use.
    # if !defined( BOOST_WINDOWS ) && !defined( BOOST_POSIX )
    #   if defined(_WIN32) || defined(__WIN32__) || defined(WIN32) || defined(__CYGWIN__)
    #     define BOOST_WINDOWS
    #   else
    #     define BOOST_POSIX
    #   endif
    # endif
(As in for example boost/libs/filesystem/src/path_posix_windows.cpp)
This sets BOOST_WINDOWS explicitly, as default, for cygwin - which is a POSIX operating system with a single root.
That's not really the full story. Programs compiled with the Cygwin tools (gcc compiler, libraries, etc.) are useful (and commonly used) in two different environments:

* Plain Windows (such as from the Windows command line).
* Cygwin's Linux/POSIX emulator (such as from the bash command line).

Both uses have to be supported by the filesystem library. Originally only the emulator usage was supported; we quickly heard from users that this wasn't acceptable.
This breaks a lot of code - if not all - of boost::filesystem. For example, the following code:
    #include <iostream>
    #include <boost/filesystem/path.hpp>
    #include <boost/filesystem/operations.hpp>

    int main()
    {
      boost::filesystem::path p = "/usr";
      if (!boost::filesystem::exists(p))
        std::cerr << p.string() << " doesn't exist according to boost::filesystem" << std::endl;
    }
indeed causes boost::filesystem::exists to return 'false' on cygwin while "/usr" definitely exists.
The reason for that is obvious and looks like a clear bug: The code of boost::filesystem::exists is as follows:
    BOOST_FILESYSTEM_DECL bool exists( const path & ph )
    {
    # ifdef BOOST_POSIX
      struct stat path_stat;
      if(::stat( ph.string().c_str(), &path_stat ) != 0)
      {
        if((errno == ENOENT) || (errno == ENOTDIR))
          return false; // stat failed because the path does not exist
        // for any other error we assume the file does exist and fall through,
        // this may not be the best policy though... (JM 20040330)
      }
      return true;
    # else
      if(::GetFileAttributesA( ph.string().c_str() ) == 0xFFFFFFFF)
      {
        UINT err = ::GetLastError();
        if((err == ERROR_FILE_NOT_FOUND) || (err == ERROR_PATH_NOT_FOUND) || (err == ERROR_INVALID_NAME))
          return false; // GetFileAttributes failed because the path does not exist
        // for any other error we assume the file does exist and fall through,
        // this may not be the best policy though... (JM 20040330)
        return true;
      }
      return true;
    # endif
    }
Therefore, with BOOST_POSIX undefined, "/usr" is passed directly to ::GetFileAttributesA() and that can obviously not work.
My question is therefore: shouldn't BOOST_POSIX be forced to be defined on cygwin?
Yes, if you want the Cygwin emulator behavior. No, if you want the Windows behavior. There isn't any way for the compiler to know which the user prefers.

To make that clearer, a "Note for Cygwin users" was added to the docs several months ago. See below.

--Beman

Note for Cygwin users

The library's implementation code automatically detects the current platform, and compiles the POSIX or Windows implementation code accordingly. Automatic platform detection during object library compilation can be overridden by defining BOOST_POSIX or BOOST_WINDOWS macros. With the exception of the Cygwin environment, there is usually no reason to define one of the macros, as the software development kits supplied with most compilers only support a single platform.

The Cygwin package of tools supports traditional Windows usage, but also provides an emulation layer and other tools which can be used to make Windows act as Linux (and thus POSIX), and provide the Linux look-and-feel. GCC is usually the compiler of choice in this environment, as it can be installed via the Cygwin install process. Other compilers can also use the Cygwin emulation of POSIX, at least in theory.

Those wishing to use the Cygwin POSIX emulation layer should define the BOOST_POSIX macro when compiling the Boost Filesystem Library's object-library. The macro does not need to be defined (and will have no effect if defined) for Boost Filesystem Library user programs.
participants (8)
- Aaron W. LaFramboise
- Beman Dawes
- Carlo Wood
- David Abrahams
- Jeremy Maitin-Shepard
- John Maddock
- Steven E. Harris
- Victor A. Wagner Jr.