Re: [boost] Windows invalid chars in file names

Bronek Kozicki wrote:
Yes and no - there are platform-specific limitations that apply to any filesystem attached to the machine you run your program on, and there are also filesystem-specific limitations. I have nothing against the library enforcing platform-specific limitations; however, it should not try to enforce limitations that do not always apply. Instead it should just go and try to create the file, and report an error (presumably via an exception) if that fails.
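
A minimal sketch of the "just try it" approach described above, assuming nothing beyond the C++ standard library. The helper name create_file_or_throw is hypothetical; std::ofstream does not throw by default, so the failure is turned into an exception by hand.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of boost::filesystem.
// No character checks are made up front: the operating system and the
// filesystem that actually holds the target directory decide whether the
// name is acceptable.
void create_file_or_throw(const std::string& path)
{
    std::ofstream out(path.c_str());
    if (!out) {
        // The stream only tells us that opening failed, not why.
        throw std::runtime_error("cannot create file: " + path);
    }
}
```

The caller learns that creation failed, but not whether the cause was an illegal character, a missing directory, or a permissions problem.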
It appears to me that the issue in our code is going to be solved by using the boost portability checker rather than the boost native checker. However, your statements in the last mail show some (in my opinion) misunderstandings, and the discussion suggests that the boost filesystem only goes halfway towards fulfilling its goals.

To start off, one could understand from your mail that the reason question marks or asterisks are not allowed is that they are problematic for the filesystem but not for the operating system. Looking at the invalid set of characters in your URL -- "< > : " / \ |" it becomes quite clear that the logic behind these is that they are reserved tokens in the command shell. But, in fact, so is "question mark" or "asterisk." The NTFS filesystem itself probably could not care less whether the filename has a question mark, but certain operating system utilities such as the command shell, the file search dialog, or FindFirstFile and related functions would not operate properly if it were allowed.

Furthermore, to quote one of the basic requirements of the boost::filesystem library as listed in http://boost.org/libs/filesystem/doc/design.htm:

"Interface smoothly with current C++ Standard Library input/output facilities. For example, paths should work in std::basic_fstream constructors.

"Rationale: One of the most common uses of file system functionality is to manipulate paths for eventual use in input/output operations. Thus the need to interface smoothly with standard library I/O."

I can't see how "work" fits in with a constructor throwing an exception because the path is invalid. Naturally, if no checker was used to construct the boost::filesystem::path object, then the library's safeguards against this have been bypassed. But if the user provides the native checker, why shouldn't the library identify a problem such as an asterisk or question mark inappropriately placed in the path?
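
To make the kind of check being asked for concrete, here is a minimal sketch, assuming the reserved set is the list quoted above plus '?' and '*'. The function name is hypothetical and this is not the boost::filesystem native checker; it only looks at a single name element, not a whole path.

```cpp
#include <string>

// Hypothetical checker, not part of boost::filesystem.
// Returns true if `name` is non-empty and contains none of the characters
// from the quoted list ( < > : " / \ | ) nor the wildcards '?' and '*'.
bool name_has_no_reserved_chars(const std::string& name)
{
    static const std::string reserved = "<>:\"/\\|?*";
    return !name.empty()
        && name.find_first_of(reserved) == std::string::npos;
}
```

A check like this can run before any path object reaches an fstream constructor, which is the point at which the problem would otherwise surface.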
Furthermore, the library identifies the need to work within the reality that "Some operating systems allow file systems with different characteristics to be "mounted" within a directory tree. Thus a ISO-9660 or Windows file system may end up as a sub-tree of a POSIX directory tree." If an operating system allows multiple filesystems, the boost filesystem should identify and allow for this reality. In an ideal scenario, the boost filesystem "native" checker will:

A) identify whether the path is legal for the operating system
B) identify which filesystem the path references
C) identify whether the path is legal for the referenced filesystem

Fulfilling (A) alone requires the user to provide the functionality of (B) and (C) himself. Simply hoping for an exception in the constructor of fstream falls short of the stated goal of smooth interfacing with the standard I/O library.

Various scenarios can also be produced that show the shortcomings of such a method. For example, take a mail system with a GUI that stores mail in directories or files named after the user. The GUI, when asked to create a new user, should not permit the creation of a username that is not valid in the filesystem. Only once the user presses OK should the filesystem be expected to create the user file and, if it already exists, ask whether the user directory should be overwritten. This is a very common pattern, used in all sorts of applications, whether the name is a Firefox profile, an IDE project name, or a mail directory or inbox file. The filesystem library as it exists now requires each of these applications to write its own logic to handle (B) and (C) above, whereas this should be expected of a purported all-purpose "standard" filesystem library.

Yitzhak Sapir
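
The three steps (A), (B) and (C) above can be sketched as follows. Everything here is hypothetical: NameChecks, Verdict and check_name are illustrative names, not boost::filesystem API, and the three checks are supplied by the caller because the sketch makes no assumption about how a real library would detect the filesystem behind a directory.

```cpp
#include <functional>
#include <string>

// Hypothetical bundle of the three checks; none of this is boost API.
struct NameChecks {
    // (A) is the name legal for the operating system?
    std::function<bool(const std::string& name)> legal_for_os;
    // (B) which filesystem does this directory live on? Returns a tag.
    std::function<std::string(const std::string& dir)> filesystem_of;
    // (C) is the name legal on the filesystem with that tag?
    std::function<bool(const std::string& fs,
                       const std::string& name)> legal_for_fs;
};

enum class Verdict { Ok, RejectedByOs, RejectedByFilesystem };

// Runs (A), then (B), then (C), and reports which stage rejected the name.
// Nothing is created on disk; this is the check a GUI could run before
// the user presses OK.
Verdict check_name(const NameChecks& checks,
                   const std::string& dir,
                   const std::string& name)
{
    if (!checks.legal_for_os(name)) {
        return Verdict::RejectedByOs;
    }
    const std::string fs = checks.filesystem_of(dir);
    if (!checks.legal_for_fs(fs, name)) {
        return Verdict::RejectedByFilesystem;
    }
    return Verdict::Ok;
}
```

Even with such a check in place, the later creation step can still fail, so the error path of the actual file operation has to be handled as well.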

--
Martin Bonner
Pi Technology, Milton Hall, Ely Road, Milton, Cambridge, CB4 6WZ
+44 1223 203894

________________________________
From: boost-bounces@lists.boost.org on behalf of Yitzhak Sapir

Looking at the invalid set of characters in your [Bronek Kozicki's] URL -- "< > : " / \ |" it becomes quite clear that the logic behind these is that they are reserved tokens in the command shell. But, in fact, so is "question mark" or "asterisk."

a) ':', '/', and '\' are separators in a path (nothing to do with the command shell).

b) ? and * are /not/ reserved tokens in the command shell. It is a continuing pain to authors of Windows command-line tools that the command shell does not expand wildcards - that has to be done by each individual tool. Similarly, tokenizing arguments is not done by the command shell but by each tool, and forbidding " in filenames makes that a lot easier.

I don't see the logic behind >, <, and |, but I wouldn't assume it is the command shell that forces that.
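
As an illustration of point (b), this is roughly the matching a tool has to do for itself when the shell passes a pattern such as *.txt through unexpanded. It is a minimal sketch with a hypothetical function name; it compares case-sensitively and matches one name against one pattern, leaving directory enumeration out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if `name` matches `pattern`, where '*' matches
// any run of characters (including none) and '?' matches exactly one.
bool wildcard_match(const std::string& pattern, const std::string& name)
{
    std::size_t p = 0;                      // index into pattern
    std::size_t n = 0;                      // index into name
    std::size_t star = std::string::npos;   // last '*' seen in pattern
    std::size_t mark = 0;                   // name index that '*' resumes from

    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;                     // remember the '*', match nothing yet
            mark = n;
        } else if (p < pattern.size()
                   && (pattern[p] == '?' || pattern[p] == name[n])) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            p = star + 1;                   // backtrack: let '*' absorb one more
            n = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') {
        ++p;                                // trailing '*' may match nothing
    }
    return p == pattern.size();
}
```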
participants (2): Martin Bonner, Yitzhak Sapir