boost filesystems and case issues
I've looked through the documentation and Googled, but I cannot find any info on the topic. Say I want to find a file named 'System.dll', but on some systems the file is named 'system.DLL' or 'SYSTEM.DLL', etc. It would help if I could do something like this:
if (boost::filesystem::is_regular(itr->path())) { std::cout << itr->path().lower() << std::endl; }
Notice the .lower(). That way each path would be lowercase, and I could search for just 'system.dll' and find any occurrence of that filename, no matter the case.
I wonder how others do this today? Do you use transform or tolower from another library? Thank you
On Tuesday 10 June 2008 15:56:05 brad@16systems.com wrote:
I've looked through the documentation and Googled, but I cannot find any info on the topic. If I want to find a file named 'System.dll', but on some systems the file is named 'system.DLL' or 'SYSTEM.DLL', etc. If I could do something like this:
if (boost::filesystem::is_regular(itr->path())) { std::cout << itr->path().lower() << std::endl; }
Notice the .lower() In this way, each path would be lowercase and I could search for only 'system.dll' and .lower() would find any occurrence of that filename, no matter the case.
I wonder how others do this today? Do you use transform or tolower from another library?
I don't understand the question. Can't you use std::tolower (from <cctype>)? Converting both strings to a common case is an acceptable way to compare them case-insensitively. It becomes a problem in locales where this conversion is not bidirectional: as I understand it, Greek has two different lowercase characters that share the same uppercase character, so the std::tolower approach would give false negatives in some situations.
-- Mihai RUSU Email: dizzy@roedu.net "Linux is obsolete" -- AST
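For illustration, the common-case comparison described above could be sketched as follows. This is only a minimal sketch, not something posted in the thread: the helper names lower_ascii and same_name_ignore_case are hypothetical, and it assumes the file names are plain ASCII held in a std::string, so it does nothing about the locale problems mentioned above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: return a copy of s with every char passed through
// std::tolower. The unsigned char parameter avoids undefined behaviour
// for negative char values. Only dependable for ASCII names.
std::string lower_ascii(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return s;
}

// Hypothetical helper: true if two file names are equal once both
// have been converted to lower case.
bool same_name_ignore_case(const std::string& a, const std::string& b) {
    return lower_ascii(a) == lower_ascii(b);
}
```

Inside the directory loop from the question, one would take the file name part of each path as a std::string and call same_name_ignore_case(name, "system.dll"). Boost's string algorithms library also provides boost::algorithm::to_lower_copy, which may be a closer fit if a std::locale needs to be passed.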
On Tue, Jun 10, 2008 at 05:38:33PM +0300, dizzy wrote:
To compare strings in case insensitive ways using a common case is an acceptable solution. This is a problem when using locales for which this conversion is not bidirectional (I understand in Greek there are 2 different lower case characters that uppercase is the same character thus using the std::tolower approach you would get false negatives in some situations).
Didn't know this :-) Now I remember that even German has "ß", which has no capital form. Some people use "SS" or "SZ" or even "ß" for it. I don't know what toupper("ß") returns. So converting a string containing "ß" to upper case could result in some false negatives too :-) Nevertheless, I'm sure that's just a funny example. In practice it will not cause harm.
Jens
On Tue, Jun 10, 2008 at 04:45:21PM +0200, Jens Seidel wrote:
Didn't know this :-) Now I remember that even for German there exists "ß" which has no capitalisation. Some people use "SS" or "SZ" or even "ß" for it. Don't know what toupper("ß") returns.
So converting a string containing "ß" to upper case could result in some false negatives too :-) Nevertheless I'm sure that's just a funny example. In practice it will not cause harm.
Actually, the newly released Unicode 5.1 standard adds a codepoint U+1E9E LATIN CAPITAL LETTER SHARP S. Under the default Unicode casing rules, lowercasing 'capital sharp s' gives 'small sharp s', but uppercasing 'small sharp s' gives 'SS', so the two do not round-trip. There are, however, tailored special cases that can be used to produce the expected round trip. In general, it's unwise to assume that lower/upper casing is non-destructive, especially in the Unicode world of today.
-- Lars Viklund | zao@acc.umu.se | 070-310 47 07
participants (4): brad@16systems.com, dizzy, Jens Seidel, Lars Viklund