
Stefan Seefeld wrote:
Patrick Bennett wrote:
It should be char* (and std::string) UTF-8 strings throughout for all platforms - passing as-is for platforms like Linux, and converting to/from UCS-2 on Windows. I can't speak for other platforms as I'm most familiar with Windows and Linux.
Isn't it abusive to force utf-8 into a std::string ?
Abuse is a relative term here. ;)
While it is technically possible, the semantics aren't quite the same. operator[](size_t i) wouldn't return the i'th character any more, at least not for characters outside the ASCII range.
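To make the operator[] point concrete, here is a minimal sketch, assuming the std::string holds well-formed UTF-8. The helper name count_code_points is made up for illustration and is not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a UTF-8 string by skipping
// continuation bytes (bit pattern 10xxxxxx). Assumes well-formed input.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++n;  // lead byte or plain ASCII byte starts a new code point
        }
    }
    return n;
}
```

For "caf\xC3\xA9" (the word "café" encoded as UTF-8), size() reports 5 bytes while the helper reports 4 code points, and s[3] yields the lead byte 0xC3 rather than the fourth character.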
Correct (kind of), but I'd far prefer that std::string be used than for some completely new type to be defined. For users of boost::filesystem, I can't personally think of a time when a user would need to iterate over paths or file names a character at a time. Because of UTF-8's nature, even if a user were to search for something like '/', it would still work with find(), operator[], etc.

UTF-8 maps to std::string extremely well. I think there is also a fair amount of precedent for using UTF-8 internally with std::string as the storage mechanism. UTF-8 strings don't contain embedded NULs (though std::string would cope with them anyway), ASCII characters remain ASCII characters, and you can tell whether you're in the middle of a multi-byte sequence.

Since we're talking about filesystem's inability to be used with internationalized applications, and you don't think UTF-8/std::string is the way to do it, what is your recommendation?

Cheers...
Patrick Bennett
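The UTF-8 properties mentioned above can be sketched in a few lines of C++. This is only an illustration under the assumption that paths are stored as well-formed UTF-8 in std::string; the names in_middle_of_sequence and split_path are hypothetical and not part of boost::filesystem.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if byte i is a UTF-8 continuation byte
// (10xxxxxx), i.e. index i is inside a multi-byte sequence.
bool in_middle_of_sequence(const std::string& s, std::size_t i) {
    return (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80;
}

// Hypothetical helper: splits a path on '/' byte by byte. This is safe for
// UTF-8 because every byte of a multi-byte sequence has its high bit set,
// so none of them can compare equal to '/' (0x2F).
std::vector<std::string> split_path(const std::string& path) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = path.find('/', start);
        if (pos == std::string::npos) {
            parts.push_back(path.substr(start));  // last component
            break;
        }
        parts.push_back(path.substr(start, pos - start));
        start = pos + 1;
    }
    return parts;
}
```

With a path such as "d\xC3\xA9p\xC3\xB4t/\xE6\x97\xA5.txt", a plain find('/') locates the separator at byte offset 7 and the split yields two components, without any decoding step.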