[interprocess] utf8 file names in memory mapped files

Hi Ion,
It would be great if we could support utf8 encoded file names on Windows too. I guess all it requires is a utility function that converts from utf8 to utf16 on Windows. Such a function is available in the Windows API, and I can supply it.
Kind regards,
Thorsten
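For readers who want to see what such a utility does, here is a minimal portable sketch of a utf8 to utf16 conversion in standard C++. It is an illustration only, not the function Thorsten offered and not part of Interprocess; the name utf8_to_utf16 is hypothetical. On Windows the same job is normally done by the Win32 call MultiByteToWideChar with the CP_UTF8 code page.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from Boost or the Windows API):
// decodes a UTF-8 byte string into UTF-16 code units.
// Throws std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes;
    // used to reject overlong encodings.
    static const std::uint32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, UTF-16 surrogates and out-of-range values.
        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

On Windows, wchar_t is 16 bits wide, so the resulting code units could be copied into a std::wstring before being handed to a wide-character API.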

Hi Ion,
It would be great if we could support utf8 encoded file names on Windows too.
I guess all it requires is a utility function that converts from utf8 to utf16 on Windows. Such a function is available in the Windows API, and I can supply it.
There is a function in Boost.Locale (header only) that converts from any UTF encoding to any other:
boost::locale::conv::utf_to_utf<wchar_t>(utf8_string)
boost::locale::conv::utf_to_utf<char>(utfXX_string)
See: http://www.boost.org/doc/libs/1_48_0/libs/locale/doc/html/group__codepage.ht...
+1 for the proposal.
Artyom

On 30/01/2012 14:36, Thorsten Ottosen wrote:
It would be great if we could support utf8 encoded file names on Windows too.
I guess all it requires is a utility function that converts from utf8 to utf16 on Windows. Such a function is available in the Windows API, and I can supply it.
No, I need to change all the internals and the Windows API calls to use wide strings instead of narrow ones. I guess I could use boost::filesystem::path or similar instead of raw strings (after all, named objects use something similar to paths), but I also need to stay backwards compatible and write maintainable code, while also supporting wide strings on Unix platforms. Suggestions welcome.
Ion

On 31-01-2012 00:06, Ion Gaztañaga wrote:
On 30/01/2012 14:36, Thorsten Ottosen wrote:
It would be great if we could support utf8 encoded file names on Windows too.
I guess all it requires is a utility function that converts from utf8 to utf16 on Windows. Such a function is available in the Windows API, and I can supply it.
No, I need to change all the internals and the Windows API calls to use wide strings instead of narrow ones. I guess I could use boost::filesystem::path or similar instead of raw strings (after all, named objects use something similar to paths), but I also need to stay backwards compatible and write maintainable code, while also supporting wide strings on Unix platforms. Suggestions welcome.
Lately I have been porting our 32-bit Windows apps to 64-bit Linux. The single most time-consuming (and annoying) problem has been dealing with portable file names. I don't see any particular reason for supporting wide strings in either Boost.Filesystem or Interprocess. If everything were in "utf8-mode", we could just use std::string. Of course, both libraries need to do custom stuff under the hood (e.g. converting to utf16 on Windows before calling the Windows API).
-Thorsten
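The "custom stuff under the hood" could look roughly like the sketch below: one narrow-string entry point whose argument is documented as utf8, with the conversion to utf16 confined to the Windows branch. This is only an illustration under stated assumptions. The name open_utf8 is hypothetical and is not an Interprocess or Filesystem function, and it opens a plain C FILE rather than a file mapping to keep the example short. MultiByteToWideChar and _wfopen are the real Win32/CRT calls.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#ifdef _WIN32
#  include <wchar.h>
#  include <windows.h>
#endif

// Hypothetical wrapper: `utf8_name` is documented to be UTF-8 on every platform.
// Returns nullptr if the name cannot be converted or the file cannot be opened.
std::FILE* open_utf8(const std::string& utf8_name, const char* mode) {
#ifdef _WIN32
    // Windows: widen to UTF-16 and call the wide-character CRT function.
    auto widen = [](const std::string& s) -> std::wstring {
        if (s.empty()) return std::wstring();
        const int len = static_cast<int>(s.size());
        // First call asks for the required number of UTF-16 code units.
        const int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                          s.data(), len, nullptr, 0);
        if (n <= 0) return std::wstring();  // invalid UTF-8
        std::wstring w(static_cast<std::size_t>(n), L'\0');
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            s.data(), len, &w[0], n);
        return w;
    };
    const std::wstring wname = widen(utf8_name);
    const std::wstring wmode = widen(mode);
    if (wname.empty() || wmode.empty()) return nullptr;
    return _wfopen(wname.c_str(), wmode.c_str());
#else
    // Elsewhere: file names are byte strings, so the UTF-8 bytes
    // are passed through unchanged.
    return std::fopen(utf8_name.c_str(), mode);
#endif
}
```

A caller would write open_utf8("caf\xC3\xA9.txt", "r") on either platform; only the implementation differs.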

On 31/01/2012 11:05, Thorsten Ottosen wrote:
I don't see any particular reason for supporting wide strings in either Boost.Filesystem or Interprocess. If everything were in "utf8-mode", we could just use std::string. Of course, both libraries need to do custom stuff under the hood (e.g. converting to utf16 on Windows before calling the Windows API).
I'm really ignorant about encoding and Unicode issues, but I guess Windows users don't expect "char *" or std::string to be utf-8 encoded.
Best,
Ion

On 31-01-2012 20:51, Ion Gaztañaga wrote:
On 31/01/2012 11:05, Thorsten Ottosen wrote:
I don't see any particular reason for supporting wide strings in either Boost.Filesystem or Interprocess. If everything were in "utf8-mode", we could just use std::string. Of course, both libraries need to do custom stuff under the hood (e.g. converting to utf16 on Windows before calling the Windows API).
I'm really ignorant about encoding and Unicode issues, but I guess Windows users don't expect "char *" or std::string to be utf-8 encoded.
Well, it's just a matter of stating the contract of your code. If you state that the string or const char* argument is assumed to be utf8, then you're done. This is a better contract than "I assume nothing, and guarantee nothing". Alternatively, you can accept a boost::filesystem::path object, if you can live with that dependency. Then you have pushed the problem to the caller.
-Thorsten

On 01/02/2012 11:27, Thorsten Ottosen wrote:
Alternatively, you can accept a boost::filesystem::path object, if you can live with that dependency. Then you have pushed the problem to the caller.
filesystem::path seems very useful, but I don't like the dependency (Interprocess is a header-only lib) and I find the path interface suboptimal. There is an open bug for wide-string/Unicode support in Interprocess, so I'll need to find time (too much time, I'm afraid) to study the issue in detail. By the way, I don't find the restriction to plain ASCII characters for named objects to be a problem. I think mutex, shm, and semaphore names don't badly need encoding support. file_mapping, on the other hand, needs wide strings, at least on Windows.
Ion
participants (3)
- Artyom Beilis
- Ion Gaztañaga
- Thorsten Ottosen