
Hello
Surprisingly enough, C++ file-based streams can be opened with a char * string (for the filename) only, while modern computer systems have Unicode filenames. I see boost::iostreams::basic_file also gets constructed from a char * only.
My project name has Korean characters, and I work on a Latin1 Windows system, where the narrow character set simply does not contain Korean characters.
Is there a (good) way to open a file with a wstring in Boost?
Thank you, Timothy Madden

I use:
#include <boost/filesystem/path.hpp>     // wpath
#include <boost/filesystem/fstream.hpp>  // wfstream, wofstream, wifstream
typedef boost::filesystem::wpath SlmWPath;
typedef boost::filesystem::wfstream SlmWfstream;
typedef boost::filesystem::wofstream SlmWOfstream;
typedef boost::filesystem::wifstream SlmWIfstream;
Then you can use wchar_t strings. I have not tested with non-ASCII filenames, though.
_______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users

Bo Jensen wrote:
I use :
typedef boost::filesystem::wpath SlmWPath;
typedef boost::filesystem::wfstream SlmWfstream;
typedef boost::filesystem::wofstream SlmWOfstream;
typedef boost::filesystem::wifstream SlmWIfstream;
Yes, I can create wide streams; what I want is to pass a wide string as the file name to be opened.
Thank you, Timothy Madden

On Mon, Aug 2, 2010 at 2:03 PM, Timothy Madden
Yes, I can create wide streams; what I want is to pass a wide string as the file name to be opened.
This should work:
boost::filesystem::wfstream test;
test.open(L"somepath");
The above typedefs were just to show the tools you need from Boost.Filesystem.

Bo Jensen wrote:
This should work:
boost::filesystem::wfstream test;
test.open(L"somepath");
The above typedefs were just to show the tools you need from Boost.Filesystem.
Oh, yes, you are right!
It must be the 'Additions to <fstream>' thing, which constructs an ifstream from a filesystem::path/wpath. Actually, I do not even have a wchar_t * in my program; I use a wpath! :)
I guess Bjarne was right, the Boost documentation is a challenge for a newcomer. :) And I like to think of myself as a tough guy ...
Thank you, Timothy Madden
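For readers on a newer toolchain, the same idea can be sketched with the C++17 standard library instead of Boost. This is a swapped-in technique, not the Boost API discussed above: std::filesystem grew out of Boost.Filesystem, and since C++17 the standard file streams accept a std::filesystem::path directly. The helper name write_then_read is hypothetical, and the sketch assumes a C++17 compiler.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: writes one line to `file`, then reads it back.
// The streams are opened from a path object, never from a char*.
std::string write_then_read(const fs::path& file, const std::string& text) {
    assert(!file.empty());
    {
        std::ofstream out(file);   // C++17: constructor taking a filesystem path
        out << text << '\n';
    }                              // stream is closed at the end of this scope
    std::ifstream in(file);
    std::string line;
    std::getline(in, line);
    return line;
}
```

A std::filesystem::path can also be constructed from a std::wstring, so a wide file name can be passed through the path object in the same way a wpath is used in the thread.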

On Mon, Aug 2, 2010 at 8:32 PM, Timothy Madden
Oh, yes, you are right!
It must be the 'Additions to <fstream>' thing, which constructs an ifstream from a filesystem::path/wpath. Actually, I do not even have a wchar_t * in my program; I use a wpath! :)
I don't know all the details, but on Windows I think filenames are UTF-16 only. On Linux you should be safe, whatever locale you use. I would be interested to hear how it worked out.

On Mon, Aug 2, 2010 at 9:39 PM, Bo Jensen
I don't know all the details, but on Windows I think filenames are UTF-16 only. On Linux you should be safe, whatever locale you use. I would be interested to hear how it worked out.
We had a program, compiled with Microsoft Visual Studio 2003, that ran in different regions around the world and let users in those regions open their files without problems. The program works at the char* level, with strings encoded in the local code pages.
When we upgraded to Microsoft Visual Studio 2008, this stopped working with std::ofstream/std::ifstream because Microsoft had changed some internals of the runtime library. To fix this (I think we should have been doing it all along anyway!) we needed to call std::setlocale(LC_CTYPE, "") at program startup so that the runtime library knew how to convert the char* to a wide-character string. The runtime library uses mbstowcs() to convert the char* to a wchar_t*, and that conversion needs to know the code page.
If the program had been Unicode, we wouldn't have faced this issue.
Pete
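The conversion step described above can be sketched in portable C++. This is only an illustration of the setlocale-then-mbstowcs sequence, not the code the Microsoft runtime actually runs; the helper name narrow_to_wide is hypothetical.

```cpp
#include <cassert>
#include <clocale>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts a narrow string in the current LC_CTYPE
// encoding to a wide string with mbstowcs().
std::wstring narrow_to_wide(const std::string& s) {
    // The wide-character count can never exceed the byte count.
    std::vector<wchar_t> buf(s.size() + 1);
    std::size_t n = std::mbstowcs(buf.data(), s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        throw std::runtime_error("name is not valid in the current locale's encoding");
    assert(n <= s.size());
    return std::wstring(buf.data(), n);
}
```

Without a prior std::setlocale(LC_CTYPE, "") the program stays in the "C" locale, so bytes outside ASCII are generally not interpreted according to the user's code page. Calling it once at startup, as described above, lets the conversion follow the user's environment.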

On 8/2/2010 15:21, PB wrote:
We had a program compiling under Microsoft Visual Studio 2003 that was running in different regions around the world and allowing users in their regions to open their files ok. This program deals at the char* level, working with strings encoded using local code pages.
The only way this approach could possibly work is if users only used file names that could be encoded in their default code page, which is simply not true a lot of the time. For example, I run under a Japanese locale, but I regularly deal with files with Chinese names or German names that cannot be represented in CP932, the Japanese code page under Windows.
-- Rainer Deyke - rainerd@eldwood.com

On 03/08/10 00:06, Rainer Deyke wrote:
The only way this approach could possibly work is if users only used file names that could be encoded in their default code page, which is simply not true a lot of the time. For example, I run under a Japanese locale, but I regularly deal with files with Chinese names or German names that cannot be represented in CP932, the Japanese code page under Windows.
If only Windows supported a UTF-8 locale like most other systems...

If only Windows supported a UTF-8 locale like most other systems...
I think that's a C Standard Library implementation issue, so it is a property of the compiler, not the OS. The multi-byte string support (_mbslen etc.) is designed around double-byte character sets, not UTF-8. Windows supports UTF-8 as a "code page" for the fundamental wide/narrow conversion functions, so it should take that fine for any narrow-string API function. Note that file names have their own setting separate from the main code page; that might be confusing matters.
--John
TradeStation Group, Inc. is a publicly-traded holding company (NASDAQ GS: TRAD) of three operating subsidiaries, TradeStation Securities, Inc. (Member NYSE, FINRA, SIPC and NFA), TradeStation Technologies, Inc., a trading software and subscription company, and TradeStation Europe Limited, a United Kingdom, FSA-authorized introducing brokerage firm. None of these companies provides trading or investment advice, recommendations or endorsements of any kind. The information transmitted is intended only for the person or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and delete the material from any computer.

Bo Jensen wrote:
I don't know all the details, but on Windows I think filenames are UTF-16 only. On Linux you should be safe, whatever locale you use. I would be interested to hear how it worked out.
All Windows API functions have an ANSI version, including the file system functions, despite NTFS having Unicode filenames. I do not know what happens when an ANSI function has to return some Korean/Japanese file name from the file system on a computer with some Latin locale; does anyone care to try?
Anyway I find that I have to explicitly
#include

All Windows API functions have an ANSI version, including the file system functions, despite NTFS having Unicode filenames. I do not know what happens when an ANSI function has to return some Korean/Japanese file name from the file system on a computer with some Latin locale; does anyone care to try?
When it converts to ANSI (that is, code-page encoding), you get non-round-trip substitutions and '?' characters, so you then can't open the file even though it claims to have found it in the directory. Or, you use the short-name alias, which is always 8-character ASCII.
--John
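The non-round-trip effect can be illustrated with a small simulation. The sketch below is not the Windows conversion routine; it assumes a Latin-1 narrow character set and replaces every character outside it with '?', which is enough to show why the converted name no longer identifies the file. All function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Simulated lossy conversion: code points above U+00FF become '?'.
std::string to_latin1_lossy(const std::wstring& in) {
    std::string out;
    for (wchar_t wc : in) {
        unsigned long cp = static_cast<unsigned long>(wc);
        out += cp <= 0xFF ? static_cast<char>(static_cast<unsigned char>(cp)) : '?';
    }
    assert(out.size() == in.size());
    return out;
}

// Widens Latin-1 bytes back to wide characters.
std::wstring from_latin1(const std::string& in) {
    std::wstring out;
    for (unsigned char c : in) out += static_cast<wchar_t>(c);
    return out;
}

// True when the name survives the narrow round trip unchanged.
bool round_trips(const std::wstring& name) {
    return from_latin1(to_latin1_lossy(name)) == name;
}
```

For example, to_latin1_lossy(L"\uD55C\uAE00.txt") yields "??.txt", so round_trips returns false for that name, while a name such as L"caf\u00E9.txt" survives intact.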

On 02/08/10 12:46, Timothy Madden wrote:
Hello
Surprisingly enough, C++ file-based streams can be opened with a char * string (for the filename) only, while modern computer systems have Unicode filenames.
All of them but Microsoft Windows support UTF-8.
Is there a (good) way to open a file with a wstring in boost ?
Boost.Filesystem has wide-character support, but I would advise avoiding wide characters entirely.

Mathias Gaunard wrote:
On 02/08/10 12:46, Timothy Madden wrote:
Hello
Surprisingly enough, C++ file-based streams can be opened with a char * string (for the filename) only, while modern computer systems have Unicode filenames.
All of them but Microsoft Windows support UTF-8.
How would I let the file-stream object know that the filename to be opened is encoded in UTF-8 ?
Is there a (good) way to open a file with a wstring in boost ?
Boost.Filesystem has wide-character support, but I would advise avoiding wide characters entirely.
How? If the user enters a Unicode filename (with Korean characters) for me to open, and the current locale is Latin 2, how would I open the file?
Thank you, Timothy Madden

On 02/08/10 15:02, Timothy Madden wrote:
Mathias Gaunard wrote:
On 02/08/10 12:46, Timothy Madden wrote:
Hello
Surprisingly enough, C++ file-based streams can be opened with a char * string (for the filename) only, while modern computer systems have Unicode filenames.
All of them but Microsoft Windows support UTF-8.
How would I let the file-stream object know that the filename to be opened is encoded in UTF-8 ?
It is assumed to be in the locale of the system. Most POSIX systems use a UTF-8 locale these days, but if you really want to be portable, you should convert that.
How ?
If user enters an Unicode filename (with Korean characters) for me to open, and the current locale is Latin 2, how would I open the file ?
On Windows, convert from UTF-8 to wide characters when calling system calls. On other operating systems, pass UTF-8 to the system calls, or convert them to the locale if you care enough about non-utf8 locales.
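The "convert from UTF-8 to wide characters" step can be sketched as a hand-written decoder. On Windows one would more likely call the system conversion routine (MultiByteToWideChar with CP_UTF8); the portable version below is only meant to show what the conversion does. The name utf8_to_wide is hypothetical, and the validation is minimal: overlong forms and encoded surrogates are not rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes UTF-8 into a wide string: UTF-16 where wchar_t is 16 bits
// (as on Windows), UTF-32 where it is 32 bits (as on most POSIX systems).
std::wstring utf8_to_wide(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;   // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (extra > in.size() - i - 1)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // Outside the BMP: emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    assert(i == in.size());
    return out;
}
```

The resulting wide string is what would be handed to a wide-character file API on Windows; on systems with a UTF-8 locale the original narrow string can be passed through unchanged, as suggested above.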
participants (6)
- Bo Jensen
- John Dlugosz
- Mathias Gaunard
- PB
- Rainer Deyke
- Timothy Madden