
On Fri, Nov 16, 2012 at 1:28 PM, Alex Perry <Alex.Perry@smartlogic.com> wrote:
[...] I was trying to point out (obviously unclearly) possible problems with the single command line version of execute
OK, so you were talking about the single string versus argument array? Then this is orthogonal to the encoding issue.

My input on command lines vs argv
=================================

Problems in the current implementation
--------------------------------------

Try using the current implementation of set_args to call cmd.exe so that it does the same as the following call (i.e. sets up the MSVC environment and then prints the values of the inc* variables):

    cmd.exe /C ""%VS80COMNTOOLS%vsvars32.bat">NUL && set inc"

Additionally, the current set_args implementation *is not* the right inverse of CommandLineToArgvW! The following:

    std::vector<std::string> vArgs;
    vArgs.push_back("a.exe");
    vArgs.push_back("a b\\c"); // single backslash -- escape
    execute( set_args( vArgs ) );

will result in the following command line being passed to CreateProcess (verbatim, no escapes):

    a.exe "a b\\c"

which will appear in main's argv[] with *two* backslashes, because the CRT only collapses backslashes that immediately precede a quotation mark!

Facts
-----

POSIX argv[] and the Windows command line cannot be mapped bijectively, unfortunately. The problem is that on Windows each program splits the command line into arguments in its own way. Usually, for C++ programs, this is done inside the CRT with the CommandLineToArgvW function (or an equivalent), which happens to be buggy (ask me for details if interested). Other programs (like cmd.exe) parse the whole command line in a totally different way. No sensible set_args can be surjective.

This may give the impression that providing set_cmd_line is inevitable. However, the use of such a function would be limited mostly to Windows. Yet we do not have to support everything that each platform provides.

My opinion
----------

I prefer concise, minimal and uniform interfaces. This implies:

* Use only set_args, no set_cmd_line.
  Rationale: consistent with POSIX and the standard argv[] passed to main. Removes the need for run_exe, or for parsing the set_cmd_line string to retrieve the exe name from it.

* Leave the behavior in case of embedded quotation marks unspecified. Do not escape quotation marks within the argument.
  Rationale: no problem on POSIX, since no parsing is done there anyway. On Windows this will increase the image (set theoretic) of set_args. In particular, it will be possible to invoke both examples from above.
  Cons: it is the user's responsibility to escape double quotation marks on Windows. But hey, she is the only one who could know how to do it properly.

[...]
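For reference, here is a minimal sketch of what a right inverse of the usual MSVC CRT splitting rules could look like. The names quote_arg and build_command_line are hypothetical and are not part of any Boost.Process draft. The sketch assumes the commonly documented rule that backslashes are literal unless they immediately precede a double quotation mark. It does not help with programs such as cmd.exe that parse the command line differently, and it ignores the fact that the program name (argv[0]) is parsed with simpler rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: quote one argument so that a parser following the
// MSVC CRT rules recovers it unchanged.
std::string quote_arg(const std::string& arg)
{
    // Non-empty arguments without whitespace or quotes need no quoting.
    if (!arg.empty() && arg.find_first_of(" \t\n\v\"") == std::string::npos)
        return arg;

    std::string out = "\"";
    std::size_t backslashes = 0; // length of the pending run of backslashes

    for (char c : arg) {
        if (c == '\\') {
            ++backslashes; // defer: the meaning depends on what follows
        } else if (c == '"') {
            // Double the pending run and add one more backslash to escape
            // the quotation mark itself.
            out.append(backslashes * 2 + 1, '\\');
            out += '"';
            backslashes = 0;
        } else {
            // Backslashes not followed by a quote are literal: emit as-is.
            out.append(backslashes, '\\');
            out += c;
            backslashes = 0;
        }
    }

    // A trailing run precedes the closing quote, so it must be doubled.
    out.append(backslashes * 2, '\\');
    out += '"';
    return out;
}

// Hypothetical helper: join quoted arguments into a single command line.
std::string build_command_line(const std::vector<std::string>& args)
{
    assert(!args.empty()); // argv[0] is expected to be present
    std::string line;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (i != 0)
            line += ' ';
        line += quote_arg(args[i]);
    }
    return line;
}
```

With this sketch, the argument a b\c from the example above is emitted as "a b\c", with a single backslash, so it should arrive in argv[] unchanged.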
Given
boost::filesystem::path foo_path = ....
boost::filesystem::path somefile_path = ...
which have been determined by some mechanism (hopefully in a x-platform manner without #ifdefs)
This is the assumption that fails. filesystem::path works great if you write for POSIX only, and it works great if you write for Windows only using wide strings, but it fails when you start writing portable code. Try initializing those paths from, for example, fields in an SQLite database, or a text file. You can solve many issues by imbuing a UTF-8 locale into filesystem, but this won't solve all the issues (.c_str()), and you cannot always change the global state to accomplish this. Continuing with your example:
std::vector<boost::filesystem::path::string_type> vArgs;
vArgs.push_back( foo_path.native() );
vArgs.push_back( "-f" );

Error on Windows: cannot convert char[3] to std::wstring...

vArgs.push_back( somefile_path.native() );
execute( args( vArgs ) );
seemed sufficient - trying to force this into a utf8string (I may have misunderstood what you were arguing for) just seems awkward to me.
Assuming we imbued a UTF-8 codecvt into Boost.Filesystem (or adopted a policy that this is the default), then:

    std::vector<std::string> vArgs;
    vArgs.push_back(foo_path.string());
    vArgs.push_back("-f");
    vArgs.push_back(somefile_path.string());
    execute( args( vArgs ) );

just works. Personally, I do not like using filesystem::path for various reasons; I just use std::string for paths. So there will be no calls to .string() in my code.

-- 
Yakov