Boost.Process 0.5: Another update/potential candidate for an official library

I've uploaded a new version of Boost.Process 0.5 to <http://www.highscore.de/boost/process0.5/>:
- Made child movable and non-copyable on Windows (destructor closes handles)
- Added cross-platform helpers in a new header file boost/process/mitigate.hpp
- Updated and improved documentation (e.g. FAQ is new)
- Added the POSIX initializer notify_io_service (Boost.Asio helper)
- Minor bug fixes and improvements (e.g. in search_path() on Windows)
At this point there is nothing left on my todo list. I got quite a lot of feedback so far and would appreciate it if people could look at this new version again. If no major shortcomings are found, I'll ask for a formal review.
Boris

First, thanks for your efforts. On Wed, Nov 7, 2012 at 9:47 PM, Boris Schaeling <boris@highscore.de> wrote:
At this point there is nothing left on my todo list. I got quite a lot of feedback so far and would appreciate if people can look at this new version again. If there are no major shortcomings found, I'll ask for a formal review.
I have process-launching code to write starting tomorrow; I will try to use this and give feedback (on Windows only though, VS2012). Does it work with Boost 1.51? Joel Lamotte

Quick questions after having read the doc:
1. Does error handling work with processes we launch and don't wait for too? I assume yes.
2. I remember pointing out before that in this example:
execute( run_exe("test.exe"), set_cmd_line("test --foo /bar"));
there is repetition of the executable name. I suggested allowing set_cmd_line() without run_exe(). Did you implement this possibility?
3. "On Windows a relative path is relative to the work directory of the parent process. On POSIX a relative path is relative to the work directory set with start_in_dir as the directory is changed before the program starts." Does this mean that using start_in_dir on Windows has no effect?
4. There are mentions of the helper facilities, but it would be better if their use was explained in the document too.
I still think it's problematic to not provide similar behaviour between platforms, but I'm willing to make a first (real application) test and see if I can help improve later.
Joel Lamotte

On Wed, 07 Nov 2012 23:30:25 +0100, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
Quick questions after having read the doc: 1. Does error handling work with processes we launch and don't wait for too? I assume yes.
Yes.
2. I remember pointing before that in this example:
execute( run_exe("test.exe"), set_cmd_line("test --foo /bar"));
There is repetition of the executable name. I suggested allowing set_cmd_line() without run_exe(). Did you implement this possibility?
No. But I agree with you that it would be nice to have. I had to draw a line somewhere though if we ever want to get a process management library into Boost. (But drop me a mail if you'd like to contribute a function which can parse the executable name and ideally considers quotes, blanks and escape characters. :)
3. "On Windows a relative path is relative to the work directory of the parent process. On POSIX a relative path is relative to the work directory set with start_in_dir as the directory is changed before the program starts." Does this mean that using start_in_dir on Windows has no effect?
start_in_dir() works on all platforms. It's just that on POSIX this happens:
chdir("bla"); execve("../foo", ...);
If the path to the executable is a relative one, it's relative to the new work directory, as the directory is changed before execve() is called. Does the hint in the docs make sense now? Or should I rewrite it to make it clearer?
Thanks for your feedback,
Boris
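To make the difference in relative-path handling concrete, here is a minimal sketch. It only models the path arithmetic with std::filesystem (C++17); it does not launch anything. The helper name resolve_executable and the example directories are assumptions made for illustration and are not part of Boost.Process.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper (not a Boost.Process function): returns the path that a
// relative executable name refers to when it is resolved against base_dir.
// An absolute executable path is returned unchanged.
fs::path resolve_executable(const fs::path& base_dir, const fs::path& exe) {
    if (exe.is_absolute()) {
        return exe;
    }
    // Purely lexical: joins the two paths and collapses "." and ".." parts.
    return (base_dir / exe).lexically_normal();
}
```

With a parent working directory of /home/user and start_in_dir("bla"), the POSIX behaviour described above corresponds to resolve_executable("/home/user/bla", "../foo"), which gives /home/user/foo. The Windows behaviour described in the docs corresponds to resolving against the parent's directory, resolve_executable("/home/user", "../foo"), which gives /home/foo.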

On Tue, Nov 13, 2012 at 2:14 PM, Boris Schaeling <boris@highscore.de> wrote:
On Wed, 07 Nov 2012 23:30:25 +0100, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
execute( run_exe("test.exe"), set_cmd_line("test --foo /bar"));
There is repetition of the executable name. I suggested allowing set_cmd_line() without run_exe(). Did you implement this possibility?
No. But I agree with you that it would be nice to have. I had to draw a line somewhere though if we ever want to get a process management library into Boost. (But drop me a mail if you'd like to contribute a function which can parse the executable name and ideally considers quotes, blanks and escape characters. :)
If we look at the variant using a vector for arguments, it seems quite trivial to get the executable as the first element in the vector. Anyway, it seems to me that either you have to merge the args in a vector into a single string on Windows, or split the string into an array of args on Unix. So if you have to split anyway, why not make run_exe optional? Or are you using the shell on Unix? Hopefully not (sorry, I did not download and look at the code, just the documentation a bit).
Have you looked at Python's subprocess module? It has, IMO, a very nice interface for launching processes: you essentially specify the command either as a single string or a sequence of arguments, where the executable is the first element. Following your interface, we could simply end up with something like, e.g.
execute(args("test --foo /bar"), ...);
for a single string, and another option like this
execute(args(my_argument_sequence), ...);
for arguments provided as a sequence, where the executable is the first element in the list. No need to always add a separate run_exe, and no need for set_cmd_line and set_args to be different. In this case, args could be overloaded for a string (and/or C string) and a sequence (templated), or even a range or pair of iterators. args could be named something else; my point is that the command line being specified as a string or a sequence of tokens serves the same final purpose, hence it seems to me there is no need for a different name. Granted, this interface requires extra work to parse the string, but like I said, unless you make use of system() or the like on Unix, you have to do it anyway before calling any of the exec* variants, unless I am missing something.
-- François Duranleau
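A rough sketch of what the proposed overload set could look like. Everything here is hypothetical: Boost.Process 0.5 has no args initializer, and this version simply returns the token list instead of plugging into execute(). The string overload splits on whitespace only and does not handle quotes or escape characters, which is exactly the parsing work mentioned above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical: single-string form. Splits on whitespace only; quotes and
// escape characters are NOT interpreted. Element 0 is the executable.
std::vector<std::string> args(const std::string& cmd_line) {
    std::vector<std::string> tokens;
    std::istringstream in(cmd_line);
    for (std::string tok; in >> tok;) {
        tokens.push_back(tok);
    }
    return tokens;
}

// Hypothetical: sequence form. The caller has already split the command, so
// the tokens are passed through unchanged. Element 0 is the executable.
std::vector<std::string> args(const std::vector<std::string>& sequence) {
    return sequence;
}
```

Both forms end up with the same representation, so a launcher built on top of them would only need one code path: take element 0 as the executable and pass the whole list as argv.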

On Tue, 13 Nov 2012 23:15:02 +0100, Francois Duranleau <xiao.bai.xiong@gmail.com> wrote:
[...]If we look at the variant using a vector for arguments, in that case, seems like it is quite trivial to get the executable as the first element in the vector. Anyways, seems to me that either you have to merge args in a vector in a single string on Windows, or split the string into an array of args on Unix. So if you anyway have to split, why not make run_exe optional? Or are you using the shell on Unix? Hopefully not (sorry, I did not download and look at the code, just the documentation a bit).
You are right, the command line is indeed already split.
[...]no need for set_cmd_line and set_args to be different. In this case, args could be overloaded for a string (and/or c string) and a sequence (templated), or even a range or pair of iterators. args could be named something else, my point is, the command line being specified as a string or a sequence of tokens serves the same final purpose, hence it seems to me there is no need for a different name. Granted, this
Yep, sounds all reasonable to me. That's definitely something which should be put back on the todo-list. Thanks for your feedback, too! Boris

On 13 November 2012 22:15 Francois Duranleau [mailto:xiao.bai.xiong@gmail.com] wrote :-
[...] Have you looked at Python's subprocess module? It has, IMO, a very nice interface for launching processes, and you essentially specify the command either as a single string or a sequence of arguments, where the executable is the first element, and, following your interface, we could simply end up with something like, e.g.
execute(args("test --foo /bar"), ...);
Just a caveat about the Python subprocess module: possibly this is fixed in Python 3.0, but certainly in Python 2.x there is a problem with the interface when used on Windows. Consider:
"foo my_file"
Even if this is Unicode in the Python code, when the process is called it is translated into 8-bit binary using the system code page, which can destroy the my_file name. There appears to be no way to get Python (at least 2.x versions) to call the underlying Win32 CreateProcess using the Windows-native UTF-16 encoding. The only solution I know of, if you are in control of foo, is to arrange for it to support UTF-8 via an optional command line switch and do:
"foo --use_utf8 my_file"
Or some such.
If Boost.Process were to model Python's subprocess, it would need some mechanism to be explicit about what encoding should be used to actually make the call for the process, and how to translate the command line correctly.
Alex

On Wed, Nov 14, 2012 at 8:24 AM, Alex Perry <Alex.Perry@smartlogic.com> wrote:
On 13 November 2012 22:15 Francois Duranleau [mailto:xiao.bai.xiong@gmail.com] wrote :-
[...] Have you looked at Python's subprocess module? It has, IMO, a very nice interface for launching processes, and you essentially specify the command either as a single string or a sequence of arguments, where the executable is the first element, and, following your interface, we could simply end up with something like, e.g.
execute(args("test --foo /bar"), ...);
Just a caveat about the python subprocess module - possibly this is fixed in python 3.0 but certainly for the python 2.X there is a problem with the python interface when used on windows. Consider:
"foo my_file"
Even if in the python code this is in Unicode when the process is called this is translated into 8 bit binary using the system code page which can destroy the my_file name. There appears to be no way to get python (at least 2.x versions) to call the underlying Win32 CreateProcess call using windows native utf16 encoding. The only solution to this that I know of is if you are in control of foo so can arrange that it can support utf8 by an optional command line switch and do :
"foo --use_utf8 my_file"
Or some such.
If boost process was to model python subprocess then it would need some mechanism to be explicit about what encoding should be used to actually make the call for the process and how to translate the command line correctly.
Isn't this more an implementation issue rather than an API issue? I mean, the underlying implementation could easily convert an input UTF8 string to UTF16 string before making the call to CreateProcess? Granted, though, the command line string version should probably also support wide char strings. -- François Duranleau
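For illustration, the UTF-8 to UTF-16 step mentioned above can be sketched portably as below. This is a hand-written converter, not what any library in this thread is known to do; on Windows an implementation would more likely call the Win32 function MultiByteToWideChar with CP_UTF8 and pass the result to CreateProcessW. The function name utf8_to_utf16 is an assumption made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a UTF-8 string to UTF-16 code units.
// Throws std::invalid_argument on malformed input (bad lead or continuation
// byte, truncated sequence, overlong encoding, surrogate, or > U+10FFFF).
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs a sequence of each length.
    static const std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP are encoded as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

The conversion itself is mechanical once the library commits to treating its char strings as UTF-8. The open question in this thread is that commitment, not the cost of the conversion.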

On 14 November 2012 13:35 Francois Duranleau [mailto:xiao.bai.xiong@gmail.com] wrote :-
Isn't this more an implementation issue rather than an API issue? I mean, the underlying implementation could easily convert an input UTF8 string to UTF16 string before making the call to CreateProcess? Granted, though, the command line string version should probably also support wide char strings.
Sorry, I was unclear. I think it's both an implementation issue in Python and a problem with the interface. It's a similar problem to why Boost.Filesystem returns path objects rather than strings; see http://www.boost.org/doc/libs/1_52_0/libs/filesystem/doc/tutorial.html#Class...
const char * p = "foo my_file";
std::string s("foo my_file");
These aren't explicit about what encoding the string is using (e.g. UTF-8, or 8-bit using the current system's code page). Whilst 8-bit code pages are probably only of use for backwards compatibility on systems which care a lot about this (e.g. Windows), any x-platform library would have to support specifying the encoding somehow (and possibly also what encoding foo is expecting to be called with). Whilst this can be done cleanly using iostreams and Boost.Locale, simply passing a command line string is probably too flawed to be useful imho.
Alex

On Wed, Nov 14, 2012 at 7:49 PM, Alex Perry <Alex.Perry@smartlogic.com>wrote:
It’s a similar problem to why boost filesystem returns path objects rather than strings [...] the simple passing a command line string is probably too flawed to be useful imho
The alternatives are not better. I definitely object to adding wchar_t interface. It has little use in portable code (how do you write portable code with this?), thus I consider it being inappropriate for boost. As was already said in this thread "uniformity is a requirement for the abstraction". This is exactly what boost filesystem did wrong. Just use UTF-8 on windows, that's it. Do not tell me that converting UTF-8 to UTF-16 is a significant overhead when creating a process... -- Yakov

On 14 November 2012 19:04 Yakov Galka [mailto:ybungalobill@gmail.com] wrote :-
Just use UTF-8 on windows, that's it. Do not tell me that converting UTF-8 to UTF-16 is a significant overhead when creating a process...
No, it's not overhead that I'm worried about, it's functionality. If I have to call out to foo and this expects 8-bit encoded filenames (or, more likely on Windows, UTF-16), then I need to be able to do that. So if I'm passing a "command line style call", I must tell the Boost.Process library what encoding my std::string is actually in, in some manner, so that it can apply the correct transformation required. My problem with this is that it's confusing, so my original point was: let's not try to model Boost.Process on the system() call. That exists on all platforms I know of but doesn't really address the issues of writing x-platform code to handle processes.

Alex, I'm sorry, but I cannot make sense of what you write. On Thu, Nov 15, 2012 at 6:06 PM, Alex Perry <Alex.Perry@smartlogic.com>wrote:
[...] its functionality - if I have to call out to foo and this expects 8 bit encoded filenames (or more likely on windows utf16) then I need to be able to do that
What is foo? I assume you mean that foo is some executable. In such case it does not matter what it expects, 8 bit encoded command line or UTF-16 command line, since both are passed through the kernel as UTF-16 strings.
- and so if passing a "command line style call " must tell the boost process library what encoding my std::string actually is in some manner so that it can apply the correct transformation required.
If we recognize that Boost.Process must support Unicode, then it *must* call CreateProcessW. So the only question is what encoding of std::strings it assumes. There are two alternatives on Windows: 1) ANSI 2) UTF-8. The former does not support Unicode; the latter is incompatible with the basic execution character set of the MSVC compiler. This is not a problem though: you are not required to be compatible with the standard library here. Also it is easy to change between the two based on a compile-time switch.
My problem with this is that its confusing ...
What is confusing? Defining a uniform interface is confusing?
so my original point was lets not try to model boost process on the system() call - that exists on all platforms I know of but doesn't really address the issues of writing x-platform code to handle processes.
I totally lost you here. How system() is related? (Except that it launches a process too.) -- Yakov

Yakov Galka <ybungalobill@gmail.com> writes:
On Thu, Nov 15, 2012 at 6:06 PM, Alex Perry <Alex.Perry@smartlogic.com>wrote:
My problem with this is that its confusing ...
What is confusing? Defining a uniform interface is confusing?
Every C++ runtime implementation, every C++ standard library implementation and other Boost libraries on Windows assume that 8-bit char strings are encoded using the local ANSI codepage. While this is not ideal, it is at least consistent. The confusion arises from unilaterally changing one library to interpret 8-bit strings differently. Alex

On Thu, Nov 15, 2012 at 8:05 PM, Alexander Lamaison <awl03@doc.ic.ac.uk>wrote:
[...] Every C++ runtime implementation, every C++ standard library implementation
Likely to be true at this point in time (although none of us can check *every* C++ runtime), but I know of plans to do otherwise.
and other Boost libraries on Windows
AFAIK not boost.locale.
[...] it is at least consistent.
So does using UTF-8 everywhere.
The confusion arises from unilaterally changing one library to interpret 8-bit strings differently.
This is not unilateral. There are (non-Boost) libraries that do this (e.g. sqlite). Also there will not be confusion if, as I said, it is done as a compile-time switch. -- Yakov

Yakov Galka <ybungalobill@gmail.com> writes:
On Thu, Nov 15, 2012 at 8:05 PM, Alexander Lamaison <awl03@doc.ic.ac.uk>wrote:
[...] Every C++ runtime implementation, every C++ standard library implementation
Likely to be true at this point in time (although none of us can check *every* C++ runtime), but I know of plans to do otherwise.
and other Boost libraries on Windows
AFAIK not boost.locale.
I wasn't aware of this but you are right. Hmmmm, perhaps, as consistency is already lost, we should just give up and use UTF-8 unilaterally.
[...] it is at least consistent.
So does using UTF-8 everywhere.
That would indeed be consistent, but, alas, it isn't possible without the consent of the compiler/STL, which doesn't seem to be forthcoming. So what do you do when you need to pass a string to the C++ runtime or standard library? (Yes, I'm aware of Artyom's Nowide library due for review but, as far as I'm aware, that doesn't replace all C++/STL functions that take strings).
Also there will not be confusion if, as I said, it is done as a compile-time switch.
There will still be confusion in the situation I mentioned above where you have to pass your UTF-8 string to the C++ library. You would be forced to use something like Boost.Locale to convert your string to the local codepage and vice versa at the interface between your code and the C++ library. Yuk. Alex

On Thu, Nov 15, 2012 at 8:55 PM, Alexander Lamaison <awl03@doc.ic.ac.uk>wrote:
[...]
[...] it is at least consistent.
So does using UTF-8 everywhere.
That would indeed be consistent, but, alas, it isn't possible without the consent of the compiler/STL which doesn't seem to be forthcoming.
As Peter correctly noted, nothing will move if everyone is driven by this logic. The change must be done in incremental steps, especially since I do not see Microsoft changing their implementation unilaterally (vendor lock-in?). People in WG21 can try to push it into the standard in some form (e.g. mandate or at least recommend the basic execution character set to be capable of storing Unicode). This would be great, but I estimate it to be a path of greater resistance.
So what do you do when you need to pass a string to the C++ runtime or standard library?
Depends on what part of the library. Most of the standard library string handling functions are encoding agnostic.
(Yes, I'm aware of Artyom's Nowide library due for review but, as far as I'm aware, that doesn't replace all C++/STL functions that take strings).
It replaces all those standard functions that interact with the system. Which others matter? Those that specifically deal with encodings (mbtowc, etc.) are useless on Windows because they do not support UTF-8...
On Thu, Nov 15, 2012 at 8:39 PM, Alexander Lamaison <awl03@doc.ic.ac.uk>wrote:
[...] Indeed. But the decisions must happen well above the paygrade of an individual library author. It would need at least a Boost-wide policy change or, preferably, agreement among the Windows C++ compiler/library developers.
Agree, and see above. -- Yakov

Alexander Lamaison wrote:
Every C++ runtime implementation, every C++ standard library implementation and other Boost libraries on Windows assume that 8-bit char strings are encoded using the local ANSI codepage.
Yes they do, and none of them work. UTF-8 is the only 8 bit encoding that works. I know that it is sometimes better, for some values of better, to be consistent and wrong, but mistakes never get fixed if everyone remains consistent and wrong.

"Peter Dimov" <lists@pdimov.com> writes:
Alexander Lamaison wrote:
Every C++ runtime implementation, every C++ standard library implementation and other Boost libraries on Windows assume that 8-bit char strings are encoded using the local ANSI codepage.
Yes they do, and none of them work. UTF-8 is the only 8 bit encoding that works. I know that it is sometimes better, for some values of better, to be consistent and wrong, but mistakes never get fixed if everyone remains consistent and wrong.
Indeed. But the decisions must happen well above the paygrade of an individual library author. It would need at least a Boost-wide policy change or, preferably, agreement among the Windows C++ compiler/library developers. Alex

On 15 November 2012 17:39 Yakov Galka [mailto:ybungalobill@gmail.com] wrote :-
Alex, I'm sorry, but I cannot make sense of what you write.
Sorry was probably trying to be too concise and so didn't make sense.
What is foo? I assume you mean that foo is some executable. In such case it does not matter what it expects, 8 bit encoded command line or UTF-16 command line, since both are passed through the kernel as UTF-16 strings.
If we take some hypothetical executable foo which exists in executable form on both Windows and Linux, and also assume that it takes the same parameters in both cases (this is probably a very restricted use case already, but in any other case there would probably be #ifdef code at the calling site).
So at the command prompt / shell a call for foo (ignoring any path searching) would be something like:
"c:\foo_directory\foo.exe -f c:\tmp\somefile.foo" on Windows
"/usr/bin/foo -f /tmp/somefile.foo" on Linux
(was going to add VMS here but it's been too long since I used it to remember any syntax)
I was trying to point out (obviously unclearly) possible problems with the single command line version of execute.
Francois had suggested the following sensible-looking syntax and change to the current Boost.Process candidate:
the executable is the first element, and, following your interface, we could simply end up with something like, e.g.
execute(args("test --foo /bar"), ...);
for single string, and a another option like this
execute(args(my_argument_sequence), ...);
for arguments provided as a sequence, where the executable is the first element in the list. No need to always put separate run_exe, and no need for set_cmd_line and set_args to be different. In this case,
So using the foo as given above:
execute(args("/usr/bin/foo -f /tmp/somefile.foo"));
execute(args("c:\\foo_directory\\foo.exe -f c:\\tmp\\somefile.foo"));
I guess my point was simply that this args string would differ (whether it was encoded in UTF-8 or not) between platforms, so it doesn't seem to help to provide a uniform interface. So whilst not totally against it (it does provide a very simple usage), I wasn't sure whether it helped.
Given
boost::filesystem::path foo_path = ....
boost::filesystem::path somefile_path = ...
which have been determined by some mechanism (hopefully in a x-platform manner without #ifdefs)
std::vector<boost::filesystem::path::string_type> vArgs;
vArgs.push_back( foo_path.native() );
vArgs.push_back( "-f" );
vArgs.push_back( somefile_path.native() );
execute( args( vArgs ) );
seemed sufficient. Trying to force this into a utf8string (I may have misunderstood what you were arguing for) just seems awkward to me.
Alex

On Nov 16, 2012, at 6:28 AM, Alex Perry <Alex.Perry@smartlogic.com> wrote:
On 15 November 2012 17:39 Yakov Galka [mailto:ybungalobill@gmail.com] wrote :
What is foo? I assume you mean that foo is some executable. In such case it does not matter what it expects, 8 bit encoded command line or UTF-16 command line, since both are passed through the kernel as UTF-16 strings.
If we take some hypothetical executable foo which exists in executable form on both windows and linux and also assume that it takes the same parameters in both cases (this is probably a very restricted use case already but in any other case there would be probably be #ifdef code at the calling site)
So at the command prompt / shell a call for foo (ignoring any path searching) would be something like :-
"c:\foo_directory\foo.exe -f c:\tmp\somefile.foo" on windows "/usr/bin/foo -f /tmp/somefile.foo" on linux (was going to add VMS here but its been too long since I used it to remember any syntax)
I was trying to point out (obviously unclearly) possible problems with the single command line version of execute
Francois had suggested the following sensible looking syntax and change to the current boost process candidate
the executable is the first element, and, following your interface, we could simply end up with something like, e.g.
execute(args("test --foo /bar"), ...);
for single string, and a another option like this
execute(args(my_argument_sequence), ...);
for arguments provided as a sequence, where the executable is the first element in the list. No need to always put separate run_exe, and no need for set_cmd_line and set_args to be different. In this case,
So using the foo as given above
execute(args("/usr/bin/foo -f /tmp/somefile.foo"));
execute(args("c:\\foo_directory\\foo.exe -f c:\\tmp\\somefile.foo"));
I guess my point was simply that this args string would differ (whether it was encoded in utf8 or not) between platforms so doesn't seem to help to provide a uniform interface. So whilst not totally against it (it does provide a very simple usage) I wasn't sure whether it helped.
Your argument applies to pathnames alone. Any other arguments to a cross-platform application are unlikely to differ.
Given
boost::filesystem::path foo_path = .... boost::filesystem::path somefile_path = ...
which have been determined by some mechanism (hopefully in a x-platform manner without #ifdefs)
std::vector<boost::filesystem::path::string_type> vArgs;
vArgs.push_back( foo_path.native() ); vArgs.push_back( "-f" ); vArgs.push_back( somefile_path.native() );
execute( args( vArgs ) );
That's an inappropriate use of path. It is for filesystem paths, not arbitrary strings. Something else is necessary.
___ Rob

On 16 November 2012 13:14 Rob Stewart [mailto:robertstewart@comcast.net] wrote :
[...] Your argument applies to pathnames alone. Any other arguments to a cross- platform application are unlikely to differ.
True, but a) the first parameter (the executable to run) will nearly always have platform-specific syntax, and b) surely passing a path (or several) to some other process is common enough behaviour to consider how it could be done x-platform.
That's an inappropriate use of path. It is for filesystem paths, not arbitrary strings. Something else is necessary.
Was using filesystem::path simply to select the appropriate string type rather than abusing path (I think) - However think I've changed my opinion about whether this is a good x-platform approach after reading Yakov's reply. Alex

On Fri, Nov 16, 2012 at 1:28 PM, Alex Perry <Alex.Perry@smartlogic.com>wrote:
[...] I was trying to point out (obviously unclearly) possible problems with the single command line version of execute
OK, so you were talking about the single string versus argument array? Then this is orthogonal to the encoding issue.
My input on command lines vs argv
==================
Problems in the current implementation
----------------------------
Try using the current implementation of set_args to call cmd.exe so it will do the same as the following call (i.e. set MSVC environment and then print the values of the inc* variables):
cmd.exe /C ""%VS80COMNTOOLS%vsvars32.bat">NUL && set inc"
Additionally, the current set_args implementation *is not* the right inverse of CommandLineToArgvW! The following:
std::vector<std::string> vArgs;
vArgs.push_back("a.exe");
vArgs.push_back("a b\\c"); // single backslash -- escape
execute( set_args( vArgs ) );
will result in the following command line to CreateProcess (verbatim, no escapes):
a.exe "a b\\c"
Which will appear in main argv[] as *two* backslashes!
Facts
--------
POSIX argv[] and the Windows command line cannot be mapped bijectively, unfortunately. The problem is that the splitting of the command line into arguments on Windows is done by each program in its own way. Usually, for C++ programs, this is done inside the CRT with the CommandLineToArgvW function (or equivalent), which happens to be bugged (ask me for details if interested). Other programs (like cmd.exe) parse the whole command line in a totally different way. No sensible set_args can be surjective.
This may give the impression that providing set_cmd_line is inevitable. However, the use of such a function will be limited mostly to Windows. Yet, we do not have to support everything that each platform provides.
My opinion
----------
I prefer concise, minimal and uniform interfaces. This implies:
* Use only the set_args, no set_cmd_line. Rationale: consistent with POSIX and the standard argv[] passed to main. Removes the need of run_exe or parsing the set_cmd_line to retrieve the exe name from there.
* Leave the behavior in case of embedded quotation marks unspecified. Do not escape quotation marks within the argument. Rationale: no problem on POSIX, there it does no parsing anyway. On Windows this will increase the image (set theoretic) of set_args. In particular it will be possible to invoke both examples from above. Cons: It's the user's responsibility to escape double quotation marks on Windows. But hey, she is the only one who could know how to do it properly.
[...]
Given
boost::filesystem::path foo_path = .... boost::filesystem::path somefile_path = ...
which have been determined by some mechanism (hopefully in a x-platform manner without #ifdefs)
This is the assumption that fails. filesystem::path works great if you write for POSIX only, it works great if you write for Windows only using wide strings, but it fails when you start writing portable code. Try initializing those paths from, for example, fields in an SQLite database, or a text file. You can solve many issues by imbuing a UTF-8 locale into filesystem, but this won't solve all the issues (.c_str()), and you cannot always change the global state to accomplish this.
Continuing with your example:
std::vector<boost::filesystem::path::string_type> vArgs;
vArgs.push_back( foo_path.native() ); vArgs.push_back( "-f" );
Error on Windows: cannot convert char[3] to std::wstring...
vArgs.push_back( somefile_path.native() );
execute( args( vArgs ) );
seemed sufficient - trying to force this into a utf8string (I may have misunderstood what you were arguing for) just seems awkward to me.
Assuming we imbued a UTF-8 codecvt into boost filesystem (or adopted a policy that this is the default), then:
std::vector<std::string> vArgs;
vArgs.push_back(foo_path.string());
vArgs.push_back("-f");
vArgs.push_back(somefile_path.string());
execute( args( vArgs ) );
just works.
Personally I do not like using filesystem::path for various reasons, I just use std::string for paths. So no calls to .string() will be in my code.
-- Yakov
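The doubled-backslash observation earlier in this message can be checked with a small model of the documented Microsoft C runtime splitting rules: backslashes are literal unless they immediately precede a double quote; 2n backslashes before a quote yield n backslashes and the quote toggles quoting; 2n+1 backslashes before a quote yield n backslashes and a literal quote. This is a simplified sketch under those assumptions, not the CRT code. It ignores the special handling of the program name and of doubled quotes inside a quoted region, and the function name split_like_msvcrt is made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified model (an assumption, not the real CRT implementation) of how
// the Microsoft C runtime splits a command line into argv[] entries.
std::vector<std::string> split_like_msvcrt(const std::string& cmd) {
    std::vector<std::string> argv;
    std::string cur;
    bool in_quotes = false;
    bool have_arg = false;  // true once the current argument has started
    std::size_t i = 0;
    while (i < cmd.size()) {
        const char c = cmd[i];
        if (c == '\\') {
            // Count the run of backslashes, then look at what follows it.
            std::size_t n = 0;
            while (i < cmd.size() && cmd[i] == '\\') { ++n; ++i; }
            if (i < cmd.size() && cmd[i] == '"') {
                cur.append(n / 2, '\\');
                if (n % 2 == 1) cur.push_back('"');  // escaped, literal quote
                else in_quotes = !in_quotes;         // quote opens or closes
                ++i;
            } else {
                cur.append(n, '\\');                 // backslashes are literal
            }
            have_arg = true;
        } else if (c == '"') {
            in_quotes = !in_quotes;
            have_arg = true;
            ++i;
        } else if ((c == ' ' || c == '\t') && !in_quotes) {
            if (have_arg) { argv.push_back(cur); cur.clear(); have_arg = false; }
            ++i;
        } else {
            cur.push_back(c);
            have_arg = true;
            ++i;
        }
    }
    if (have_arg) argv.push_back(cur);
    return argv;
}
```

Feeding this model the command line a.exe "a b\\c" yields a second argument that contains two backslashes, because neither backslash is followed by a quote. A quoting scheme that escapes every backslash therefore does not round-trip through this parser.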

On 16 November 2012 14:04 Yakov Galka [mailto:ybungalobill@gmail.com] wrote :-
[...]
I prefer concise, minimal and uniform interfaces. This implies: * Use only the set_args, no set_cmd_line. Rationale: consistent with POSIX and the standard argv[] passed to main. Removes the need of run_exe or parsing the set_cmd_line to retrieve the exe name from there.
+1 Whilst supporting platform specific behaviour is nice - I agree that doing this here would be confusing and make writing x-platform code harder
* Leave the behavior in case of embedded quotation marks unspecified. Do not escape quotation marks within the argument. Rationale: no problem on POSIX, there it does no parsing anyway. On windows this will increase the image (set theoretic) of set_args. In particular it will be possible to invoke both examples from above. Cons: It's user's responsibility to escape double quotation marks on windows. But hey, she is the only one who could know how to do it properly.
Ok
Continuing with your example:
std::vector<boost::filesystem::path::string_type> vArgs;
vArgs.push_back( foo_path.native() ); vArgs.push_back( "-f" );
Error on windows: cannot convert char[3] to std::wstring...
Oops - I'm too used to our internal utf8string and utf16string classes, which have implicit conversions. But you are right, and _T() style macro hacks are horrible...
Assuming we imbued a UTF-8 codecvt into boost filesystem (or adopted a policy that this is the default), then:
std::vector<std::string> vArgs; vArgs.push_back(foo_path.string()); vArgs.push_back("-f"); vArgs.push_back(somefile_path.string()); execute( args( vArgs ) );
just works.
Personally I do not like using filesystem::path for various reasons, I just use std::string for paths. So no calls to .string() will be in my code.
-- Yakov
I think I'm convinced by this argument - thanks for explaining, I didn't get what you meant before. Will have a think over the weekend whether this would support all our use cases. Alex

On Fri, 16 Nov 2012 15:03:45 +0100, Yakov Galka <ybungalobill@gmail.com> wrote:
[...]I prefer concise, minimal and uniform interfaces. This implies: * Use only the set_args, no set_cmd_line. Rationale: consistent with POSIX and the standard argv[] passed to main. Removes the need of run_exe or parsing the set_cmd_line to retrieve the exe name from there. * Leave the behavior in case of embedded quotation marks unspecified. Do not escape quotation marks within the argument.
set_args() is currently using boost::io::quoted() if a space is found in an argument. If I understand correctly, you propose dropping this function call? Shall set_args() still check for a space and eventually wrap an argument in quotes? Or you think this should also be done by the library user? Boris
[...]

On Sat, Nov 17, 2012 at 1:39 AM, Boris Schaeling <boris@highscore.de> wrote:
On Fri, 16 Nov 2012 15:03:45 +0100, Yakov Galka <ybungalobill@gmail.com> wrote:
[...]I prefer concise, minimal and uniform interfaces. This implies:
* Use only the set_args, no set_cmd_line. Rationale: consistent with POSIX and the standard argv[] passed to main. Removes the need of run_exe or parsing the set_cmd_line to retrieve the exe name from there. * Leave the behavior in case of embedded quotation marks unspecified. Do not escape quotation marks within the argument.
set_args() is currently using boost::io::quoted() if a space is found in an argument. If I understand correctly, you propose dropping this function call?
Yes. The current behavior limits the functionality, as the examples I provided demonstrate.
Shall set_args() still check for a space and eventually wrap an argument in quotes? Or you think this should also be done by the library user?
The former. Why? Using the same criteria as before: wrapping the argument in quotes means that one will not be able to specify the whitespace between the quoted arguments. However, I think that it is a reasonable assumption that it does not matter what kind of whitespace separates the arguments. Take the above with a grain of salt. -- Yakov
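The behaviour agreed on above can be sketched roughly as follows. This is only an illustration in standard C++, not code from Boost.Process: join_args is a hypothetical helper name, and the sketch assumes that "whitespace" means a space or a tab. An argument containing whitespace is wrapped in double quotes; embedded quotation marks are passed through untouched, so escaping them remains the caller's job.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost.Process): joins arguments into one
// command line. An argument that contains a space or a tab is wrapped in
// double quotes. Embedded quotation marks are deliberately NOT escaped.
std::string join_args(const std::vector<std::string>& args)
{
    std::string cmd_line;
    for (const std::string& arg : args) {
        if (!cmd_line.empty())
            cmd_line += ' ';  // arguments are always separated by one space
        if (arg.find_first_of(" \t") != std::string::npos)
            cmd_line += '"' + arg + '"';
        else
            cmd_line += arg;
    }
    return cmd_line;
}

// Usage sketch: only the argument containing a space gets quoted.
inline void join_args_example()
{
    assert(join_args({"test.exe", "-f", "my file.txt"}) ==
           "test.exe -f \"my file.txt\"");
}
```

Note the trade-off Yakov mentions: the separator between arguments is always a single space, so the caller can no longer choose the whitespace between quoted arguments.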

On 13/11/12 20:14, Boris Schaeling wrote:
On Wed, 07 Nov 2012 23:30:25 +0100, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
Quick questions after having read the doc: 1. Does error handling works with processes we launch and don't wait for too? I assume yes.
Yes.
2. I remember pointing before that in this example:
execute( run_exe("test.exe"), set_cmd_line("test --foo /bar"));
There is repetition of the executable name. I suggested allowing to set_cmd_line() without run_exe(). Did you implement this possibility?
No. But I agree with you that it would be nice to have. I had to draw a line somewhere though if we ever want to get a process management library into Boost. (But drop me a mail if you like to contribute a function which can parse the executable name and ideally considers quotes, blanks and escape characters. :)
3."On Windows a relative path is relative to the work directory of the parent process. On POSIX a relative path is relative to the work directory set with start_in_dir as the directory is changed before the program starts." Does this means that using start_in_dir on Windows have no effect?
start_in_dir() works on all platforms. It's just that on POSIX this happens:
chdir("bla"); execve("../foo", ...);
If the path to the executable is a relative one, it's relative to the new work directory as the directory is changed before execve() is called.
Does the hint in the docs make sense now? Or I should rewrite it to make it clearer?
I guess that in this case the library should either allow only absolute paths or transform the relative path into an absolute path so that it is not subject to interpretations that depend on the platform. Could this be possible? Best, Vicente

On Thu, 15 Nov 2012 18:59:11 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]I guess that in this case the library should either allow only absolute paths or transform the relative path on an absolute path so that it is not subject to interpretations that depend on the platform. Could this be possible?
We could call boost::filesystem::absolute() to turn a relative path into an absolute one. I'm not sure where to put this call though. If it's put into start_in_dir to make an absolute path just before chdir() is called, it won't work with this code:
execute(start_in_dir("bla"), run_exe("../foo"));
Here start_in_dir is run before run_exe. So there is no path yet to turn into an absolute one. But if we put the call into run_exe, this could be surprising:
run_exe r("../foo"); chdir("bar"); execute(r, start_in_dir("bla"));
../foo will be relative to whatever the current work directory was when run_exe was instantiated and not relative to bar. But then all those directory changes are confusing anyway. So maybe it makes most sense to call boost::filesystem::absolute() in the constructor of run_exe? Boris

On 17/11/12 01:04, Boris Schaeling wrote:
On Thu, 15 Nov 2012 18:59:11 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]I guess that in this case the library should either allow only absolute paths or transform the relative path on an absolute path so that it is not subject to interpretations that depend on the platform. Could this be possible?
We could call boost::filesystem::absolute() to turn a relative path into an absolute one. I'm not sure where to put this call though. If it's put into start_in_dir to make an absolute path just before chdir() is called, it won't work with this code:
execute(start_in_dir("bla"), run_exe("../foo"));
Here start_in_dir is run before run_exe. So there is no path yet to turn into an absolute one. But if we put the call into run_exe, this could be surprising:
run_exe r("../foo"); chdir("bar"); execute(r, start_in_dir("bla"));
../foo will be relative to whatever the current work directory was when run_exe was instantiated and not relative to bar. But then all those directory changes are confusing anyway. So maybe it makes most sense to call boost::filesystem::absolute() in the constructor of run_exe?
Maybe 'execute' could take care of the normalization :) Best, Vicente
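A minimal sketch of what such a normalization step might look like, assuming C++17 and using std::filesystem in place of boost::filesystem. resolve_exe is a hypothetical helper name, not part of Boost.Process. The idea is that the launcher resolves the executable path once, against an explicitly chosen base directory, before any chdir() happens, so the result no longer depends on the platform or on the order of the initializers.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper (not part of Boost.Process): resolves 'exe' against
// 'base' purely lexically. 'base' is expected to be an absolute directory,
// for example the parent's fs::current_path() captured at launch time.
fs::path resolve_exe(const fs::path& exe, const fs::path& base)
{
    if (exe.is_absolute())
        return exe;                          // nothing to resolve
    return (base / exe).lexically_normal();  // collapses "." and ".."
}

// Usage sketch: "../foo" resolved against "/work/bar" names "/work/foo",
// regardless of which work directory the child is later started in.
inline void resolve_exe_example()
{
    assert(resolve_exe("../foo", "/work/bar") == fs::path("/work/foo"));
}
```

Whether the base should be the parent's work directory or the one passed to start_in_dir is exactly the design question discussed above; the sketch only makes that choice explicit instead of leaving it to the platform.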

On 07/11/12 21:47, Boris Schaeling wrote:
I've uploaded a new version of Boost.Process 0.5 to <http://www.highscore.de/boost/process0.5/>:
- Made child movable and non-copyable on Windows (destructor closes handles) - Added cross-platform helpers in a new header file boost/process/mitigate.hpp - Updated and improved documentation (eg. FAQ is new) - Added the POSIX initializer notify_io_service (Boost.Asio helper) - Minor bug fixes and improvements (eg. in search_path() on Windows)
At this point there is nothing left on my todo list. I got quite a lot of feedback so far and would appreciate if people can look at this new version again. If there are no major shortcomings found, I'll ask for a formal review.
Hi, I can admit that there could be some extension for specific platforms, but from the documentation it seems that the common part is not enough to make portable programs. Examples are, cleaning up resources, handling errors, the relative path, inheriting env variables, asynchronous i/o, child destructor, return code for wait_for_exit, ...
The mix of posix and windows specific in class child is really troubling
struct child {
// construct/copy/destruct
explicit child(const PROCESS_INFORMATION &);
explicit child(pid_t);
// public member functions
HANDLE process_handle() const;
// public data members
PROCESS_INFORMATION proc_info;
pid_t pid;
};
Are these really public members? Best, Vicente

On Wed, 07 Nov 2012 23:29:09 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]I can admit that there could be some extension for specific platforms, but from the documentation it seems that the common part is not enough to make portable programs. Examples are, cleaning up resources, handling errors, the relative path, inheriting env variables, asynchronous i/o, child destructor, return code for wait_for_exit, ...
If you mean with portable programs not having to use #ifdefs in user code, then everything in the list above can now be done without #ifdefs. The only use case where I can't offer a solution without #ifdefs is waiting asynchronously for processes to exit. But then the solution offered is at least based on a Boost library.
The mix of posix and windows specific in class child is really troubling
struct child {
// construct/copy/destruct
explicit child(const PROCESS_INFORMATION &);
explicit child(pid_t);
// public member functions
HANDLE process_handle() const;
// public data members
PROCESS_INFORMATION proc_info;
pid_t pid;
};
Are these really public members?
The constructors allow you to instantiate child if you got process information or a pid for example from a third-party library. process_handle() is used by wait_for_exit() and terminate(). And proc_info and pid are public for convenience. If there is a preference to make the member variables private and define accessors - no objection from me. Thanks for your feedback, Boris

On 13/11/12 20:38, Boris Schaeling wrote:
On Wed, 07 Nov 2012 23:29:09 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]I can admit that there could be some extension for specific platforms, but from the documentation it seems that the common part is not enough to make portable programs. Examples are, cleaning up resources, handling errors, the relative path, inheriting env variables, asynchronous i/o, child destructor, return code for wait_for_exit, ...
If you mean with portable programs not having to use #ifdefs in user code, then everything in the list above can now be done without #ifdefs.
Could you then update the documentation so that there are no such dependencies on the underlying platform?
The only use case where I can't offer a solution without #ifdefs is waiting asynchronously for processes to exit. But then the solution offered is at least based on a Boost library.
Yes, I see that Boost.Asio has not yet managed to provide a platform-independent abstraction (I'm wondering how this feature appears in the proposal to the C++ standard committee). Anyway, could you explain why you need to use a non-portable feature?
The mix of posix and windows specific in class child is really troubling
struct child {
// construct/copy/destruct
explicit child(const PROCESS_INFORMATION &);
explicit child(pid_t);
// public member functions
HANDLE process_handle() const;
// public data members
PROCESS_INFORMATION proc_info;
pid_t pid;
};
Are these really public members?
The constructors allow you to instantiate child if you got process information or a pid for example from a third-party library. process_handle() is used by wait_for_exit() and terminate().
This can be hidden as an implementation detail (use of friend).
And proc_info and pid are public for convenience. If there is a preference to make the member variables private and define accessors - no objection from me.
You could define a implementation defined native_handle typedef
struct child {
typedef platform-specific-type native_handle_type;
// construct/copy/destruct
explicit child(native_handle);
native_handle_type native_handle();
};
HTH, Vicente

On Tue, 13 Nov 2012 20:57:36 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]Could you then update the documentation so that there are no such dependencies on the underlying platform?
There are notes referring to boost/process/mitigate.hpp wherever #ifdefs are used in the tutorial (the asynchronous waiting example excluded). The reason why I didn't use the helpers from that header file in the tutorial directly: They don't work necessarily as expected in all cases like the rest of the library. For instance, while boost::process::pipe_end works for the examples in the tutorial it's two different types on POSIX and Windows. And Boost.Process makes no guarantee that these two types behave consistently on POSIX and Windows (especially not as the types are from another Boost library). So I'm trying to make it clear that there is a different quality of service: Most features are fully supported, 100% cross-platform and safe but a few aren't. And you use these few on your own risk. This isn't done to make cross-platform code harder to write but to help users to make a conscious decision and avoid shooting in their own leg. It is of course possible to rearrange the documentation. For instance, all examples with asynchronous operations are in that sense problematic. They could be removed from the tutorial and put somewhere else in the documentation. But I'm not sure whether that makes a difference in the end.
[...]Yes, I see that Boost.Asio has not reached to provide a platform independent abstraction (I'm wondering how this feature appear in the proposal to the standard C++ committe). Any way, could you explain why do you need to use a non portable feature?
To wait asynchronously on POSIX I use boost::asio::signal_set to register a signal handler for SIGCHLD. This is a global setting and must be done before a child process is created (actually exits). To wait asynchronously on Windows I use boost::asio::windows::object_handle to wait on a child process handle. This is a per-child setting and can only be done after the child process has been created (as only then we have the handle to wait on). For a cross-platform solution we would need to merge these two concepts somehow: A global setting before child processes are created on POSIX and a per-child setting after child processes are created on Windows. Here POSIX and Windows are unfortunately totally different.
[...]You could define a implementation defined native_handle typedef
struct child {
typedef platform-specific-type native_handle_type;
// construct/copy/destruct
explicit child(native_handle);
native_handle_type native_handle();
};
Would be fine with me. I wish the problems we have to deal with around Boost.Process were all like this. ;) Boris
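One way the native_handle suggestion might look in compilable form is sketched below. This is an illustration only, not the Boost.Process class: platform_handle is a hypothetical alias, with void* standing in for the Windows HANDLE and int standing in for pid_t so that the sketch builds without platform headers. The platform-specific part is confined to a single typedef; the rest of the interface is identical everywhere.

```cpp
#include <cassert>
#include <type_traits>

// Assumption for this sketch: stand-ins that avoid including <windows.h>
// or <sys/types.h>. A real library would use HANDLE and pid_t directly.
#if defined(_WIN32)
using platform_handle = void*;  // stands in for HANDLE
#else
using platform_handle = int;    // stands in for pid_t
#endif

// Hypothetical class: the only platform-specific member is the typedef.
class child
{
public:
    typedef platform_handle native_handle_type;

    explicit child(native_handle_type handle) : handle_(handle) {}

    native_handle_type native_handle() const { return handle_; }

private:
    native_handle_type handle_;  // no longer a public data member
};

// The constructor is explicit, so a raw handle never converts silently.
static_assert(!std::is_convertible<child::native_handle_type, child>::value,
              "child must be constructed explicitly from a native handle");

// Usage sketch: code that touches native_handle() is visibly non-portable.
inline void native_handle_example()
{
    child c{child::native_handle_type{}};
    assert(c.native_handle() == child::native_handle_type{});
}
```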

On 15/11/12 01:09, Boris Schaeling wrote:
On Tue, 13 Nov 2012 20:57:36 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]Could you then update the documentation so that there are no such dependencies on the underlying platform?
There are notes referring to boost/process/mitigate.hpp wherever #ifdefs are used in the tutorial (the asynchronous waiting example excluded). The reason why I didn't use the helpers from that header file in the tutorial directly: They don't work necessarily as expected in all cases like the rest of the library.
For instance, while boost::process::pipe_end works for the examples in the tutorial it's two different types on POSIX and Windows. And Boost.Process makes no guarantee that these two types behave consistently on POSIX and Windows (especially not as the types are from another Boost library). So I'm trying to make it clear that there is a different quality of service: Most features are fully supported, 100% cross-platform and safe but a few aren't. And you use these few on your own risk. This isn't done to make cross-platform code harder to write but to help users to make a conscious decision and avoid shooting in their own leg.
It is of course possible to rearrange the documentation. For instance, all examples with asynchronous operations are in that sense problematic. They could be removed from the tutorial and put somewhere else in the documentation. But I'm not sure whether that makes a difference in the end.
IMHO, the documentation must state clearly in a core section what can be done in a portable way. Then add a non-portable section covering each one of the differences. This will be a first step. Then of course, I would expect the library to make the last section no bigger than the first one ;-)
[...]Yes, I see that Boost.Asio has not reached to provide a platform independent abstraction (I'm wondering how this feature appear in the proposal to the standard C++ committe). Any way, could you explain why do you need to use a non portable feature?
To wait asynchronously on POSIX I use boost::asio::signal_set to register a signal handler for SIGCHLD. This is a global setting and must be done before a child process is created (actually exits). To wait asynchronously on Windows I use boost::asio::windows::object_handle to wait on a child process handle. This is a per-child setting and can only be done after the child process has been created (as only then we have the handle to wait on). For a cross-platform solution we would need to merge these two concepts somehow: A global setting before child processes are created on POSIX and a per-child setting after child processes are created on Windows. Here POSIX and Windows are unfortunately totally different.
Couldn't the library store temporarily some data so that the user is not aware of the difference?
[...]You could define a implementation defined native_handle typedef
struct child {
typedef platform-specific-type native_handle_type;
// construct/copy/destruct
explicit child(native_handle);
native_handle_type native_handle();
};
Would be fine with me. I wish the problems we have to deal with around Boost.Process were all like this. ;)
I understand the subject is not easy, but I have the impression that the portability responsibility has been left to the user. Best, Vicente

On Thu, 15 Nov 2012 07:39:19 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]
To wait asynchronously on POSIX I use boost::asio::signal_set to register a signal handler for SIGCHLD. This is a global setting and must be done before a child process is created (actually exits). To wait asynchronously on Windows I use boost::asio::windows::object_handle to wait on a child process handle. This is a per-child setting and can only be done after the child process has been created (as only then we have the handle to wait on). For a cross-platform solution we would need to merge these two concepts somehow: A global setting before child processes are created on POSIX and a per-child setting after child processes are created on Windows. Here POSIX and Windows are unfortunately totally different.
Couldn't the library store temporarily some data so that the user is not aware of the difference?
I'm actually pretty happy with the current solution as I had to create the I/O object boost::asio::windows::object_handle first. :) Before that it wasn't possible to wait asynchronously on Windows. While we don't have cross-platform code unfortunately (still #ifdefs required), we have at least cross-platform concepts. And in that sense I'd say another important goal has been met: The code is much easier to write and understand than having to call POSIX or Windows API functions yourself. Now it might be possible to build something on top of the Boost.Asio classes and like you proposed store temporarily data somewhere. But I'm not much further either. If anyone wants to give it a try and write some code - drop me a mail.
[...]I understand the subject is not easy, but I have the impression that the portability responsibility has been left to the user.
It's true that depending on your use case the current library can't help you to write 100% cross-platform code. I think we are pretty close though. More importantly I believe that what the library loses in cross-platform ability it easily wins in usability. I wouldn't want to write all that low-level code myself what is wrapped by Boost.Process (and partly by Boost.Asio). So I associate Boost.Process more with the goal of making C++ easier to learn and to teach. It would be even easier if no #ifdefs were required. But in my opinion 95% cross-platform ability and 100% usability is still better than 0% of everything. Now it could be that the general consensus in the Boost community is that a library must be 100% cross-platform in order to be accepted. If that's the case and the library is rejected on that ground - no problem for me. But the only way to find out is probably going through a review next? Boris

On 17/11/12 02:33, Boris Schaeling wrote:
On Thu, 15 Nov 2012 07:39:19 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
[...]
To wait asynchronously on POSIX I use boost::asio::signal_set to register a signal handler for SIGCHLD. This is a global setting and must be done before a child process is created (actually exits). To wait asynchronously on Windows I use boost::asio::windows::object_handle to wait on a child process handle. This is a per-child setting and can only be done after the child process has been created (as only then we have the handle to wait on). For a cross-platform solution we would need to merge these two concepts somehow: A global setting before child processes are created on POSIX and a per-child setting after child processes are created on Windows. Here POSIX and Windows are unfortunately totally different.
Couldn't the library store temporarily some data so that the user is not aware of the difference?
I'm actually pretty happy with the current solution as I had to create the I/O object boost::asio::windows::object_handle first. :) Before that it wasn't possible to wait asynchronously on Windows. While we don't have cross-platform code unfortunately (still #ifdefs required), we have at least cross-platform concepts. And in that sense I'd say another important goal has been met: The code is much easier to write and understand than having to call POSIX or Windows API functions yourself.
I agree. I have not looked at the implementation, so I don't know the constraints. I'm just trying to give you some hints on an alternative design that could take care of these portability issues. Of course, this is not easy, but if in the end the library manages to provide a portable interface, you could always provide platform-specific interfaces later that are more efficient.
Now it might be possible to build something on top of the Boost.Asio classes and like you proposed store temporarily data somewhere. But I'm not much further either. If anyone wants to give it a try and write some code - drop me a mail.
[...]I understand the subject is not easy, but I have the impression that the portability responsibility has been left to the user.
It's true that depending on your use case the current library can't help you to write 100% cross-platform code. I think we are pretty close though. More importantly I believe that what the library loses in cross-platform ability it easily wins in usability. I wouldn't want to write all that low-level code myself what is wrapped by Boost.Process (and partly by Boost.Asio). So I associate Boost.Process more with the goal of making C++ easier to learn and to teach. It would be even easier if no #ifdefs were required. But in my opinion 95% cross-platform ability and 100% usability is still better than 0% of everything.
Now it could be that the general consensus in the Boost community is that a library must be 100% cross-platform in order to be accepted. If that's the case and the library is rejected on that ground - no problem for me. But the only way to find out is probably going through a review next?
You have requested some feedback on your version 0.5. Some of us have suggested alternative designs. Not exploring them before the review will postpone the discussion to the short period of the review, which IMO is not good. Best, Vicente

On Nov 14, 2012, at 7:09 PM, "Boris Schaeling" <boris@highscore.de> wrote:
To wait asynchronously on POSIX I use boost::asio::signal_set to register a signal handler for SIGCHLD. This is a global setting and must be done before a child process is created (actually exits). To wait asynchronously on Windows I use boost::asio::windows::object_handle to wait on a child process handle. This is a per-child setting and can only be done after the child process has been created (as only then we have the handle to wait on). For a cross-platform solution we would need to merge these two concepts somehow: A global setting before child processes are created on POSIX and a per-child setting after child processes are created on Windows. Here POSIX and Windows are unfortunately totally different.
How about a global setting and then have the child class, on Windows, inspect that setting to decide whether to add the per-child wait? There's still the need to support signal handler chaining, optionally, on POSIX. ___ Rob

Rob Stewart <robertstewart@comcast.net> writes:
On Nov 14, 2012, at 7:09 PM, "Boris Schaeling" <boris@highscore.de> wrote:
To wait asynchronously on POSIX I use boost::asio::signal_set to register a signal handler for SIGCHLD. This is a global setting and must be done before a child process is created (actually exits). To wait asynchronously on Windows I use boost::asio::windows::object_handle to wait on a child process handle. This is a per-child setting and can only be done after the child process has been created (as only then we have the handle to wait on). For a cross-platform solution we would need to merge these two concepts somehow: A global setting before child processes are created on POSIX and a per-child setting after child processes are created on Windows. Here POSIX and Windows are unfortunately totally different.
How about a global setting and then have the child class, on Windows, inspect that setting to decide whether to add the per-child wait?
I was also going to suggest something like this. Yes, it is a lowest-common-denominator solution but that's the price you pay for portability. Alex

On Nov 13, 2012, at 2:38 PM, "Boris Schaeling" <boris@highscore.de> wrote:
On Wed, 07 Nov 2012 23:29:09 +0100, Vicente J. Botet Escriba <vicente.botet@wanadoo.fr> wrote:
// public data members
PROCESS_INFORMATION proc_info;
pid_t pid;
};
Are these really public members?
The constructors allow you to instantiate child if you got process information or a pid for example from a third-party library. process_handle() is used by wait_for_exit() and terminate(). And proc_info and pid are public for convenience. If there is a preference to make the member variables private and define accessors - no objection from me.
Public and const is not a problem. ___ Rob

"Boris Schaeling" <boris@highscore.de> writes:
I've uploaded a new version of Boost.Process 0.5 to <http://www.highscore.de/boost/process0.5/>:
- Made child movable and non-copyable on Windows (destructor closes handles)
... and from docs "On Windows child is a movable but non-copyable type. The destructor closes handles to the child process when the instance of child goes out of scope. On Windows it's not strictly required to call wait_for_exit to clean up resources." Does this mean child *is* copyable on other platforms? So the class has entirely different semantics on different platforms?
From the docs: "While inherit_env is required on POSIX, environment variables are also inherited without this initializer on Windows as on Windows environment variables are inherited by default."
I still don't get why you're letting this behaviour vary between platform. Sure, the underlying system calls have different defaults but they are both capable of both behaviours. The whole point of a platform-independent library is to align such variation. Alex

On Thu, 08 Nov 2012 01:08:21 +0100, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
[...]Does this mean child *is* copyable on other platforms? So the class has entirely different semantics on different platforms?
Yes. It doesn't own resources on POSIX. I understand what you hint at. I'm not sure though whether we can justify making child non-copyable for POSIX developers just because Windows developers can't copy it.
[...]I still don't get why you're letting this behaviour vary between platform. Sure, the underlying system calls have different defaults but they are both capable of both behaviours. The whole point of a platform-independent library is to align such variation.
If you use inherit_env(), you get the cross-platform behavior you are looking for. But I prefer not to break the "pay for what you use"-principle if you don't use inherit_env(). Besides, standardizing the default behavior requires you to pick one. But whether this should be always POSIX, always Windows, always the most convenient or always the easiest to implement - I'm afraid there will be always others who want another default behavior. Boris

On 13/11/12 22:39, Boris Schaeling wrote:
On Thu, 08 Nov 2012 01:08:21 +0100, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
[...]Does this mean child *is* copyable on other platforms? So the class has entirely different semantics on different platforms?
Yes. It doesn't own resources on POSIX.
I understand what you hint at. I'm not sure though whether we can justify making child non-copyable for POSIX developers just because Windows developers can't copy it.
Uniformity is better. The native_handle could be copyable and this could be platform specific. This makes it evident that the code is platform specific, as the type is platform specific.
Maybe you could also provide a child::id class that is copyable as e.g. std/boost::thread does.
[...]I still don't get why you're letting this behaviour vary between platform. Sure, the underlying system calls have different defaults but they are both capable of both behaviours. The whole point of a platform-independent library is to align such variation.
If you use inherit_env(), you get the cross-platform behavior you are looking for. But I prefer not to break the "pay for what you use"-principle if you don't use inherit_env(). Besides, standardizing the default behavior requires you to pick one. But whether this should be always POSIX, always Windows, always the most convenient or always the easiest to implement - I'm afraid there will be always others who want another default behavior.
One alternative is to make it inherit by default, so that a common behavior is ensured. Then you can provide a platform-specific constructor that allows you not to inherit the environment. Best, Vicente

On Nov 13, 2012, at 5:17 PM, "Vicente J. Botet Escriba" <vicente.botet@wanadoo.fr> wrote:
On 13/11/12 22:39, Boris Schaeling wrote:
On Thu, 08 Nov 2012 01:08:21 +0100, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
[...]Does this mean child *is* copyable on other platforms? So the class has entirely different semantics on different platforms?
Yes. It doesn't own resources on POSIX.
I understand what you hint at. I'm not sure though whether we can justify making child non-copyable for POSIX developers just because Windows developers can't copy it.
Uniformity is better.
+1 Why must child be copyable? Just because the underlying implementation permits it doesn't mean that you must permit it in your abstraction.
The native_handle could be copyable and this could be platform specific. This make evident that the code is platform specific as the type is platform specific.
Agreed
[...]I still don't get why you're letting this behaviour vary between platform. Sure, the underlying system calls have different defaults but they are both capable of both behaviours. The whole point of a platform-independent library is to align such variation.
If you use inherit_env(), you get the cross-platform behavior you are looking for. But I prefer not to break the "pay for what you use"-principle if you don't use inherit_env().
What cost is there, in each case, to deviate from the default? Is it possible that the Windows inherit default is actually costlier than disabling it? (I don't know how it's implemented or whether not inheriting just undoes work already done.)
Besides, standardizing the default behavior requires you to pick one. But whether this should be always POSIX, always Windows, always the most convenient or always the easiest to implement - I'm afraid there will be always others who want another default behavior.
One alternative is to make it inherit by default, so that a common behavior is ensured.
As you noted before, uniformity is better.
Then you can provide a platform-specific constructor, on the platforms that allow it, that does not inherit the environment.
How about an argument that defaults to whichever you like, but offers three choices: inherit, don't inherit, and don't care. Document the cost of the first two on each platform, and describe the third as giving the platform default. (If there's actually a cost inherent to Windows' default, "don't inherit" could be preferable, and you wouldn't need the "don't care" option.) ___ Rob
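Rob's three-way argument could be sketched roughly as follows. This is only an illustration: `env_policy` and `resolves_to_inherit` are hypothetical names, not part of Boost.Process, and the platform defaults encoded here (Windows inherits, the POSIX implementation does not) are an assumption taken from how the thread describes them.

```cpp
#include <cassert>

// Hypothetical policy type: the three choices proposed above.
enum class env_policy { inherit, no_inherit, platform_default };

// Decide whether the child receives the parent's environment.
// windows_target is passed in explicitly so that both platform
// defaults can be exercised on any host.
inline bool resolves_to_inherit(env_policy policy, bool windows_target) {
    switch (policy) {
        case env_policy::inherit:
            return true;             // same result on every platform
        case env_policy::no_inherit:
            return false;            // same result on every platform
        case env_policy::platform_default:
            return windows_target;   // assumed defaults, see lead-in
    }
    assert(false && "unknown env_policy value");
    return false;
}
```

Only `platform_default` varies between platforms; code that names one of the other two values behaves the same everywhere.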

On 2012-11-14 11:00, Rob Stewart wrote:
Uniformity is better. +1
Why must child be copyable? Just because the underlying implementation permits it doesn't mean that you must permit it in your abstraction.
+1 I'd even say uniformity is a requirement for the abstraction (except where it is impossible to achieve)

On Wed, Nov 14, 2012 at 11:00 AM, Rob Stewart <robertstewart@comcast.net> wrote:
Why must child be copyable? Just because the underlying implementation permits it doesn't mean that you must permit it in your abstraction.
I don't understand why it should be either. To me it should be movable only and, as someone suggested, the id could be copyable as it's only a value. As others suggested, I will not agree with the Boost.Process interface if it doesn't provide consistent behaviour between platforms, even if that has different costs on different platforms. The point is to write the code one way and then, if you want, add platform-specific code on top. In my case I don't care how it works inside as long as I don't have to write separate code for different platforms. I thought that was the primary goal of the library? The suggestions of a default behaviour plus optional constructor arguments for platform-specific behaviour seem to go in the right direction. Joel Lamotte
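A move-only child whose id stays a plain copyable value might look like the sketch below. It is not the Boost.Process 0.5 `child` class: `movable_child`, `process_id` and the release counter are hypothetical stand-ins, and the counter only simulates closing a native handle so the ownership transfer can be observed on any platform.

```cpp
#include <type_traits>
#include <utility>

typedef int process_id;  // hypothetical: a plain value, freely copyable

class movable_child {
public:
    // release_count stands in for "close the native handle".
    explicit movable_child(process_id id, int* release_count = nullptr)
        : id_(id), owns_(true), release_count_(release_count) {}

    movable_child(const movable_child&) = delete;
    movable_child& operator=(const movable_child&) = delete;

    movable_child(movable_child&& other) noexcept
        : id_(other.id_), owns_(other.owns_),
          release_count_(other.release_count_) {
        other.owns_ = false;  // the source no longer owns the resource
    }

    movable_child& operator=(movable_child&& other) noexcept {
        if (this != &other) {
            release();
            id_ = other.id_;
            owns_ = other.owns_;
            release_count_ = other.release_count_;
            other.owns_ = false;
        }
        return *this;
    }

    ~movable_child() { release(); }

    process_id id() const { return id_; }  // copying the id is harmless
    bool owns() const { return owns_; }

private:
    void release() {
        if (owns_ && release_count_) ++*release_count_;
        owns_ = false;
    }

    process_id id_;
    bool owns_;
    int* release_count_;
};

// The same semantics on every platform: movable, never copyable.
static_assert(!std::is_copy_constructible<movable_child>::value,
              "child must not be copyable");
static_assert(std::is_move_constructible<movable_child>::value,
              "child must be movable");
```

With this shape the resource is released exactly once, by whichever object owns it last, while `id()` can be copied around freely.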

On 07 November 2012 20:48 Boris Schaeling [mailto:boris@highscore.de] wrote :-
I've uploaded a new version of Boost.Process 0.5 to <http://www.highscore.de/boost/process0.5/>:
...
At this point there is nothing left on my todo list. I got quite a lot of feedback so far and would appreciate if people can look at this new version again. If there are no major shortcomings found, I'll ask for a formal review.
Just took a quick look (scan of the documentation rather than trying anything) and it certainly looks much cleaner and better featured than our roll-your-own x-platform process handling that we use internally.
However one thing that stops your library being a direct replacement of our own stuff (I hesitate to use the word library for it) is the cleaning up of resources on POSIX.
The problem we have is that not all child processes are equal - some are important and we care about success or otherwise of their operations - others are more throw away.
I appreciate that the problem here lies in the underlying signal handling on posix and any application of large enough complexity has probably taken some application wide approach to signals and turned them into something slightly saner for use in a C++ world but still ....
Would it be possible for boost process to provide at least a partial solution to this?
For example provide a signal handler which can be installed on SIGCHLD which keeps a record of the previous handler and a container of "don't care about" pid's - when this signal is raised iterate through the pids reaping any results (waitpid(*it, &result, WNOHANG) and removing from the list ) - if the signal doesn't appear to come from one of our ignored pid's then call the previous handler if valid.
Whilst this isn't a difficult handler to write there are probably a few gotcha's with locking and initialisation so would be nice to have a version already rolled in the library and it might make a nice feature for simple usage.
    execute( run_exe("test.exe"), no_wait() );
or some such - which could then be x-platform and avoid build-up of zombie processes without having to do :-
    #if defined(BOOST_POSIX_API)
    signal(SIGCHLD, SIG_IGN);
    #endif
    execute(run_exe("test.exe"));
as recommended in the documentation (and also would allow other areas of the application to use asio or some such to handle asynchronous results for different children which just ignoring SIGCHLD would break)
Since implementation would probably require a static container doing this in a header only library is possibly more hassle than it's worth but just thought I'd ask. Alex
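The handler Alex describes could be sketched along these lines for POSIX. Everything here is an assumption made for illustration: the names (`install_reaper`, `ignore_child`, `is_ignored`), the fixed-size table used instead of a container so that the handler stays async-signal-safe, and the choice to chain only when nothing was reaped. It is not something Boost.Process provides.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstring>

const int max_ignored = 64;                  // hypothetical fixed capacity
static volatile pid_t ignored[max_ignored];  // 0 marks a free slot
static struct sigaction previous_action;     // what was installed before us

// Reap every registered child that has already exited. Only waitpid()
// is called, which POSIX lists as async-signal-safe.
static int reap_ignored() {
    int saved_errno = errno;
    int reaped = 0;
    for (int i = 0; i < max_ignored; ++i) {
        pid_t pid = ignored[i];
        if (pid == 0) continue;
        int status = 0;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) { ignored[i] = 0; ++reaped; }
        else if (r == -1 && errno == ECHILD) ignored[i] = 0;  // not ours
    }
    errno = saved_errno;
    return reaped;
}

// SIGCHLD handler: reap "don't care" children, otherwise chain.
static void on_sigchld(int sig, siginfo_t* info, void* context) {
    if (reap_ignored() > 0) return;
    if (previous_action.sa_flags & SA_SIGINFO) {
        previous_action.sa_sigaction(sig, info, context);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(sig);
    }
}

// Install the handler and remember the previous one for chaining.
bool install_reaper() {
    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, &previous_action) == 0;
}

// Register a child whose exit status nobody will ask for.
bool ignore_child(pid_t pid) {
    assert(pid > 0);
    sigset_t block, old;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);  // keep the handler out meanwhile
    bool stored = false;
    for (int i = 0; i < max_ignored && !stored; ++i)
        if (ignored[i] == 0) { ignored[i] = pid; stored = true; }
    reap_ignored();  // the child may have exited before it was registered
    sigprocmask(SIG_SETMASK, &old, 0);
    return stored;
}

// True while the child is registered and has not been reaped yet.
bool is_ignored(pid_t pid) {
    for (int i = 0; i < max_ignored; ++i)
        if (ignored[i] == pid) return true;
    return false;
}
```

Because several SIGCHLD signals can be merged into one delivery, a real implementation would probably want to chain to the previous handler more eagerly than this sketch does. A multi-threaded program would also use pthread_sigmask rather than sigprocmask.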

I just included boost/process.hpp and now VS is looking for the boost iostream library .lib file. Is boost iostream always a requirement with boost process? I thought it was a requirement only when using io between processes? Joel Lamotte

On Sat, 10 Nov 2012 02:06:53 +0100, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
I just included boost/process.hpp and now VS is looking for the boost iostream library .lib file. Is boost iostream always a requirement with boost process? I thought it was a requirement only when using io between processes?
Boost.Iostreams types are used by the initializers bind_stdin, bind_stdout and bind_stderr. If you don't use these initializers and pull in header files one by one, I think it's possible to get rid of this dependency. Boris

On Thu, 08 Nov 2012 16:19:14 +0100, Alex Perry <Alex.Perry@smartlogic.com> wrote:
[...]Would it be possible for boost process to provide at least a partial solution to this?
Do you have code you can show me (if you prefer in a mail)? I understand you are using what you described in your own application?
[...]Since implementation would probably require a static container doing this in a header only library is possibly more hassle than it's worth but just thought I'd ask.
No problem. As of today I'm a bit more interested to see though whether we can get to an agreement in Boost whether the current version 0.5 is the way to go (not sure how much you know about this library; but it's now in total six years, two GSoC projects, one sponsorship, one BoostCon presentation, countless drafts and months of discussions on this mailing list that we are trying to get at least something done :). Boris

On Nov 8, 2012, at 10:19 AM, Alex Perry <Alex.Perry@smartlogic.com> wrote:
On 07 November 2012 20:48 Boris Schaeling [mailto:boris@highscore.de] wrote :-
At this point there is nothing left on my todo list. I got quite a lot of feedback so far and would appreciate if people can look at this new version again. If there are no major shortcomings found, I'll ask for a formal review.
However one thing that stops your library being a direct replacement of our own stuff (I hesitate to use the word library for it) is the cleaning up of resources on POSIX.
The problem we have is that not all child processes are equal - some are important and we care about success or otherwise of their operations - others are more throw away.
I appreciate that the problem here lies in the underlying signal handling on posix and any application of large enough complexity has probably taken some application wide approach to signals and turned them into something slightly saner for use in a C++ world but still ....
Would it be possible for boost process to provide at least a partial solution to this?
For example provide a signal handler which can be installed on SIGCHLD which keeps a record of the previous handler and a container of "don't care about" pid's - when this signal is raised iterate through the pids reaping any results (waitpid(*it, &result, WNOHANG) and removing from the list ) - if the signal doesn't appear to come from one of our ignored pid's then call the previous handler if valid.
Whilst this isn't a difficult handler to write there are probably a few gotcha's with locking and initialisation so would be nice to have a version already rolled in the library and it might make a nice feature for simple usage.
    execute( run_exe("test.exe"), no_wait() );
or some such - which could then be x-platform and avoid build-up of zombie processes without having to do :-
    #if defined(BOOST_POSIX_API)
    signal(SIGCHLD, SIG_IGN);
    #endif
    execute(run_exe("test.exe"));
as recommended in the documentation (and also would allow other areas of the application to use asio or some such to handle asynchronous results for different children which just ignoring SIGCHLD would break)
If Boost.Process provided a hook for installing a SIGCHLD signal handler, users could choose when and how to install it. For example an overloaded function/ctor can take a boost::function that can be called with Boost.Process' signal handler, leaving it to the caller to register it. That handler, and the boost::function overload can have two forms: one that chains and one that doesn't.
Since implementation would probably require a static container doing this in a header only library is possibly more hassle than it's worth but just thought I'd ask.
In the spirit of don't pay for what you don't use, the library can be header only unless one uses this particular feature. ___ Rob

On Wed, Nov 7, 2012 at 10:47 PM, Boris Schaeling <boris@highscore.de> wrote:
I've uploaded a new version of Boost.Process 0.5 to <http://www.highscore.de/boost/process0.5/>:
I do not see the documentation mentioning thread safety, although the implementation *is not* thread safe. The following is original research done by me when implementing my own process library. I could not find any references for this issue, and in fact for most applications this is not a problem. Yet I think it is my duty to share this, and as a library author, you should give it a thought.
Consider the following function (taken directly from the FAQ):
    void f(const std::string &path)
    {
        boost::process::pipe p = create_pipe();
        { // 1
            file_descriptor_sink sink(p.sink, close_handle);
            execute(
                run_exe(path),
                bind_stdout(sink)
            ); // 2
        } // 3
        file_descriptor_source source(p.source, close_handle);
        stream<file_descriptor_source> is(source);
        std::string s;
        while (is >> s)
            std::cout << s << std::endl;
    }
Let there be two threads A and B, one calling f("a.exe") and the other f("b.exe"). Assume a.exe is a long-living program and b.exe a short-living one. (A more realistic scenario is that we launch a.exe, do some short communication, and then detach from it, that is, close the pipe on the parent side and let the child run indefinitely. For example it may be an external app launched by the user from a GUI thread. b.exe may be some other utility used internally by our program, launched by a background thread.)
Now, consider the following execution order: A1, B1, A2, B2, A3, B3. The result is that the write-end of the pipe of b.exe is inherited by a.exe too (i.e. 'leaked' into a.exe). b.exe exits, closing its write-end of the pipe. However, a.exe still holds (indefinitely) the write-end of b's pipe! Result: thread B will hang until the unrelated process a.exe exits (and perhaps all the children that a.exe created in its own turn).
Moral: The idea that the inheritability of the handle is a global state is fundamentally flawed. It was fine on UNIX when one process meant one thread, but it just does not fit into the world of multi-threaded applications + synchronous i/o. I guess Windows inherited the concept from UNIX.
How can it be solved? Well, on Windows, I know only one way to ensure that a handle gets inherited directly by one process only. Unfortunately it relies on undocumented features (create suspended process, DuplicateHandle, Read/WriteProcessMemory, resume process). Something similar should be implementable on other platforms too. Is it worth it? I do not know. Perhaps just declare binding child i/o to be thread unsafe.
DISCLAIMER: Everything said above was verified on Windows, although not with this library, but the test code did essentially the same. I *am not* sure about other platforms, but to my understanding POSIX suffers from exactly the same problem. If someone can refute my claims, please enlighten me. Hope this helps, -- Yakov

On Tue, 13 Nov 2012 23:42:08 +0100, Yakov Galka <ybungalobill@gmail.com> wrote:
[...]Now, consider the following execution order: A1, B1, A2, B2, A3, B3. The result is that the write-end of the pipe of b.exe is inherited by a.exe too (i.e. 'leaked' into a.exe). b.exe exits, closing its write-end of the pipe. However, a.exe still holds (indefinitely) the write-end of b's pipe! Result: thread B will hang until the unrelated process a.exe exits (and perhaps all the children that a.exe created in its own turn).
On Windows create_pipe() creates non-inheritable handles for the pipe. They only become inheritable when an initializer like bind_stdout is used. Do you think this solves the problem you outlined? On POSIX one could use initializers like close_fds or close_fds_if to explicitly close file descriptors for child processes. Then no "leak" should be possible either. I don't know whether that would be enough to turn Boost.Process into a thread-safe library (I didn't think about thread-safety yet). But then your example could be worth adding to the FAQ? Boris
[...]

On Wed, Nov 14, 2012 at 2:24 AM, Boris Schaeling <boris@highscore.de> wrote:
On Tue, 13 Nov 2012 23:42:08 +0100, Yakov Galka <ybungalobill@gmail.com> wrote:
[...]Now, consider the following execution order: A1, B1, A2, B2, A3, B3. The result is that the write-end of the pipe of b.exe is inherited by a.exe too (i.e. 'leaked' into a.exe). b.exe exits, closing its write-end of the pipe. However, a.exe still holds (indefinitely) the write-end of b's pipe! Result: thread B will hang until the unrelated process a.exe exits (and perhaps all the children that a.exe created in its own turn).
On Windows create_pipe() creates non-inheritable handles for the pipe. They only become inheritable when an initializer like bind_stdout is used. Do you think this solves the problem you outlined?
No, this is *not* enough. There is a race condition between SetHandleInformation and CreateProcess. That is, the code is essentially equivalent to:
    SetHandleInformation(h, HANDLE_FLAG_INHERIT, HANDLE_FLAG_INHERIT); // 1
    CreateProcess(...)
If during (1) you call CreateProcess from another thread (with inheritHandles set to true), then h is leaked to that other process.
On POSIX one could use initializers like close_fds or close_fds_if to explicitly close file descriptors for child processes. Then no "leak" should be possible either.
I don't know whether that would be enough to turn Boost.Process into a thread-safe library (I didn't think about thread-safety yet). But then your example could be worth adding to the FAQ?
To my understanding on POSIX the situation is even worse, because by default you inherit *all* descriptors, except those that you explicitly mark to be closed. The API you need is basically the opposite: inherit none except those you really want. -- Yakov
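On POSIX, one common way to approximate "inherit none except those you really want" is to mark descriptors close-on-exec and hand the child only what it needs with dup2 between fork and exec. The sketch below illustrates that idea with plain POSIX calls; it is not how Boost.Process 0.5 works, and `capture_stdout` and `set_cloexec` are hypothetical helper names. Setting the flag with fcntl after pipe() still leaves a short window in which another thread could fork; where it is available, pipe2 with O_CLOEXEC creates the pipe with the flag already set.

```cpp
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <string>
#include <vector>

// Mark fd so that it is closed automatically in any process that execs.
static bool set_cloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    return flags != -1 && fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}

// Run args[0] (an absolute path) with the given arguments and return
// everything it writes to stdout. Returns an empty string on failure.
std::string capture_stdout(const std::vector<std::string>& args) {
    assert(!args.empty());
    std::vector<char*> argv;  // built before fork(): no allocation in child
    for (std::size_t i = 0; i < args.size(); ++i)
        argv.push_back(const_cast<char*>(args[i].c_str()));
    argv.push_back(0);

    int fds[2];
    if (pipe(fds) == -1) return std::string();
    set_cloexec(fds[0]);
    set_cloexec(fds[1]);

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return std::string();
    }
    if (pid == 0) {
        // The duplicate on fd 1 does not carry FD_CLOEXEC, so it is the
        // only pipe descriptor that survives the exec below.
        dup2(fds[1], STDOUT_FILENO);
        execv(argv[0], &argv[0]);
        _exit(127);  // exec failed
    }

    close(fds[1]);  // parent keeps only the read end
    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        out.append(buf, static_cast<std::size_t>(n));
    close(fds[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return out;
}
```

A process launched concurrently by another thread closes its copies of both pipe ends at exec time, so it cannot keep the write end open the way a.exe does in the scenario above.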

On Wed, Nov 14, 2012 at 9:15 AM, Yakov Galka <ybungalobill@gmail.com> wrote:
[...]
    SetHandleInformation(h, HANDLE_FLAG_INHERIT, HANDLE_FLAG_INHERIT); // 1
    CreateProcess(...)
Sorry, the critical section is broader. It should be:
    SetHandleInformation(h, HANDLE_FLAG_INHERIT, HANDLE_FLAG_INHERIT);
    CreateProcess(...);
    CloseHandle(h);
and the problem will be triggered if another CreateProcess gets called anywhere between SetHandleInformation and CloseHandle. Also in my original post the mark // 2 had to be on the bind_stdout(sink) line. -- Yakov
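One blunt way to close such a window is to run the whole make-inheritable, create-process, close-handle sequence under a single process-wide lock. The sketch below shows only the locking pattern in portable C++11; `launch_mutex` and `serialized_launch` are hypothetical names, nothing like this exists in Boost.Process 0.5, and it helps only if every process launch in the program goes through the same lock.

```cpp
#include <mutex>

// Hypothetical process-wide lock shared by every launch.
inline std::mutex& launch_mutex() {
    static std::mutex m;
    return m;
}

// Run the caller's launch sequence while holding the lock, so that no
// other serialized launch can start between its first and last step.
template <typename Launch>
auto serialized_launch(Launch launch) -> decltype(launch()) {
    std::lock_guard<std::mutex> guard(launch_mutex());
    return launch();
}
```

The callable passed in would contain the three calls listed above. Code that creates processes without going through `serialized_launch`, for example inside a third-party library, would still be able to inherit the handle.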

On Wed, 14 Nov 2012 08:15:27 +0100, Yakov Galka <ybungalobill@gmail.com> wrote:
[...]No, this is *not* enough. There is a race condition between SetHandleInformation and CreateProcess. That is, the code is essentially equivalent to:
Thanks for the explanation! I added this to the FAQ (not yet online). Boris
[...]
participants (10)
- Alex Perry
- Alexander Lamaison
- Boris Schaeling
- Francois Duranleau
- Klaim - Joël Lamotte
- Peter Dimov
- Rob Stewart
- Roland Bock
- Vicente J. Botet Escriba
- Yakov Galka