
Boost.Process 0.5 has been released! Documentation and library can be found at: <http://www.highscore.de/boost/process0.5/>

Why version 0.5?

We have no official process management library in Boost yet, but we have had countless drafts since 2006. In order not to get completely lost among all those drafts, I gave them version numbers. For example, 0.1 was the very first draft in 2006. 0.31 is the draft documented at <http://www.highscore.de/cpp/process/>. And 0.4 was the draft which was rejected in a formal review in spring 2011. Version 0.5 is the latest attempt to create a process management library.

What's new?

Version 0.5 has a very different design from all earlier drafts. The design was proposed by Jeff Flinn, who introduced it briefly at BoostCon 2011 when we gave a presentation on Boost.Process. The goals of this new version are:

* A lightweight implementation (no overly complicated code anymore).
* A fully extensible library (no need for arbitrary "extension points").
* Support for system-level cross-platform concepts only (no Java-like constructs).

Other goals, like making the library easy to use, were of course also targeted, but those were goals before. The goals above are important because those things were criticized in earlier drafts and were, as far as I can see, the main reasons why version 0.4 was rejected.

What now?

Please have a look at the documentation and think about whether this library could fit into Boost. If you have more time, please download and play around with the new version. Concentrate on the bigger picture: Does the design make sense? Is it easy to understand? Can you do what you want to do? Does it look and feel like modern C++?

This is not a formal review. But it would help to get some feedback to see a tendency. After six years, two Google Summer of Code projects, countless drafts and still a lot of interest from many people in a process management library in Boost, we might be able to get somewhere this time.

(If you are on Windows: There is a bug in Boost.Iostreams which has been fixed in release candidate 1 of Boost 1.51.0. See the documentation if you run into a problem when closing pipes.)

Anything missing?

We had a closed beta test in the previous two weeks with about ten people who had contacted me because of their interest in Boost.Process. I already got some good feedback and have to change a few details. These are only details though and shouldn't stop anyone from using this new version now. If everyone is happy with the overall design and architecture and it's only details we have to discuss, it's already a huge step forward.

Last but not least: Boost.Process 0.5 wouldn't exist without the help of Intra2net AG (<http://www.intra2net.com/>). Intra2net AG sponsored a project to support the development of Boost.Process. It was this sponsorship which made it possible to create this new version. Thanks to them!

Boris

Congratulations on reaching this important milestone. I'm very interested; I'll read the docs right now. I will be able to test it during the week and give feedback (I had planned to add some process manipulation that week anyway, so the timing is perfect). I'm using Boost 1.50 right now (on Windows 7 64-bit with VS2012 RC, which has a bug in std streams too), so will I have a problem if I stick to this version for now? Joel Lamotte

On Sat, 18 Aug 2012 18:02:45 +0200, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote: Hi Joel,
the problem is this bug here: <https://svn.boost.org/trac/boost/ticket/6576>. If you use boost::iostreams::file_descriptor_source to read from a pipe whose write-end has been closed, something goes wrong on Windows. If for example you close the read-end before the write-end is closed, you won't have a problem. Boris

OK, since the Boost 1.51.0 release is targeted for Monday, I will try with the new version before using Boost.Process 0.5. I just finished reading the docs; here are some quick remarks (so don't take them as strong criticism):

1. It appears simple to understand from reading the docs, except for the suggested need for macros (but I am not a specialist, so maybe I'm missing knowledge here).
2. I suspect there are specific reasons why the 'child' class isn't designed the same way as std::thread? I think it might have been discussed before, but I don't remember. Is there a rationale somewhere about this?
3. Why do I need to repeat the executable name if I provide the full command line? Can I just provide the command line?
4. I see that some examples use lambdas, so they are C++11-enabled. Why not use auto too, then? I think some of the macro use (related to return types in particular) can be removed by using auto. Obviously that's true only if the examples are meant to use C++11.
5. It is not clear to me what you mean by "resources" in the "Cleaning up resources" part. I just read this and I'm guessing that it's for solving the problem of zombie processes? http://en.wikipedia.org/wiki/SIGCHLD

Otherwise, good work.

Joel Lamotte

On Sat, 18 Aug 2012 18:45:53 +0200, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
What exactly do you mean if you compare child with std::thread?
3. Why do I need to repeat the executable name if I provide the full command line? Can I just provide the command line?
Good point, that sounds like a nice improvement!
Can you give me an example what you are referring to? It's not a macro like WEXITSTATUS I guess (as they are POSIX-only)?
Yes, it's zombie processes on POSIX and process and thread handles on Windows. I referred to both of them as resources. Any other idea what to call them? Thanks for your quick feedback, :) Boris

On Sat, Aug 18, 2012 at 8:06 PM, Boris Schaeling <boris@highscore.de> wrote:
What exactly do you mean if you compare child with std::thread?
I mean that an instance of std::thread represents the thread (executing or executed), that you can't copy it, and that you can move it (in C++11) so it can be stored in standard (C++11) containers. Basically, if you have a std::thread object (or a boost::thread), then you know that no other part of the code holds the same thread. I thought that the child class could be designed like std::thread because, AFAIK, the design of std::thread prevents a lot of potential errors that could otherwise trip up the developer. Now, as processes are a bit different from threads, I don't know if it makes sense for child to follow the same design as std::thread, but my first reaction was to be a bit suspicious. After all, this means that several child instances can manipulate the same object. Making it non-copyable would force the user to use shared pointers to achieve the same, but it would then make it obvious that an instance of child is the unique representation of a child process, and that if you want to share it, like std::thread, you have to share references to it in one way or another, like with a shared_ptr. I guess it has to be discussed; I don't feel confident in my expertise on the subject.
Ignore my comment, I think I misread some details of the code. By the way, in this example:

    #if defined(BOOST_WINDOWS_API)
    typedef boost::asio::windows::stream_handle pipe_end;
    #elif defined(BOOST_POSIX_API)
    typedef boost::asio::posix::stream_descriptor pipe_end;
    #endif

    boost::asio::io_service io_service;
    pipe_end pend(io_service, p.source);

Couldn't there be a utility function that would provide the right type depending on the target platform? I mean, any way to avoid users messing with macros would be positive, if possible. If this library is proposed for the standard later, removing any macro use for the user would be mandatory anyway. Joel Lamotte

On Sat, 18 Aug 2012 20:27:04 +0200, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
Maybe the name child is a bit misleading. It should really be a PID (POSIX) and process and thread handle (Windows) only. And as you can copy PIDs and handles as much as you want (and no one has a problem with those being copyable), child should work the same. From this point of view, I think boost::noncopyable would be an overkill.
I wonder whether this should be done in Boost.Asio? But if I compare windows::stream_handle or posix::stream_descriptor there are some slight differences. Then we are at the core problem again we struggle with in Boost.Process for several years: What do we do with differences between platforms if we don't like #ifdefs? Either we support the minimum set of cross-platform features or implement no-ops on those platforms a feature is not supported on or invent a new high-level layer abstracting the differences away?
Thanks for the encouragement! I'd already be happy if we can get something into Boost after six years. ;) Boris

On Sun, Aug 19, 2012 at 12:12 PM, Boris Schaeling <boris@highscore.de> wrote:
I cannot agree with this argument: whatever the implementation, child still represents the only way to manipulate the child process. Its interface should be designed from the point of view of the user, not only the implementation. Making it non-copyable (but movable) has the benefit of preventing the user from sharing it accidentally and would force them to express sharing explicitly (as pointed out before). What I'm trying to say is that the POSIX implementation is based on a context, the C language, that is different from ours, the C++ language, where we can avoid sharing data when it's dangerous. I wasn't talking about boost::noncopyable. Is there another reason for making it copyable, other than POSIX providing an identifier value? Also, as pointed out by others, it goes against RAII-based design to force the user to release resources explicitly.
This doesn't look like a simple problem. My first reflex would be to ask why these are two totally different types. Doesn't Boost.Asio provide a type that uses the platform-specific type in its implementation? I'm not sure I understand exactly why different types are needed here (I'm not a Boost.Asio specialist either...).
I agree ^^; Could you provide us with a list of the different cases where platform-specific macros seem necessary, if the tutorial doesn't list them all already? That way we could try to find a way to eliminate them. I'm sure most of them can be encapsulated in utility functions or optional strategies provided to child. After all, it seems that most cases where you used macros in the tutorials are custom behaviours that could be injected into child or something wrapping it. In the end, the only two concerns I see in this discussion are:

1. Requiring users to use macros for general cases
2. child isn't really idiomatic RAII

I'm sure those problems can be solved in no time and Boost.Process can be reviewed again. :) Joel Lamotte

About the pid stuff, I need to add an argument that might help design child: std::thread (and boost::thread) also provide the id of the thread. They just provide the identifier using a platform-specific type, letting the user use it with platform API functions. If child were designed this way, with RAII and move-only semantics, you would get:

1. execute() could be the child constructor (like the thread constructor, which launches the thread if arguments are provided); maybe child could be renamed to child_process then.
2. No need for id-based constructors.
3. A clean interface with only one function that returns a platform-specific type.
4. Sharing would be explicit, as said before.

A note from looking quickly at the source. In process/config.hpp:

    #elif defined(BOOST_WINDOWS_API)
    # include <Windows.h>

This is a bit heavy. Don't you need to precede this header with WIN32_LEAN_AND_MEAN? (see http://en.wikipedia.org/wiki/Windows.h)

In child.hpp: why are (platform-specific) members public?

Joel Lamotte

On Sun, 19 Aug 2012 12:46:12 +0200, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
I don't think of the child class as a RAII type (but I see how this made you propose a non-copyable class). It could be a RAII type on Windows as the process and thread handles (which are member variables of the class) have to be closed. But it wouldn't work on POSIX as you have to clean up differently. So I treat the child class as a value-based type which works on Windows and POSIX. I'd also prefer a guaranteed cleanup. But it's not clear how to do this with a RAII type on POSIX (unless you have an idea? :).
Unfortunately not. For asynchronous read/write operations posix::stream_descriptor and windows::stream_handle could maybe be merged into a class which could be initialized with a boost::iostreams::file_descriptor_sink or boost::iostreams::file_descriptor_source. These classes wrap a file descriptor and a Windows HANDLE already nicely. So it might be possible to put something together. Boris
[...]

On Sun, Aug 19, 2012 at 6:54 PM, Boris Schaeling <boris@highscore.de> wrote:
I'd also prefer a guaranteed cleanup. But it's not clear how to do this with a RAII type on POSIX (unless you have an idea? :).
I don't know much about POSIX so I cannot say anything about that at the moment. If the example you provide in the tutorial is representative of the work necessary to request cleanup on POSIX, then I don't see why putting it in as a constructor policy wouldn't work. Something like this (assuming the design I suggested, and assuming the cleanup would be optional; would it be?):

    child_process cp( cmd_line("test --foo /bar"), auto_cleanup() );

The work would then be done in the constructor in the POSIX implementation and in the destructor on Windows. I'll take a look at the POSIX documentation for processes and see if I can help you with that. I think you certainly know intuitively of some problems I don't see, because of your experience with Boost.Process. Joel Lamotte

On 2012-08-19 19:36, Klaim - Joël Lamotte wrote: this specific child process. That's why I suggested a method that sets the signal handler (for POSIX) and does nothing for Windows. Some kind of boost::process::init(ignoreChildSignals, maybe more parameters). This could be called once during startup of your program. Then, the destructor could call discard() on Windows, nothing on POSIX. Or are there additional things to be done for cleanup that I missed? Regards, Roland
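Roland's suggestion of a call that sets the signal handler on POSIX and does nothing on Windows could be sketched roughly as follows. The name `ignore_child_signals` is hypothetical and not part of Boost.Process 0.5, and the sketch tests for the `SIGCHLD` macro instead of `BOOST_POSIX_API` so that it needs no Boost headers.

```cpp
#include <csignal>

// Hypothetical helper, not part of Boost.Process 0.5.
// POSIX: ignore SIGCHLD so that exited children are reaped by the system.
// Platforms without SIGCHLD (e.g. Windows): do nothing.
// Returns true if a signal disposition was actually changed.
inline bool ignore_child_signals()
{
#if defined(SIGCHLD)
    return std::signal(SIGCHLD, SIG_IGN) != SIG_ERR;
#else
    return false;
#endif
}
```

Because signal dispositions are process-wide, a call like this would belong in the application's startup code, as Roland proposes, rather than being made silently by a library.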

On Sun, 19 Aug 2012 20:36:38 +0200, Roland Bock <rbock@eudoxos.de> wrote:
Yes, Roland highlights a problem with signals. As signal handlers are set globally in an application, libraries can't assume they can do what they want. So I rather let the application developer handle signals explicitly - he is the only one who can do it right. It's a pity as that means you have to deal with a platform-specific concept (and maybe even code). But then I personally think it's not a big problem. It's just a matter of fact that platforms can be unfortunately very different. Boris

On 2012-08-22 01:21, Boris Schaeling wrote:
Still, as mentioned in a later mail, I think that Joel was actually right with his idea to call signal in the constructor of a not so lightweight child (is that a fat child? sounds politically incorrect) because almost everybody will have to do it anyway. Hence my suggestion: Have a lightweight child class as is for those 5% who need that extra bit flexibility. Offer a RAII child that performs signal and discard on its own for 95% of the users. Percentages are just a guess, of course :-) Regards, Roland

I think a first step would be to decide whether child has to stay an id value instead of being the representation of the child process. At the moment, the current proposal is a bit unclear on this point. I see two solutions:

1. Change the name of child. Basically, boost::process::child appears to me to be the representation of a child process. If it were called child_id, I wouldn't have made the suggestions I made.
2. Make it a real representation of the child process (as suggested before), with RAII, automatically calling the wait function on destruction if not explicitly detached. This is obviously more work and a different design from the current proposal, but at least it would impact only the child type.

Once that is decided, it will be easier to see how to set up a solution that avoids macros for 95% of users in a cross-platform context. Again, I'm not a specialist in the domain... I'll be able to find time to learn more about POSIX in the coming days. Joel Lamotte

On Wed, 22 Aug 2012 15:02:08 +0200, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
I think if CreateProcess() would return eg. only a process ID, boost::process::child would have been a typedef (as on POSIX we have only the pid which we get back from fork()). As CreateProcess() returns PROCESS_INFORMATION which has four member variables, I felt it makes more sense to wrap the return values in a class. And given that you can copy the pid on POSIX and PROCESS_INFORMATION on Windows, I didn't really want to enforce tighter restrictions. Someone who doesn't like this can still wrap boost::process::child in a non-copyable user defined class? But I'm not sure whether the library should forbid you to do what you can do with the native types (and it's not that developers have serious problems with those copyable native types?). I wouldn't want to use the child class either if it does something extra which I don't need. For example, if I don't care about my child processes and call signal() to ignore SIGCHLD, I don't want the child class to wait for the child process to exit. While those extra features can be useful, I think they would need to be provided as opt-ins (like another class developers have to use explicitly if they don't want to use the light-weight class child). Boris
[...]

Am 19.08.2012 18:54, schrieb Boris Schaeling:
I would expect from 'modern' C++ that RAII is heavily used - and a process is a system resource. I would find it a little bit odd if a process is value-base. What happened if I copy a child class and invalidate one of the copies. I suggest to make the process moveable-only. The default ctor represents an 'invalid' process class - testable by child::operator <unspecified_bool >() and child::operator!(). child::discard() could move-away the internal representation so that the child instance becomes 'invalid' after this function. Oliver
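A minimal sketch of the move-only design Oliver describes, assuming an `int` stands in for the pid or handle. The class name `unique_child` and its members are illustrative only; `explicit operator bool` takes the place of the unspecified-bool idiom, and `discard()` here merely marks the object invalid where a real implementation would release system resources.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical move-only handle; native_id stands in for a pid or HANDLE.
class unique_child
{
public:
    typedef int native_id;

    unique_child() : id_(invalid_id) {}                 // 'invalid' state
    explicit unique_child(native_id id) : id_(id) {}

    unique_child(const unique_child&) = delete;         // no copies
    unique_child& operator=(const unique_child&) = delete;

    unique_child(unique_child&& other) : id_(other.release()) {}
    unique_child& operator=(unique_child&& other)
    {
        if (this != &other) {
            discard();
            id_ = other.release();
        }
        return *this;
    }

    ~unique_child() { discard(); }

    // Testable validity: 'if (c)' and 'if (!c)' both work.
    explicit operator bool() const { return id_ != invalid_id; }
    native_id native() const { return id_; }

    // Give up the handle; a real version would close Windows handles here.
    void discard() { id_ = invalid_id; }

private:
    // Hand the id to a new owner and leave this object invalid.
    native_id release()
    {
        native_id id = id_;
        id_ = invalid_id;
        return id;
    }

    static const native_id invalid_id = -1;
    native_id id_;
};

static_assert(!std::is_copy_constructible<unique_child>::value,
              "unique_child is move-only");
```

With this shape, copying one instance and invalidating the other cannot happen by accident: ownership is transferred with `std::move`, and the moved-from object tests as invalid.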

On Sun, 19 Aug 2012 23:06:07 +0200, Oliver Kowalke <oliver.kowalke@gmx.de> wrote:
[...]process is value-base. What happened if I copy a child class and invalidate one of the copies.
The same as what happens if you do that with file descriptors and Windows HANDLEs. And what also happens if you do that with boost::iostreams::file_descriptor_sink or boost::iostreams::file_descriptor_source if you use boost::iostreams::close_handle. Not that I'm against RAII - but I wonder whether we are trying a bit too hard to solve a problem which (it seems to me at least) has never been a problem for anyone in practice. I would be fine if we could implement the RAII type in a platform-independent way but ...
... isn't this a Windows-only solution? Boris

"Boris Schaeling" <boris@highscore.de> writes:
Of course it's a problem. That's why we're trying to do it in C++, right? Otherwise you might as well just write it in C where people expect to do their own cleanup.
I would be fine if we could implement the RAII type in a platform-independent way but ...
We haven't found one ... yet, but we certainly must before adding this library to Boost. Once it's in Boost it becomes 'canon' and it is therefore much trickier to make the required changes. Before then we can screw around with the API as much as we like without worrying about breaking existing clients. Alex

On 2012-08-18 17:31, Boris Schaeling wrote:
it is really good to see that this project is continued! Thanks for the effort!

Just two weeks ago I needed something like this: Start a child process (shell script in my case), feed it some data via STDIN, read whatever it sends to STDOUT or STDERR, with timeouts and size limits. So basically, your tutorial fits perfectly and I certainly would have used boost::process if it were a boost library already :-)

After reading the tutorial and taking a quick look at some of the reference pages, my only concern is that I see blocks like these too often for my taste:

    #if defined(BOOST_WINDOWS_API)
    ...
    #elif defined(BOOST_POSIX_API)
    ...
    #endif

It is probably good that it is possible to get down to the dirty system specific details if required, but I'd like to be able to handle child processes without thinking about the operating system (unless I want to do something really special).

For instance, I like the inherit_env initializer. Does something on POSIX, probably a no-op on Windows. I can just use it and be done with the environment. But things like wait_for_exit seem strange to me: If the library takes care of the forking and handles setting up pipes in such a nice way, why should the user of the library be bothered with the different ways of interpreting the exit information (unless he really, really wants to)?

Same with a /dev/null sink. Offering a factory method would be easy and spares the user.

Or this one:

    #if defined(BOOST_POSIX_API)
    signal(SIGCHLD, SIG_IGN);
    #endif

    child c = execute(run_exe("test.exe"));

    #if defined(BOOST_WINDOWS_API)
    c.discard();
    #endif

I understand that the signal stuff is not required for Windows, but why not offer a convenience function that sets the signal handler on POSIX and does nothing on Windows? Same with the discard method. Does something on Windows, does nothing on POSIX, and I get to write code without using the #ifdefs every other line.

The other thing about the discard: I think one of the first things I'd do is write a wrapper for child that is non-copyable (as also suggested by Joel) and calls discard in its destructor. I wonder if that should not be part of the library, too?

With the probable exception of the asynchronous IO and wait, I think it should be possible to get rid of the #ifdefs in the tutorial. And IMHO that would be a nice improvement for what looks like a cool library already :-)

Regards, Roland

On Sat, 18 Aug 2012 23:54:24 +0200, Roland Bock <rbock@eudoxos.de> wrote: Hi Roland,
can you show me some sample code how you'd like to interpret the exit code?
Same with a /dev/null sink. Offering a factory method would be easy and spares the user.
I think this is a Boost.Iostreams problem. We have a null device in Boost.Iostreams (see <http://www.boost.org/doc/libs/1_50_0/libs/iostreams/doc/classes/null.html>) but it's a class with no-op functions. If the class would open /dev/null or NUL, one could use the null device. What I can do for now is changing the example in the Boost.Process documentation. Instead of writing to /dev/null or NUL, it should write to foo.txt - problem solved. ;)
I'm afraid it leads to code obfuscation if you have a superset of all existing functions platforms provide and they are all used together in the same code with no clear indication which ones do something useful and which ones are no-ops. Imagine you had no-op Windows API functions and no-op POSIX API functions. You could use all system API functions from Windows and POSIX without #ifdefs in the same source file and claim your code is platform-independent. But I think I wouldn't call it platform-independent but a complete mess. ;) It is maybe no problem for the example you are referring to as we are talking only about two lines here. But we start trading convenience vs. clarity? And if you call discard() and think you are fine on all platforms, you'd make a mistake?
This is related to the paragraph above. The RAII type would only make sense on Windows but the code would compile on all platforms. If you don't know that you must do something extra on POSIX (like ignoring SIGCHLD), your program will leak resources (leaking is maybe the wrong word as init will clean up after the program exits; but you might be tricked into thinking the platform-independent RAII type does something for you on all platforms which wouldn't be true).
Thanks! I mentioned it already in another email I think: If I rewrite some and remove other examples from the tutorial, most of the #ifdefs could disappear. Boost.Process would still support all of that (and I know anyway how to do all of that myself :). But it would make it more difficult for others as everyone would have to figure out himself how to do certain things like asynchronous I/O. I agree that the best solution would still be to support everything with truly platform-independent classes and functions. But I think that some problems have to be solved in Boost.Asio (posix::stream_descriptor vs. windows::stream_handle) or Boost.Iostreams (/dev/null vs. NUL) and that other problems are complicated enough that it's not clear at all how to solve them in a platform-independent way (signals vs. wait functions)? Boris
[...]

Hi Boris, On 2012-08-21 00:11, Boris Schaeling wrote:
a) Yes, redirecting to foo.txt would make for a nice example as well :-)

b) Hmm. Not following. Why not use the null sink from boost::iostreams? You can create a stream from it that does what I would expect from redirecting to /dev/null: Discard all data.

    boost::iostreams::stream<boost::iostreams::null_sink> null((boost::iostreams::null_sink()));

While this is something the library user could figure out on his own, it would still be a very helpful example, I guess. BTW: It might also help to explain whether or not you could just add "> /dev/null" to the argument list (and if you could what the differences are, for example if one way is expected to be more performant).

c) I would add yet another example by redirecting output to a string.
You are certainly right that using a wrapper for all individual system API calls that does something for one system and is a no-op for the other leads to total confusion. I am hoping for a way that lets 95% of the users write platform independent code without the need for #ifdefs. For example, my idea was to not require the user to call discard() but have this done in the destructor and the longer I think about it, the more I like Joel's idea of calling signal to ignore SIGCHLD in the constructor. Why? If you say that in POSIX, the typical, the 95% case is to use the signal call, then I would expect the library to do that for me. And if I happen to be one of the 5% who don't want to do what most of the others need, then I need to take special care, and probably use ifdefs. Thus, maybe you could offer a lightweight child class (as it is right now) and a RAII child class that does the signal/discard calls.
See above: If 95% of the users would perform the signal call, you could also make that the standard behavior in the constructor.
If the #ifdefs are required right now for the given examples, then using different examples removes a symptom, not the cause. I'd prefer the same examples.

* #boost_process.tutorial.cleaning_up_resources
  As written above, I'd offer a second version of the child class which does the signal/discard in constructor/destructor.
* #boost_process.tutorial.setting_up_standard_streams
  Using
      boost::iostreams::stream<boost::iostreams::null_sink> null((boost::iostreams::null_sink()));
  obsoletes the system dependency (and I would add the examples and hints mentioned above)
* #boost_process.tutorial.asynchronous_i_o
  It might make sense to add such a typedef to boost process for convenience. Maybe not. Not sure.
* #boost_process.tutorial.waiting_for_a_program_to_exit
  o The first #ifdef is hidden in the text: WEXITSTATUS required for POSIX? I guess that 90%+ would want to just get the exit code and be done with it.
  o I am not sure about the #ifdef in the code. Maybe somebody has a nice idea, otherwise I would just let leave it as it is.

Regards, Roland

On Tue, 21 Aug 2012 10:06:49 +0200, Roland Bock <rbock@eudoxos.de> wrote:
Indeed it would be nice if you could write something like:

    execute(
        run_exe("test.exe"),
        bind_stdout(boost::iostreams::null_sink())
    );

Unfortunately null_sink doesn't open a file descriptor or HANDLE (it's just that its read and write functions are no-ops). And without a file descriptor or HANDLE there is nothing to inherit by the child process. That's why I had to open /dev/null or NUL myself in the example (and had to use #ifdefs ;).
That's a good point, noted!
[...]I am hoping for a way that lets 95% of the users write platform independent code without the need for #ifdefs.
So far I'm already happy that for 95% of process related tasks I can write platform-independent code. I'm looking at this other 5% where I have to use #ifdefs as a mild but bearable annoyance. :)
I don't know whether it's 95% to 5% or the opposite. I find it quite difficult to set a standard here. If we assume for example that most of the time people would like to get the exit code from a child process (which doesn't sound like a bad assumption either?), and it happens that this should be done asynchronously, ignoring SIGCHLD by default would be rather bad. As ignoring SIGCHLD and catching SIGCHLD are also very different operations (one line vs. a Boost.Asio I/O object) it doesn't sound easy either to put all of this into a RAII type. (If boost::asio::signal_set would provide a function to ignore a signal, one wouldn't need to use a system API function like signal() directly. But then I'm not sure whether Chris would agree to extend boost::asio::signal_set for that. :)
This is indeed the most important problem. Cleaning up resources is something which can't be ignored (unless you have short-running programs).
Here we seem to have a misunderstanding (see my explanation above). It would be indeed great if null_sink or null_source could be used.
To move somewhat forward with Boost.Process: I (or others if they like - just send me an email) will create some utility classes and functions which are not truly mapping existing concepts on all platforms and put them into their own namespace (and probably header file). They would provide the convenience some people ask for. And they would also make it clear that they don't fit perfectly in with the rest of the library and can have quirks.
o I am not sure about the #ifdef in the code. Maybe somebody has a nice idea, otherwise I would just let leave it as it is.
I agree. Even a utility class probably wouldn't be easy to use. Boris

Hi, On 21.08.2012 00:11, Boris Schaeling wrote:
what about an exit_code class which simply encapsulates WIFEXITED and WEXITSTATUS on Unix and does nothing on windows, like the one attached. I think this would be enough for 95% of all users. And from my point of view the exited() member on windows has sane behavior, too. Regards, Florian Sowade
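The attachment itself is not reproduced in this thread, so the following is only a guess at what such a class might look like: a thin wrapper that applies WIFEXITED and WEXITSTATUS on POSIX and passes the value through on Windows. The member names `exited()` and `value()` are assumptions, and `_WIN32` is used for the platform check to avoid a Boost dependency.

```cpp
#if !defined(_WIN32)
#  include <sys/wait.h>
#endif

// Hypothetical sketch; not the class attached to the original mail.
class exit_code
{
public:
    // raw_status: the status from waitpid() on POSIX,
    // or the process exit code on Windows.
    explicit exit_code(int raw_status) : raw_(raw_status) {}

    // True if the process terminated normally. On Windows the raw value
    // already is the exit code, so this is always true there.
    bool exited() const
    {
#if defined(_WIN32)
        return true;
#else
        int status = raw_;  // local copy: the macros may take its address
        return WIFEXITED(status) != 0;
#endif
    }

    // The exit code; only meaningful if exited() returns true.
    int value() const
    {
#if defined(_WIN32)
        return raw_;
#else
        int status = raw_;
        return WEXITSTATUS(status);
#endif
    }

private:
    int raw_;
};
```

User code would then read `exit_code(status).value()` on both platforms instead of wrapping WEXITSTATUS in its own #ifdef.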

"Boris Schaeling" <boris@highscore.de> writes:
Boost.Process 0.5 has been released! Documentation and library can be found at: <http://www.highscore.de/boost/process0.5/>
I'm already a Boost.Process user and am very glad to see it's not been forgotten. Without having had a chance to try out the new code, I've got a few comments from reading the docs. They're similar to what others have already said.

Macros. Boost libraries rarely have macros, but when they do, it's almost never for platform-specific behaviour. The whole point is that that gets hidden in the implementation. At first glance it seems to me that all the uses of macros shown in the documentation (including the ASIO one) could be 'taken inside'.

Non-RAII. IMHO, modern C++ code should not expect the caller to manage the cleanup themselves. If `discard` should only be called once the (shared) child process is no longer needed, then either don't share the child process between instances (i.e. they become non-copyable) or manage the `child` instances' resources in a shared manner, e.g. a shared_ptr that does whatever cleanup `discard` previously did in its destructor.

Environment. The child inherits the environment on one platform by default but not the other. This sounds like a big gotcha. The docs say "on Windows environment variables are inherited by default". Quite so. But there must be a way to stop them being inherited. So why not do that by default on Windows and only let the true Windows default behaviour happen when the inherit_env initialiser is passed? Or vice versa with a suppress_env initialiser?

Nevertheless, I'm pleased to see the improvements you've been making to the library. Keep up the good work.

Alex
--
Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)
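The shared-ownership variant Alex mentions can be sketched with a std::shared_ptr and a custom deleter, so that the cleanup runs exactly once when the last owner goes away. Everything named here (`fake_child`, `discard_on_release`, `share_child`) is hypothetical; `fake_child` only counts calls to `discard()` and stands in for a child type with that member.

```cpp
#include <memory>

// Stand-in for a child type with an explicit discard(); hypothetical.
struct fake_child
{
    int* discard_count;                  // counts how often discard() ran
    void discard() { ++*discard_count; }
};

// Deleter that performs the cleanup, then frees the heap copy.
struct discard_on_release
{
    template <typename Child>
    void operator()(Child* c) const
    {
        c->discard();
        delete c;
    }
};

// Wrap a copy of 'c' so that discard() runs once, when the last
// shared_ptr referring to it is destroyed or reset.
template <typename Child>
std::shared_ptr<Child> share_child(const Child& c)
{
    return std::shared_ptr<Child>(new Child(c), discard_on_release());
}
```

Copies of the returned pointer can be passed around freely, which keeps sharing explicit in the type while leaving the lightweight child class itself unchanged.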

On Sun, 19 Aug 2012 10:24:08 +0200, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote: Hi Alexander,
if I look at the tutorial at http://www.highscore.de/boost/process0.5/boost_process/tutorial.html I see three scenarios where #ifdefs must be used:

* Cleaning up resources
* Asynchronous I/O
* Asynchronously waiting for a program to exit

I'll talk about cleaning up resources after the next paragraph (where you mentioned RAII).

For asynchronous I/O Boost.Process relies on Boost.Asio. Boost.Asio provides two I/O objects which are unfortunately platform-specific. While unfortunate, I'm not sure whether this is a Boost.Process problem. Is it just that because of Boost.Process we realize that we have no platform-independent class in Boost.Asio for native handles? It would be nice if there was something which for example could be initialized with a boost::iostreams::file_descriptor_sink or boost::iostreams::file_descriptor_source. But I wonder now whether I should have left asynchronous I/O out of the Boost.Process documentation, as this wouldn't be a Boost.Process problem then. ;)

Asynchronously waiting for a program to exit is a similar problem but worse. While boost::asio::posix::stream_descriptor and boost::asio::windows::stream_handle are somewhat similar, waiting requires using the very different I/O objects boost::asio::signal_set and boost::asio::windows::object_handle. The system APIs are unfortunately that different: While you need to catch a signal on POSIX you have to use a wait function on Windows. The challenge is to rewrite the code in the example at http://www.highscore.de/boost/process0.5/boost_process/tutorial.html#boost_p... in a platform-independent way.

(Actually I tried this in Boost.Process 0.4. We had a class there called status which had an async_wait() function which worked on POSIX and Windows - if you were careful. The implementation was rather horrible and heavily criticized. Because of that I created boost::asio::windows::object_handle - that's the only reason why we can now actually use Boost.Asio to wait asynchronously on POSIX and Windows even though we need to use platform-specific I/O objects. But it was a step forward.)
The challenge here is to rewrite the example from http://www.highscore.de/boost/process0.5/boost_process/tutorial.html#boost_p... in a platform-independent way. In this example I ignore SIGCHLD as that's rather easy to do. But if you come up with a platform-independent solution you of course also need to consider that someone might want to clean up by fetching the exit code from the child process. The whole issue with signals is complicated anyway as it's a global setting in an application. Libraries need to cooperate and shouldn't steal signals or overwrite signal handlers. Creating a RAII type which works on Windows and POSIX looks like a pretty tough job.
I agree, we need more initializers here. Right now there is no initializer either to easily define a new set of environment variables. As of today you can only use inherit_env. I blame my lack of time for not having created more useful initializers yet. :)
Nevertheless, I'm pleased to see the improvements you've been making to the library. Keep up the good work.
Thanks! And also thanks for your feedback! Boris

"Boris Schaeling" <boris@highscore.de> writes:
[snip]
Could these not be typedefs in Boost.Process? After all, you go on to use them the same way. [snip]
That would be a shame. I like that you mentioned it there. And if you take it away, users will only ask you questions about it :P
I only vaguely remember the last review. Can you remind me what the problem was? [snip]
Yuk. At this point my only thought is how have others done it? (It must have been done before) What about the Java or Python runtimes? I suppose they get to steal the signal all for themselves which makes their job easier.
My criticism wasn't directed at any lack of initialisers, just that the default behaviour of the two platforms varied when the library could make it match, either favouring the POSIX way of inheriting nothing by default or the Windows way of inheriting everything.

Just a couple of other points that occurred to me:

Exceptions. The error reporting interface of the `execute` function looks a little unusual. Have you looked at how Boost.Filesystem does it? Their functions are overloaded so that the ones taking an error code argument don't throw but set the argument and the ones without just throw. No need for `set_on_error`, `throw_on_error`.

Unicode. Again, you might want to look at how Boost.Filesystem (v3 only) handles it. It always calls the wchar_t versions of the Windows API and converts the strings internally using a locale. By default the behaviour is exactly the same but the advantage is you can pass in a custom locale to interpret the strings. This is particularly important for narrow char strings in Windows which are *not* treated as Unicode (UTF-8) but rather as being in the local user's code page. The result is that you cannot launch a program with a mix of characters from different code pages. For example, a Greek user (whose username is presumably in Greek) could not launch a program in his user directory and pass it a Russian word as an argument. With a custom UTF-8 locale object, it all works fine. Of course, you can use the wchar_t interface to do that but that makes it difficult to write cross-platform code which is, after all, the purpose of a cross-platform process library. I know Artyom, for one, has strong opinions on this.

Not a priority at the moment though but I thought I'd mention it before I forget. Alex -- Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)
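To illustrate the Boost.Filesystem-style error reporting described under "Exceptions", here is a minimal sketch of the overload pair. It is not Boost.Process API: launch() is a hypothetical function that does not start anything, an empty name merely stands in for a failed system call, and std::error_code is used as the standard-library counterpart of boost::system::error_code.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Non-throwing overload: failure is reported through `ec`.
// Hypothetical stand-in; returns a fake process id (0) on success.
inline int launch(const std::string& exe, std::error_code& ec) noexcept {
    ec.clear();
    if (exe.empty()) {  // stands in for a failed fork()/CreateProcess()
        ec = std::make_error_code(std::errc::no_such_file_or_directory);
        return -1;
    }
    return 0;
}

// Throwing overload: same operation, errors become exceptions.
inline int launch(const std::string& exe) {
    std::error_code ec;
    const int pid = launch(exe, ec);
    if (ec)
        throw std::system_error(ec, "launch");
    assert(pid >= 0);  // success always yields a valid id in this sketch
    return pid;
}
```

The caller picks the reporting style by choosing the overload, so no extra initializer is needed to select between setting an error code and throwing.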

On Mon, 20 Aug 2012 00:53:28 +0200, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
Yes, this could work. posix::stream_descriptor has a few more member functions than windows::stream_handle. But there is no difference in regard to the asynchronous read/write operations. I wonder though whether the typedef shouldn't be provided by Boost.Asio? Or instead of a typedef Boost.Asio could provide a new class boost::asio::file_descriptor_stream which could be initialized with a boost::iostreams::file_descriptor_sink or boost::iostreams::file_descriptor_source?
[...]I only vaguely remember the last review. Can you remind me what the problem was?
The main problem was to create a class which basically does what boost::asio::signal_set and boost::asio::windows::object_handle do. These classes are very different, and if you put all their code into one class it's an interesting mixture of signal handling code and Windows wait functions. ;) That signals are handled globally in an application doesn't make it easier. If you look at the two classes in the Boost.Asio documentation I guess no one would ever think that it's a good idea to merge them (it makes much more sense for posix::stream_descriptor and windows::stream_handle).
I think they don't need to care much about the system level. They can invent a new concept Foo, and the rest is a matter of implementation details. But I'm not sure whether this is the way to go for a Boost C++ library? What I've been trying to do in Boost.Process 0.5 is making existing cross-platform concepts platform-independent in code. I didn't try to invent new concepts. So where I didn't find cross-platform concepts I accepted the fact that platforms are different and used the appropriate C++ tool to express the difference: #ifdef.

Maybe someone has a great idea how to make SIGCHLD and Windows wait functions platform-independent. I gave it a shot once, and it was rather messy. Maybe it's not a SIGCHLD vs. Windows wait function problem either but must be solved on a very different level. I don't know. I think that's a task and maybe even a library of its own. It already took me half a year to create windows::object_handle after the status class in Boost.Process 0.4 was rejected. I don't know how long it will take to solve the SIGCHLD/Windows wait function problem. And this is not the first time that it has been discussed on the mailing list (only six years since the very first Boost.Process draft ;).
Ah, I see. Well, I think I prefer the Law of Economy: If I don't need something, I don't add it. Here it means undefined behavior if you don't use an initializer for environment variables. But if you don't care about environment variables, that's fine and there is no need to do something extra in the library.
Yes, I modeled wait_for_exit() and terminate() after the functions in Boost.Filesystem. But execute() has such a different interface that I felt it makes more sense if everything is an initializer. Besides there is this issue on POSIX that you need another initializer return_on_error if you want to know somehow whether execve() failed in the child process.
And I was already happy to support Unicode API functions on Windows at all. :) Thanks for this feedback and the example! I agree it would be useful if the library supported a UTF-8 locale. Sounds like a nice idea to work on once all major issues have been resolved. ;) Boris

"Boris Schaeling" <boris@highscore.de> writes:
Exactly. And right now the library /users/ are expected to be able to do that o_O [...]
They can hardly reinvent processes. Their runtimes must call the OS system calls at some point and handle all the issues we are discussing here. The big advantage I can see that they have is that they don't need to play nicely with other libraries. They have the sandpit all to themselves and can steal the POSIX signals. Although, maybe Python doesn't do that because 3rd party C plugins are so common. I'm afraid I don't have the time to check right now.
I'd strongly argue that #ifdef is not the appropriate tool on the /client/ side of the library. After all, the entire point of a Boost library is to abstract such details away, certainly in the common use cases.
I'm not very familiar with POSIX processes. Is SIGCHLD required for *all* process use-cases or just more advanced ones that give you more control/introspection than Windows processes?
Economy of what? I'm sure you don't mean performance because blanking out the environment on Windows is just a matter of passing L"\0\0" as the lpEnvironment parameter of CreateProcessW. I'd advocate making that the default and setting the parameter to NULL if the caller passes inherit_env. That way you get the same behaviour on both platforms. Alex -- Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)

On Wed, 22 Aug 2012 21:14:25 +0200, Alexander Lamaison <awl03@doc.ic.ac.uk> wrote:
I'm not necessarily happy about the #ifdefs either. But I don't rule them out completely. It all depends on the alternatives. And for the most complicated use cases (cleaning up resources and waiting asynchronously for processes to exit) we don't have anything else right now? Whatever you come up with has to be at least as good as what we have right now. And if you look e.g. at the example at <http://www.highscore.de/boost/process0.5/boost_process/tutorial.html#boost_process.tutorial.cleaning_up_resources> I think it will be tough to beat that one in clarity and simplicity.
I think there are basically three scenarios:

- You don't mess around with SIGCHLD at all: Terminated child processes must be reaped with a blocking call to wait.
- You call signal(SIGCHLD, SIG_IGN): Terminated child processes are reaped automatically (by the kernel).
- You set a signal handler for SIGCHLD: You still must call the blocking function wait. But if you do this in the signal handler, wait returns immediately (this is the asynchronous waiting model).

As it all works differently on Windows, and as this is not only about cleaning up resources but also partly about fetching the status of the terminated child process (it is returned by wait), all of this has to be considered for a platform-independent solution and weighed against the #ifdef alternatives.
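The first two scenarios can be tried out with plain POSIX calls. The following is a minimal sketch, not Boost.Process code; all function names are made up for illustration. It assumes a system that follows POSIX.1-2001 for an explicitly ignored SIGCHLD (as Linux 2.6 and later do), where waiting for an automatically reaped child fails with ECHILD and the exit status is lost.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child that exits at once with `code`; returns its pid (-1 on failure).
inline pid_t spawn_exiting_child(int code) {
    const pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return pid;
}

// Scenario 1: SIGCHLD left alone. The parent reaps the child with a
// blocking wait, which also delivers the exit status.
inline int reap_blocking(pid_t pid) {
    assert(pid > 0);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Scenario 2: SIGCHLD explicitly ignored. The kernel reaps the child, so
// waitpid() has nothing left to report. Returns the errno it observed.
inline int reap_with_sigchld_ignored() {
    struct sigaction ignore = {}, previous = {};
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    sigaction(SIGCHLD, &ignore, &previous);

    const pid_t pid = spawn_exiting_child(0);
    int status = 0;
    errno = 0;
    const pid_t result = waitpid(pid, &status, 0);
    const int observed = (result == -1) ? errno : 0;

    sigaction(SIGCHLD, &previous, nullptr);  // restore the process-wide setting
    return observed;
}
```

The second function also shows why the disposition is awkward for a library: it is a process-wide setting, so the sketch saves and restores it around the call.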
If you want to have a certain behavior on all platforms, you only need to use an initializer. But if someone else doesn't care about a certain behavior, the library shouldn't do something no one asked for. (Not to mention the question of what the common behavior should actually be. One could also argue that the Windows behavior should be the default. This leads to the question how the library user will know and remember what the default is. And then there are other settings like for standard streams which I'm afraid someone will then want equalized next.) Boris

On Sat, Aug 25, 2012 at 10:37 PM, Boris Schaeling <boris@highscore.de> wrote:
I can't agree on this because I think the terms are misleading. The library is supposed to provide safe and useful (and correct) behaviour as a priority, as far as I understand Boost libraries. It is not about "something no one asked for". Until specified otherwise, the minimal behaviour should be the safe, useful one, not the one that is lighter on resources.

Look at other similar examples: std::thread forces you to detach() or join() threads before destruction, calling std::terminate() if it has not been done. This forces the user to define the relationship between the threads, making it very hard to "leak" a thread without doing it explicitly. A thread instance is the unique object through which you can manipulate one specific thread. I think it is sound design because it is cheap behavior that still forces relatively safe usage (at least far safer than if threads could be manipulated only through handles).

Here it seems that the most useful use case of such a library should be defined before fixing the design. Which behaviour is the safer? Which behaviour is the most expected? I don't have data personally as it requires research but my personal guess is that nobody wants to:

- lose the handle on the child process when going out of scope without explicitly detaching it from the current process
- not release resources by default when the child process ends (this one is not clear to me yet in this discussion)
- allow different parts of the code to implicitly manipulate a process whose handle is shared everywhere.

And I think that's why std::thread was designed that way. So my question is: have previous versions of Boost.Process been designed following the same principles as Boost.Thread (I suppose they were at some point?) and why did it fail? Couldn't Boost.Process follow the same design principles? Is there something specific to processes that makes them far different from threads, from the user code point of view I mean?
Side optional question: have there been proposals for a process library to the standard? Joel Lamotte

On Sun, 26 Aug 2012 00:13:47 +0200, Klaim - Joël Lamotte <mjklaim@gmail.com> wrote:
I was only referring to the environment initializer here where it's just a matter of preference what the default behavior should be? The RAII issue is indeed a different one where I agree that safe and correct behavior is very much desired.
I had a look at version 0.4. We did have a handle type there which would close a file descriptor or Windows HANDLE automatically (a kind of specialized shared_ptr). On Windows this RAII type would reliably close the process handle when child went out of scope. Interestingly, I don't find any reference in the entire documentation to what to do on POSIX. I don't remember whether we forgot or expected users on POSIX to always fetch the status (or we knew we could avoid a long discussion on the mailing list if we didn't mention anything in the documentation ;). I had another look at <http://www.highscore.de/cpp/process/> which is about version 0.31. There I found this: "[...] it is recommended to always reap child processes by calling wait()."

Anyway, let me ask now a few questions:

- Would you prefer if child would close the process handle on Windows in its destructor automatically?
- Would that still be OK given that it doesn't help at all on POSIX?
- Could such a type trick library users into thinking their code is safe and correct on all platforms while in fact they must do something extra on POSIX to avoid zombies?

I guess it would be interesting if others answer these questions, too.
Not that I know. Boris

On Mon, Aug 27, 2012 at 1:07 AM, Boris Schaeling <boris@highscore.de> wrote:
I would go for something like thread: the user has to call wait() or detach(), otherwise the destructor just calls terminate() if the user didn't specify what to do. For "releasing the resources", the POSIX case is still obscure to me, but I see the problem with the "global" handlers etc. By default I would prefer the child process resources to be handled automatically, until I say otherwise (like the release() function in smart pointers). However, as I don't suggest that the class destructor release the resources itself, it would be whatever executes the process (executor?) that would do the cleaning at the end of the process. Is it unrealistic or inappropriate with processes to go this way? Joel Lamotte
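A thread-like owner of a child process, as proposed here, might look like the following POSIX-only sketch. The class name scoped_child and the helper start_exiting_child() are assumptions for illustration, not part of any Boost.Process draft. Note that detach() in this sketch only gives up ownership; on POSIX a detached child still has to be reaped by someone (or SIGCHLD ignored) to avoid a zombie.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical move-only owner of one child process, modeled on std::thread.
class scoped_child {
public:
    explicit scoped_child(pid_t pid) : pid_(pid) {}
    scoped_child(scoped_child&& other) noexcept : pid_(other.pid_) { other.pid_ = -1; }
    scoped_child(const scoped_child&) = delete;
    scoped_child& operator=(const scoped_child&) = delete;

    // Like std::thread: destroying an owner that was neither waited for
    // nor detached is treated as a programming error.
    ~scoped_child() {
        if (joinable())
            std::terminate();
    }

    bool joinable() const { return pid_ > 0; }

    // Blocks until the child ends; returns its exit status, or -1 if it
    // did not exit normally.
    int wait() {
        assert(joinable());
        int status = 0;
        const pid_t reaped = waitpid(pid_, &status, 0);
        pid_ = -1;
        if (reaped == -1)
            throw std::runtime_error("waitpid failed");
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }

    // Gives up ownership without waiting.
    void detach() { pid_ = -1; }

private:
    pid_t pid_;
};

// Illustration only: start a child that exits immediately with `code`.
inline scoped_child start_exiting_child(int code) {
    const pid_t pid = fork();
    if (pid == -1)
        throw std::runtime_error("fork failed");
    if (pid == 0)
        _exit(code);
    return scoped_child(pid);
}
```

The destructor makes the "forgot to decide" case loud instead of silently leaking a zombie, which is the trade-off std::thread makes for threads.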

How is asynch. signal handling done?

    int status;
    boost::asio::signal_set set(io_service, SIGCHLD);
    set.async_wait(
        [&status](const boost::system::error_code&, int) { ::wait(&status); }
    );

Please note that on POSIX you are limited in what you can do inside a signal handler (it must not call non-reentrant functions)! I suggest you do not invoke the user's callback in the signal handler. Instead, set a flag inside the signal handler which tells the io-demultiplexer (io_service) to call the associated callback once you have returned from the signal handler (io_service has to continue with dispatching). Oliver
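The flag technique suggested here can be shown without Boost.Asio. In the following POSIX-only sketch the signal handler does nothing but set a flag, and an ordinary loop (standing in for the event loop) does the reaping outside signal context. The names child_exited, on_sigchld and wait_for_child_via_flag are made up for illustration.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Set from the signal handler, read from normal code.
static volatile sig_atomic_t child_exited = 0;

// The handler only sets the flag; assigning to a sig_atomic_t is
// async-signal-safe, calling arbitrary user callbacks is not.
extern "C" void on_sigchld(int) { child_exited = 1; }

// Forks a child exiting with `code` and returns its exit status, or -1.
inline int wait_for_child_via_flag(int code) {
    assert(code >= 0 && code <= 255);
    child_exited = 0;

    struct sigaction action = {}, previous = {};
    action.sa_handler = on_sigchld;
    sigemptyset(&action.sa_mask);
    sigaction(SIGCHLD, &action, &previous);

    // Block SIGCHLD so it cannot slip in between testing the flag and sleeping.
    sigset_t block, old_mask;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    const pid_t pid = fork();
    if (pid == 0)
        _exit(code);

    int result = -1;
    if (pid > 0) {
        sigset_t wait_mask = old_mask;
        sigdelset(&wait_mask, SIGCHLD);
        while (!child_exited)
            sigsuspend(&wait_mask);  // atomically unblock SIGCHLD and sleep

        // Back in normal context: calling waitpid() or user code is safe here.
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }

    sigprocmask(SIG_SETMASK, &old_mask, nullptr);
    sigaction(SIGCHLD, &previous, nullptr);  // restore the global handler
    return result;
}
```

In an event-loop setting the `while` loop would be replaced by the loop's own dispatching, which checks the flag (or a pipe written to by the handler) and then runs the user's callback.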

On 2012-08-18 17:31, Boris Schaeling wrote:
- Does the design make sense?

Partially: I like a lot of aspects, for instance the initializers, especially bind_stdin and Co. But to be confronted with Windows/POSIX specific stuff for even the simplest tasks is bad design in my eyes.

- Is it easy to understand?

Yes and no. I mean, hell, yes! Much easier than handling fork and opening pipes on my own. I did that myself and it would be so much nicer using the current version of Boost.Process. But then my current code is POSIX only. If I wanted to write platform independent code, I would still feel lost.

- Can you do what you want to do?

Yes and no. Yes, I could replace my own current stuff. No, I want to turn my current code into a platform independent version and I don't know how unless I add a bunch of #ifdefs (and I don't want to).

- Does it look and feel like modern C++?

Not to me. I'd expect a modern C++ process library to allow me to write platform independent code where possible. Running a process, knowing the environment, doing IO, waiting for it to finish, knowing the exit code should not require different code and should behave in the same way (e.g. the default environment for child processes should be the same). Of course, if I want to know the child::pid or the child::proc_info and do something with it: sure, that would be platform dependent. It is certainly required to allow the library user to perform platform dependent things. But it feels wrong to require the library user to perform platform dependent things.

As a summary: Even though I really think you did a really terrific job in wrapping fork() and the like for Windows and for POSIX, I would not vote for inclusion into Boost since the library user is still required to take a lot of care to write platform independent code. It is too hard to get it right and too easy to get it wrong. It seems to me that the amount of work required to turn Boost.Process into a library that allows writing platform independent code easily is rather small compared to what has been invested to get it to the current state. It would certainly be worth it. Regards, Roland

Hi, first of all I would like to thank you, Boris for your effort in writing this library. Recently I wrote quite a lot of code where (Boost.)Process would be of great help, provided some things were fixed. As most of the things I would like to be changed were already addressed by the previous posters to this thread, I'm going to use Roland's post (with which I agree 100%) and just add some notes. On Thu, Aug 23, 2012 at 10:18 AM, Roland Bock <rbock@eudoxos.de> wrote:
+1 + as others said, the special devices/'files' like /dev/null, NUL and friends should be platform independent if possible.
+1 Not to mention how lost the people might feel who didn't write the code but have to maintain/bugfix it.
+1
+1 There certainly are differences between POSIX and WINAPI, but for the 95% of cases where you don't have to do special tweaking (handling SIGCHLD differently, etc.) the programmer should not be forced to do the #if-#else dance. This should be kept only for those who want/need to do that, but certainly not the default (simple) usage. Of course the POSIX-specific and WINAPI-specific things from which applications targeted only for a single platform could benefit should stay included (and maybe a couple more added). And, it is nice that there is a lightweight 'handle' to the child process, but I (and IMHO others) would not mind if there was a more heavyweight but also a more powerful equivalent (for example using RAII as others suggested to clean things up properly in a platform independent way).
Same here. Best regards, Matus

On 8/23/2012 4:58 AM, Matus Chochlik wrote:
My thought is these may belong to the boost.iostreams library. Or this could be internalized by the platform executor for "unconnected" io channels.
In our organization, no platform dependent #ifdefs in client code is a precondition to a process library's use. Certainly extension points using platform specific code are required, but these are implemented as separate files that are appropriately included by each platform's project build system. Windows and POSIX use such vastly disparate paradigms for creating processes that there is little to no reuse of code between the implementations. The only commonality is the type constructors, which provide the platform independence. My version at https://github.com/JeffFlinn/boost-process demonstrates how this can be done.
I've struggled with just what a "child" entails as well. In our uses, a "child" instance may be a data member of a class. In trying to maintain process as a header only library, having the child hold a Windows handle exposes windows.h too widely. I've thought the child could merely hold a pid (like POSIX), and the corresponding handle could be retrieved only when needed. I'm not sure yet if there are any issues with doing this in general on Windows. This windows.h visibility is the motivation for the separate "monitor" type in my design. The monitor is typically isolated to a few translation units as required, which avoids windows.h in any headers. There are both "monitor" and "scoped_monitor" types. Jeff

Jeff Flinn <Jeffrey.Flinn@gmail.com> writes:
Why is it necessary to include Windows.h at all? Projects that don't want to have a dependency on it usually just redeclare the API function (in this case CreateProcess) and the necessary structs (PROCESS_INFORMATION?). Alex -- Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)

On Fri, Aug 24, 2012 at 10:27:25PM +0100, Alexander Lamaison wrote:
Files like Windows.h are not immutable things. There are many providers of it, like mingw-w64, all the Windows SDKs, the Visual Studio include directories before the introduction of the Platform SDK, and so on. Do you really want the maintenance nightmare of ensuring that your declarations and definitions of everything remain correct for all future releases of the headers, both backwards and forwards? Implementation-wise, I'd say that a mingw-w64 and a WSDK Windows.h differ greatly in implementation, and if you start reinventing them, you'll end up with all sorts of awesome mismatches, particularly if the user then includes the Real Deal. -- Lars Viklund | zao@acc.umu.se

Lars Viklund <zao@acc.umu.se> writes:
The create process API is not going to change!
If your redeclaration of CreateProcess differs from what's in those headers your program won't even link so you'll find out you screwed up pretty quickly. Alex -- Swish - Easy SFTP for Windows Explorer (http://www.swish-sftp.org)

On Sat, Aug 25, 2012 at 08:51:53AM +0100, Alexander Lamaison wrote:
I thought that the discussion had threaded into the generic approach of reinventing a platform header, not your particular use case. For many things, there are rather significant differences in the declarations, of which not all of them are hard errors. I'd rather have a slightly more polluted library that is sure to follow changes going forward, instead of praying that things you cannot foresee won't happen. Worst of all, if you have a slightly different declaration, you'll get massive rantage from the compilers, and at worst, subtle changes in semantics. -- Lars Viklund | zao@acc.umu.se

Lars Viklund <zao@acc.umu.se> writes:
It applies to a core OS API. The whole point of an OS API (more importantly, an OS ABI) is that it cannot change (modulo architecture) or binaries are unable to run.
For many things, there are rather significant differences in the declarations, of which not all of them are hard errors.
The declaration differences are purely syntactic. Eventually they all have to desugar into the exact same types (modulo architecture). Even assembly programmers have to be able to call these functions and know it will work.
There aren't going to be any breaking changes unless we move to 128-bit architectures or Microsoft decide to drop backward compatibility and start Windows again from scratch.
If you have a semantically different declaration it's not going to work anyway. The compiler should ignore syntax differences. Alex

I've updated code and documentation at <http://www.highscore.de/boost/process0.5/>:

- On POSIX set_on_error and throw_on_error now report if execve() failed in the forked process.
- return_on_error has been removed.
- set_env has been added to make it possible to set environment variables.
- close_fds_except has been changed to close_fds_if (it now expects a predicate).
- The executors now also call an initializer function after a system call returned successfully.

More changes to come. Boris
participants (9): Alexander Lamaison, Boris Schaeling, Florian Sowade, Jeff Flinn, Klaim - Joël Lamotte, Lars Viklund, Matus Chochlik, Oliver Kowalke, Roland Bock