
Jeremy Maitin-Shepard wrote:
On 09/14/2010 10:58 AM, Stewart, Robert wrote:
Jeremy Maitin-Shepard
On 09/14/2010 04:04 AM, Stewart, Robert wrote:
Indeed, calling waitpid() for each child, in turn, is not a good idea from a scalability perspective. Putting all of Boost.Process' children into a unique process group and using waitpid(), with WNOHANG, to wait for children in that process group could work though.
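To make the suggestion concrete, here is a minimal sketch of a non-blocking reap over one process group. It assumes the children were already placed in a group whose ID is `pgid`. The names `Exited` and `reap_group` are hypothetical and are not part of Boost.Process.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical record of one reaped child (illustration only).
struct Exited {
    pid_t pid;
    int status;  // raw status as filled in by waitpid()
};

// Collect every child in process group `pgid` that has already terminated,
// without blocking. `none_left` is set to true once the caller has no
// children remaining in that group (waitpid() fails with ECHILD).
std::vector<Exited> reap_group(pid_t pgid, bool& none_left) {
    assert(pgid > 0);
    std::vector<Exited> done;
    none_left = false;
    for (;;) {
        int status = 0;
        // A negative first argument selects "any child whose process
        // group ID equals its absolute value".
        pid_t pid = waitpid(-pgid, &status, WNOHANG);
        if (pid > 0) {
            done.push_back({pid, status});
            continue;
        }
        if (pid == 0) break;           // children remain, none ready yet
        if (errno == EINTR) continue;  // interrupted, retry
        none_left = (errno == ECHILD);
        break;
    }
    return done;
}
```

One call drains whatever is ready in the group, so the cost tracks the number of children that have exited rather than the number being watched. Children outside the group are left for whoever else is waiting on them.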
The process group approach does have some nice properties, but I'm not sure it is reasonable for the library to impose such a restriction. (Maybe it is, I am simply not sufficiently familiar with process groups and their intended purpose/practical use to know.) One thing of note is that a process can change its own process group id.
From Wikipedia: "Process groups are used to control the distribution of signals."
The idea is that you can send a signal to all members of a process group, but using a process group to watch for a status change in any member via waitpid() seems quite within scope, particularly since that is part of the waitpid() interface.
I'm also not familiar with the behaviors and uses of process groups so I don't know if this would just shift the problem in some way, albeit to presumably less common application types.
In addition to the fact that you could still run into the same problem if you allow users to change process groups (you would have to loop through arbitrarily many process groups), I feel that imposing non-standard restrictions, even if seemingly minor, is not really appropriate for a Boost Process library.
I've done some more reading on the subject. According to [1], the purpose of process groups is, as suggested by, but not stated in, Wikipedia [2], for shells to manage processes associated with a tty. However, [3] notes that the reason for creating a process group, in the context of the tutorial at least, is so "one may kill all the processes in the process group without having to keep track of how many processes have been forked and all of their process id's." Of course, "kill" can mean sending any signal, not just SIGTERM or SIGKILL.

Note that Boost.Process would need to set the process group for each child and would likely use the first child's PID as the process group ID. Since the parent would not be part of the new process group and the children would inherit the parent's session, the process group won't be orphaned unless the parent process dies. In that case, if any member of the group is stopped, the children will get SIGHUP followed by SIGCONT to alert them to the loss of the controlling process; that should be documented so library clients can consider handling that case (likely by ignoring SIGHUP). See [5] for more on this.

Aside from wanting to signal all child processes, I consider it unusual for an application to set its process group, and much more so for a client of Boost.Process to do so [4]. A Boost.Process client would only want to set the process group of its children to permit the use of kill() to signal them all at once. If there's a mechanism for doing that on Windows, the library would need to provide a portable means to do so, and that means clients wouldn't need to rely on kill() and a process group to support it. Consequently, Boost.Process could preclude clients from manipulating process groups and from using kill() on child processes via a process group.
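The mechanics described above might look roughly like the following on POSIX. This is only a sketch under stated assumptions: `spawn_in_group` and `signal_and_reap` are hypothetical names, not Boost.Process API, and the child runs a plain function where real code would call exec.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper: fork a child and place it in process group `pgid`.
// Passing 0 creates a new group whose ID is the child's own PID, which is
// how the first child would become the group leader.
pid_t spawn_in_group(pid_t pgid, void (*body)()) {
    assert(pgid >= 0);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        setpgid(0, pgid);  // child side
        body();
        _exit(0);
    }
    // Parent side: repeat the call so the group is set no matter which
    // process runs first. A failure here is ignored in this sketch.
    setpgid(pid, pgid);
    return pid;
}

// Hypothetical helper: send `sig` to every member of the group, then block
// until all of the caller's children in that group have been reaped.
// Assumes `sig` terminates the children. Returns the number reaped, or -1
// if the signal could not be sent.
int signal_and_reap(pid_t pgid, int sig) {
    assert(pgid > 1);  // guards against kill(-1, ...), which targets far more
    if (kill(-pgid, sig) != 0) return -1;
    int reaped = 0;
    for (;;) {
        pid_t pid = waitpid(-pgid, nullptr, 0);
        if (pid > 0) { ++reaped; continue; }
        if (errno == EINTR) continue;
        break;  // ECHILD: no children left in the group
    }
    return reaped;
}
```

A caller would pass 0 for the first child, reuse the returned PID as the group ID for later children, and then call something like `signal_and_reap(first, SIGTERM)` to stop them all at once.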
I took a look at the process spawning facilities in GLib and Qt (documentation only). There might be other libraries that would also be relevant. See:
<http://library.gnome.org/devel/glib/2.25/glib-Spawning-Processes.html>
<http://library.gnome.org/devel/glib/2.25/glib-The-Main-Event-Loop.html>
Note: The Main Loop documentation describes the "child watch" facilities, which are particularly relevant to our discussion.
I found little of use there except that they disallow calling waitpid(-1); did I miss something important? Using a process group means there's no need to do that anyway.
It seems a key goal is for Boost Process to be interoperable with the process creation in GLib and Qt. GLib may also provide some useful design inspiration.
I agree on both counts. Avoiding waitpid(-1) seems necessary for this and for the reasons you've stated before. Using waitpid(-pgid), however, would fit nicely.

_____
Rob Stewart                           robert.stewart@sig.com
Software Engineer, Core Software      using std::disclaimer;
Susquehanna International Group, LLP  http://www.sig.com

[1] <http://www.cis.temple.edu/~ingargio/old/cis307s96/readings/docs/signals.html#Pgrps>
[2] <http://en.wikipedia.org/wiki/Process_group>
[3] <http://www.yolinux.com/TUTORIALS/ForkExecProcesses.html>, see the section entitled "Kill all processes in a process group"
[4] A client of Boost.Process, being somewhat hampered by the restraints of portability, would necessarily not have as complex process management needs as a shell, unless I'm much mistaken.
[5] <http://www.win.tue.nl/~aeb/linux/lk/lk-10.html#ss10.2>