Child processes

Angus Leeming

22 Jun 2004

5:34 p.m.

I am in the process of writing a little library to control the interaction of a parent program with any child processes that it spawns. Would there be any interest in such a library here?

Assuming that there would, please excuse me diving in at the deep end. I'm currently mulling over how best to ensure that system resources are cleaned up automatically when the child exits and would value some advice.

Under unix, I've defined a handler for SIGCHLD signals. The handler does two things only:

* reaps the zombie
* stores the return status in a std::map<pid_t, int> completed_children; variable.

Each time a new child is spawned, the pid of the child process is registered in the map, so the handler needs do nothing other than find the entry.

namespace {

std::map<pid_t, int> completed_children;
int status;
sig_atomic_t pid;

}

extern "C"
void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code. It does the absolute minimum in the handler routine. wait() is guaranteed to be async-safe. Moreover, the handler receives only SIGCHLD signals and so cannot receive multiple calls simultaneously. However, I'm unsure whether it is OK to search the map like this. Any advice?

My one other worry is that the SIGCHLD handler could be redefined elsewhere in the code, but AFAICS there's nothing that I can do about that. Furthermore, it means only that the user of my Child library won't be able to ascertain the exit status of child processes launched using it. (If the zombie is reaped elsewhere then a subsequent call to waitpid(pid, &status, WNOHANG); will result in errno being set to ECHILD. I.e., I'll know that the child process has ended but just won't know its exit status.)

Under Windows, I'm proposing to use a separate thread to wait() for each child process. This seems to be the standard way of dealing with the problem, but I'm a total novice when it comes to Windows programming so know only what I've read. Again, any advice would be greatly appreciated.

Regards, Angus


Rob Stewart

22 Jun

8:16 p.m.

From: Angus Leeming <angus.leeming@btopenworld.com>

...

I am in the process of writing a little library to control the interaction of a parent program with any child processes that it spawns. Would there be any interest in such a library here?

I wanted to comment on a question you raised, but don't construe this as interest. I actually don't have a need for such a library, but I can see the value in it.

...

Under unix, I've defined a handler for SIGCHLD signals. The handler does two things only: * reaps the zombie * stores the return status in a std::map<pid_t, int> completed_children; variable. Each time a new child is spawned, the pid of the child process is registered in the map, so the handler needs do nothing other than find the entry.

namespace {

std::map<pid_t, int> completed_children;
int status;
sig_atomic_t pid;

}

extern "C"
void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code. It does the absolute minimum in the handler routine. wait() is guaranteed to be async-safe. Moreover, the handler receives only SIGCHLD signals and so cannot receive multiple calls simultaneously. However, I'm unsure whether it is OK to search the map like this. Any advice?

The code does very little work. The map is not likely to ever be very large; not many applications create and track a large number of child processes. That means that finding the child will be fast; creating the map entry ahead of time saves some work in the signal handler, which is good. The range of PIDs is limited, so maybe a std::deque or std::vector would be better. Once created, accessing the status would be a simple indexing operation. (That requires initializing elements to some sentinel value to indicate no value.) Another scheme to handle large numbers of child processes would be to put a std::pair<pid_t, int> into a queue or stack and let non-signal-handler code populate the data structure from the queued pair.
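The queueing scheme in the last sentence might look roughly like the sketch below. It is only an illustration under stated assumptions: the names (pending, queue_child_handler, drain_into) and the fixed capacity of 64 are hypothetical, and a plain array stands in for the queue because standard containers are not safe to modify inside a handler. The handler reaps and records; ordinary code later moves the records into the map with SIGCHLD blocked.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <cerrno>
#include <map>

namespace {

// Hypothetical fixed-size buffer: filled by the handler, emptied by drain_into().
const int capacity = 64;
struct exit_record { pid_t pid; int status; };
exit_record pending[capacity];
volatile sig_atomic_t pending_count = 0;

}

// The handler only reaps and records; it never touches the map.
extern "C" void queue_child_handler(int)
{
    int const saved_errno = errno;  // waitpid() may overwrite errno
    // One SIGCHLD can stand for several exited children, so loop.
    // If the buffer is full, remaining zombies wait for a later SIGCHLD.
    while (pending_count < capacity) {
        int status;
        pid_t const pid = waitpid(-1, &status, WNOHANG);
        if (pid <= 0)
            break;  // no further exited children right now
        pending[pending_count].pid = pid;
        pending[pending_count].status = status;
        pending_count = pending_count + 1;
    }
    errno = saved_errno;
}

// Called from ordinary (non-handler) code: copy the queued results into the
// map while SIGCHLD is blocked, so the handler cannot run mid-copy.
void drain_into(std::map<pid_t, int>& statuses)
{
    sigset_t block_mask, old_mask;
    sigemptyset(&block_mask);
    sigaddset(&block_mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block_mask, &old_mask);

    for (int i = 0; i < pending_count; ++i)
        statuses[pending[i].pid] = pending[i].status;
    pending_count = 0;

    sigprocmask(SIG_SETMASK, &old_mask, 0);
}
```

With this split the map is only ever touched by ordinary code, so the handler no longer depends on the map's internal state being consistent at the moment the signal arrives.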

...

My one other worry is that the SIGCHLD handler could be redefined elsewhere in the code, but AFAICS there's nothing that I can do about that. Furthermore, it means only that the user of my Child library won't be able to ascertain the exit status of child processes launched using it.

The proper way to install a signal handler is to chain to any previous signal handler for the same signal. Therefore, a library client should write code that ensures that your handler gets called. You could also go the other way around. You could provide a mechanism for clients to install their own handler (a boost::function<void (int)>?) for your handler to call.

-- Rob Stewart stewart@sig.com Software Engineer http://www.sig.com Susquehanna International Group, LLP using std::disclaimer;
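As a rough sketch of the chaining idea, assuming POSIX sigaction() is available: the function below installs a SIGCHLD handler, remembers whatever was installed before, and forwards the signal to it. The names chaining_child_handler, install_chaining_handler and chld_seen are hypothetical, and the "library work" is reduced to setting a flag.

```cpp
#include <signal.h>

namespace {

struct sigaction previous_action;      // whatever was installed before us
volatile sig_atomic_t chld_seen = 0;   // stands in for the library's own work

}

extern "C" void chaining_child_handler(int signum, siginfo_t* info, void* context)
{
    chld_seen = 1;  // the library's own (async-signal-safe) work goes here

    // Forward to the previous handler, whichever style it was installed with.
    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction)
            previous_action.sa_sigaction(signum, info, context);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(signum);
    }
}

// Returns true if the handler was installed and the old one recorded.
bool install_chaining_handler()
{
    struct sigaction sa;
    sa.sa_sigaction = chaining_child_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    return sigaction(SIGCHLD, &sa, &previous_action) == 0;
}
```

One caveat: if the library's handler reaps every zombie before chaining, a previous handler that also calls wait() will find nothing left, so chaining alone does not settle who owns the exit status.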

Angus Leeming

9 p.m.

Rob Stewart wrote:

...

...
I'm pretty sure that the above is safe code. It does the absolute minimum in the handler routine. wait() is guaranteed to be async-safe. Moreover, the handler receives only SIGCHLD signals and so cannot receive multiple calls simultaneously. However, I'm unsure whether it is OK to search the map like this. Any advice?

The code does very little work. The map is not likely to ever be very large; not many applications create and track a large number of child processes. That means that finding the child will be fast; creating the map entry ahead of time saves some work in the signal handler, which is good.

Indeed. But primarily it should be safe, which means I should not create new elements in whatever container I use. [ snip alternative schemes ] Thanks for the ideas. I'll mull them over.

...

...
My one other worry is that the SIGCHLD handler could be redefined elsewhere in the code, but AFAICS there's nothing that I can do about that. Furthermore, it means only that the user of my Child library won't be able to ascertain the exit status of child processes launched using it.

The proper way to install a signal handler is to chain to any previous signal handler for the same signal. Therefore, a library client should write code that ensures that your handler gets called.

You could also go the other way around. You could provide a mechanism for clients to install their own handler (a boost::function<void (int)>?) for your handler to call.

Many thanks for these suggestions. That's exactly what I was looking for. Regards, Angus

Jonathan Biggar

10:26 p.m.

Angus Leeming wrote:

...

namespace {

std::map<pid_t, int> completed_children;
int status;
sig_atomic_t pid;

}

extern "C"
void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code. It does the absolute minimum in the handler routine. wait() is guaranteed to be async-safe. Moreover, the handler receives only SIGCHLD signals and so cannot receive multiple calls simultaneously. However, I'm unsure whether it is OK to search the map like this. Any advice?

That's not safe code, since the child_handler() function could be called at any time, interrupting other code that is modifying the completed_children map.

-- Jon Biggar Floorboard Software jon@floorboard.com jon@biggar.org

Angus Leeming

23 Jun

8:46 a.m.

Jonathan Biggar wrote:

...

Angus Leeming wrote:

...
namespace {

std::map<pid_t, int> completed_children;
int status;
sig_atomic_t pid;

}

extern "C"
void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code. It does the absolute minimum in the handler routine. wait() is guaranteed to be async-safe. Moreover, the handler receives only SIGCHLD signals and so cannot receive multiple calls simultaneously. However, I'm unsure whether it is OK to search the map like this. Any advice?

That's not safe code, since the child_handler() function could be called at any time, interrupting other code that is modifying the completed_children map.

Yes, I've thought of that. We just need to block any signals when we're modifying the map:

unix_reaper::unix_reaper()
{
    signal(SIGCHLD, boost_child_handler);
    sigemptyset(&old_mask_);
    sigemptyset(&new_mask_);
    sigaddset(&new_mask_, SIGCHLD);
}

void unix_reaper::register_child(pid_t pid)
{
    // Block the SIGCHLD signal.
    sigprocmask(SIG_BLOCK, &new_mask_, &old_mask_);

    // This is the map.
    children_[pid] = -1;

    // Unblock the SIGCHLD signal and restore the old mask.
    sigprocmask(SIG_SETMASK, &old_mask_, 0);
}

Does this address your concern?

Angus
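The block/modify/restore pattern above can also be wrapped in a small scope guard, so the old mask is restored on every exit path. This is a sketch only; sigchld_block, children and register_child are hypothetical names (a member function cannot be called register, since that is a reserved word in C++), and a multi-threaded program would need pthread_sigmask() rather than sigprocmask().

```cpp
#include <signal.h>
#include <sys/types.h>
#include <map>

// Blocks SIGCHLD for the lifetime of the object, then restores the old mask.
class sigchld_block {
public:
    sigchld_block()
    {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGCHLD);
        sigprocmask(SIG_BLOCK, &mask, &old_mask_);
    }
    ~sigchld_block() { sigprocmask(SIG_SETMASK, &old_mask_, 0); }

private:
    sigchld_block(sigchld_block const&);             // noncopyable
    sigchld_block& operator=(sigchld_block const&);  // noncopyable
    sigset_t old_mask_;
};

// Stand-in for the map that the SIGCHLD handler reads.
std::map<pid_t, int> children;

void register_child(pid_t pid)
{
    sigchld_block guard;   // the handler cannot run until guard is destroyed
    children[pid] = -1;    // -1 marks "no exit status yet"
}
```

Any ordinary code that reads or erases map entries needs the same guard; blocking only around insertion still leaves other accesses open to interruption.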

Gregory Colvin

22 Jun

10:46 p.m.

On Jun 22, 2004, at 11:34 AM, Angus Leeming wrote:

...

I am in the process of writing a little library to control the interaction of a parent program with any child processes that it spawns. Would there be any interest in such a library here?

Assuming that there would, please excuse me diving in at the deep end. I'm currently mulling over how best to ensure that system resources are cleaned up automatically when the child exits and would value some advice.

Under unix, I've defined a handler for SIGCHLD signals. The handler does two things only: * reaps the zombie * stores the return status in a std::map<pid_t, int> completed_children; variable. Each time a new child is spawned, the pid of the child process is registered in the map, so the handler needs do nothing other than find the entry.

namespace {

std::map<pid_t, int> completed_children;
int status;
sig_atomic_t pid;

}

extern "C"
void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code.

It's unsafe because it's not reentrant. But why do you need to reap the zombie? I've implemented the java process control natives with no need for reaping.

Stefan Seefeld

6:59 p.m.

Gregory Colvin wrote:

...

...
extern "C"
void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code.

It's unsafe because it's not reentrant.

I believe to be truly reentrant yet flexible you'd need something more intrusive, such as a semaphore and then require the application to implement some checking, for example a callback that can be triggered from the main loop if ever the application has one.
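A minimal sketch of the "callback triggered from the main loop" idea follows. Note the substitution: instead of a semaphore it uses a plain volatile sig_atomic_t flag, which is the simplest thing a handler may safely set. The names flag_child_handler, exit_callback and dispatch_exited_children are hypothetical.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>

namespace {

volatile sig_atomic_t child_exited = 0;  // set by the handler, cleared by the main loop

}

// The handler does nothing except record that at least one child has exited.
extern "C" void flag_child_handler(int)
{
    child_exited = 1;
}

typedef void (*exit_callback)(pid_t pid, int status);

// Call this from the application's main loop. It reaps every child that has
// exited and invokes the callback for each one, outside of signal context.
// Returns the number of children reaped.
int dispatch_exited_children(exit_callback callback)
{
    if (!child_exited)
        return 0;
    child_exited = 0;  // clear first, so a signal arriving during the loop is not lost

    int reaped = 0;
    int status;
    pid_t pid;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        if (callback)
            callback(pid, status);
        ++reaped;
    }
    return reaped;
}
```

Because the callback runs in ordinary code, it is free to update a std::map or call any other non-reentrant function. The cost is the intrusiveness mentioned above: the application must call the dispatcher from its own loop.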

...

But why do you need to reap the zombie? I've implemented the java process control natives with no need for reaping.

I don't know how java deals with this but on posix platforms a sub-process holds some resources that have to be claimed back by the parent process. This includes (but is not restricted to) the result of the process, i.e. its return value. In case you are familiar with (posix) threads: there you have to join a thread after it terminated. It's conceptually the same thing here. Regards, Stefan

Gregory Colvin

23 Jun

1:13 a.m.

On Jun 22, 2004, at 12:59 PM, Stefan Seefeld wrote:

...

Gregory Colvin wrote:

...
...
extern "C"
void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code. It's unsafe because it's not reentrant.

I believe to be truly reentrant yet flexible you'd need something more intrusive, such as a semaphore and then require the application to implement some checking, for example a callback that can be triggered from the main loop if ever the application has one.

Some platforms have sighold() and sigrelse(), but not all.

...

...
But why do you need to reap the zombie? I've implemented the java process control natives with no need for reaping.

I don't know how java deals with this but on posix platforms a sub-process holds some resources that have to be claimed back by the parent process.

Which you can do by waiting for the child. If you need the exit value you might as well wait anyway, if you don't you create a thread to wait.

...

This includes (but is not restricted to) the result of the process, i.e. its return value. In case you are familiar with (posix) threads: there you have to join a thread after it terminated. It's conceptually the same thing here.

Regards, Stefan


Stefan Seefeld

22 Jun

9:34 p.m.

Gregory Colvin wrote:

...

...
...
But why do you need to reap the zombie? I've implemented the java process control natives with no need for reaping.

I don't know how java deals with this but on posix platforms a sub-process holds some resources that have to be claimed back by the parent process.

Which you can do by waiting for the child. If you need the exit value you might as well wait anyway, if you don't you create a thread to wait.

yes, this was my point. The platform you are on imposes how to clean up sub-processes, not the language. So you (or the runtime environment) have to deal with the resource management somehow. In case you don't need the return status/value (which I would guess is quite rare) you can do something similar to detaching a thread: you create a new session for the subprocess (in fact, the subprocess in question is then the grand-child of the parent process, i.e. the parent's child process is gone and thus there is nothing to wait for). Anyways, this is getting offtopic... The point I was trying to make was that IMO you normally do want to reclaim dead child processes ('zombies'), so any general solution should provide the means to do that. Regards, Stefan

Gregory Colvin

23 Jun

2:08 a.m.

On Jun 22, 2004, at 3:34 PM, Stefan Seefeld wrote:

...

Gregory Colvin wrote:

...
...
...
But why do you need to reap the zombie? I've implemented the java process control natives with no need for reaping.

I don't know how java deals with this but on posix platforms a sub-process holds some resources that have to be claimed back by the parent process. Which you can do by waiting for the child. If you need the exit value you might as well wait anyway, if you don't you create a thread to wait.

yes, this was my point. The platform you are on imposes how to clean up sub-processes, not the language. So you (or the runtime environment) have to deal with the resource management somehow. In case you don't need the return status/value (which I would guess is quite rare) you can do something similar to detaching a thread: you create a new session for the subprocess (in fact, the subprocess in question is then the grand-child of the parent process, i.e. the parent's child process is gone and thus there is nothing to wait for).

Anyways, this is getting offtopic...

It's on topic, I think, but perhaps a premature excursion into details of implementation.

...

The point I was trying to make was that IMO you normally do want to reclaim dead child processes ('zombies'), so any general solution should provide the means to do that.

Normally you don't want to leak resources, which may or may not involve zombies. And even if zombies are involved, reclaiming them need not require a signal handler, which was my original point.

Stefan Seefeld

22 Jun

10:31 p.m.

Gregory Colvin wrote:

...

Normally you don't want to leak resources, which may or may not involve zombies. And even if zombies are involved, reclaiming them need not require a signal handler, which was my original point.

Oh? Any call to fork() should be accompanied by a call to wait() (or one of its cousins such as 'waitpid()'). The only question is whether that has to happen synchronously or asynchronously. Calling 'wait()' synchronously basically means to block right after 'fork()' has been called until the child process terminates. One technique that uses this approach is to detach the subprocess, i.e. to fork twice, and while the grand-child carries out the actual task, the child finishes immediately. The much more frequent situation is to call it asynchronously, i.e. the call would be either executed from inside the SIGCHLD handler or somehow triggered by it (such as a semaphore, which is reentrant). Do I miss something? Stefan
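The fork-twice technique could be sketched as below, assuming a POSIX system. The function name spawn_detached and its return convention are hypothetical. The parent blocks only for the short-lived intermediate child; the grand-child is orphaned and gets reaped by the system, so the parent never sees its exit status.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// Launches file/argv as a detached grand-child.
// Returns true if the intermediate child was created, forked the
// grand-child successfully, and was reaped by the caller.
bool spawn_detached(char const* file, char* const argv[])
{
    pid_t const child = fork();
    if (child == -1)
        return false;

    if (child == 0) {
        // Intermediate child: start a new session, fork again, exit at once.
        setsid();
        pid_t const grandchild = fork();
        if (grandchild == 0) {
            execvp(file, argv);
            _exit(127);  // reached only if execvp() failed
        }
        _exit(grandchild == -1 ? 1 : 0);
    }

    // Parent: this wait is synchronous but short, because the intermediate
    // child exits immediately after its own fork().
    int status;
    while (waitpid(child, &status, 0) == -1) {
        if (errno != EINTR)
            return false;
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

After the call returns, the parent has no child left to wait for, which is exactly why this approach cannot report the grand-child's exit value.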

Gregory Colvin

23 Jun

3:11 a.m.

On Jun 22, 2004, at 4:31 PM, Stefan Seefeld wrote:

...

Gregory Colvin wrote:

...
Normally you don't want to leak resources, which may or may not involve zombies. And even if zombies are involved, reclaiming them need not require a signal handler, which was my original point.

oh ? Any call to fork()

On systems that have fork().

...

should be accompanied by a call to wait() (or one of its cousins such as 'waitpid()').

On systems that have wait().

...

The only question is whether that has to happen synchronously or asynchronously.

And that question could be answered by the process library, or left to the user.

...

Calling 'wait()' synchronously basically means to block right after 'fork()' has been called until the child process terminates. One technique that uses this approach is to detach the subprocess, i.e. to fork twice, and while the grand-child carries out the actual task, the child finishes immediately.

Which won't work if the parent cares about the child's exit() value?

...

The much more frequent situation is to call it asynchronously, i.e. the call would be either executed from inside the SIGCHLD handler or somehow triggered by it (such as a semaphore, which is reentrant).

Which was the implementation Angus first suggested, which a few of us pointed out was unsafe as written. I meant to suggest the choice could be left to the user. You can also avoid blocking with a separate thread to do the wait(), or by calling waitpid() with WNOHANG in a polling loop.
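For the polling alternative, a non-blocking check of one child might look like this sketch. The enum child_state and the function poll_child are hypothetical names; the only system call used is waitpid() with WNOHANG.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <cerrno>

enum child_state {
    still_running,  // the child exists and has not exited
    exited,         // the child was reaped by this call; status is valid
    not_our_child   // nothing to wait for (already reaped, or never ours)
};

// Non-blocking check of a single child, suitable for a polling loop.
child_state poll_child(pid_t pid, int& status)
{
    pid_t result;
    do {
        result = waitpid(pid, &status, WNOHANG);
    } while (result == -1 && errno == EINTR);

    if (result == 0)
        return still_running;
    if (result == pid)
        return exited;
    return not_our_child;  // typically errno == ECHILD
}
```

A caller would invoke poll_child() once per pass of its event loop for each child it still tracks, and stop tracking a pid as soon as the result is anything other than still_running.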

...

Do I miss something ?

No, I think we agree on what the possibilities are for Posix.

Stefan Seefeld

22 Jun

11:33 p.m.

Gregory Colvin wrote:

...

...
oh ? Any call to fork()

On systems that have fork().

...
should be accompanied by a call to wait() (or one of its cousins such as 'waitpid()').

On systems that have wait().

yes, I'm only talking about posix systems.

...

...
The only question is whether that has to happen synchronously or asynchronously.

And that question could be answered by the process library, or left to the user.

I believe this should be left to the user. However, it would be nice if the library had some way of encapsulating various strategies to execute non-reentrant callbacks in a safe (and of course portable) way.

...

...
Calling 'wait()' synchronously basically means to block right after 'fork()' has been called until the child process terminates. One technique that uses this approach is to detach the subprocess, i.e. to fork twice, and while the grand-child carries out the actual task, the child finishes immediately.

Which won't work if the parent cares about the child's exit() value?

of course not. The above is used to implement daemons.

...

...
The much more frequent situation is to call it asynchronously, i.e. the call would be either executed from inside the SIGCHLD handler or somehow triggered by it (such as a semaphore, which is reentrant).

Which was the implementation Angus first suggested, which a few of us pointed out was unsafe as written. I meant to suggest the choice could be left to the user.

You can also avoid blocking with a separate thread to do the wait(),

that doesn't sound good: each sub-process would have to be accompanied by a thread...that's quite a lot of unnecessary overhead, especially since any reentrant solution would be able to dispatch all sigchilds from a single handler, i.e. the cost would be constant and not dependent on the number of child processes.

...

or by calling waitpid() with WNOHANG in a polling loop.

that's even more expensive. I think any efficient solution to this problem will be quite intrusive, i.e. it won't be possible to provide some sort of black box to which you hand over a callback. It needs tight integration with the rest of the application, such as an event loop or a general purpose signal handler (I used to run one thread per application to be responsible for all signals the process could possibly catch (well, besides those that are thread-bound)). Probably the closest you can get in terms of flexibility / portability is ACE, as was already pointed out earlier. But that's a huge framework, both in terms of scope and of code. That's the kind of scope, though, at which such a facility is best implemented. Regards, Stefan

Gregory Colvin

23 Jun

6:04 a.m.

On Jun 22, 2004, at 5:33 PM, Stefan Seefeld wrote:

...

Gregory Colvin wrote:

...
...
oh ? Any call to fork() On systems that have fork(). should be accompanied by a call to wait() (or one of its cousins such as 'waitpid()'). On systems that have wait().

yes, I'm only talking about posix systems.

But Boost usually cares about other systems as well.

...

...
...
The only question is whether that has to happen synchronously or asynchronously. And that question could be answered by the process library, or left to the user.

I believe this should be left to the user. However, it would be nice if the library had some way of encapsulating various strategies to execute non-reentrant callbacks in a safe (and of course portable) way. ...

It would be nice, but ...

...

...
You can also avoid blocking with a separate thread to do the wait(),

that doesn't sound good: each sub-process would have to be accompanied by a thread...that's quite a lot of unnecessary overhead, especially since any reentrant solution would be able to dispatch all sigchilds from a single handler, i.e. the cost would be constant and not dependent on the number of child processes.

Agreed. I'm just trying to be sure all the possibilities are on the table.

...

...
or by calling waitpid() with WNOHANG in a polling loop.

that's even more expensive.

Unless the app has a main event loop anyway? Anyway, just another possibility.

...

I think, any efficient solution to this problem will be quite intrusive, i.e. it won't be possible to provide some sort of black box which you hand over a callback. It needs tight integration with the rest of the application such as an event loop or a general purpose signal handler (I used to run one thread per application to be responsible for all signals the process could possibly catch (well, beside those that are thread-bound)).

Probably the closest you can get in terms of flexibility / portability is ACE, as was already pointed out earlier. But that's a huge framework, both in terms of scope and of code. That's the kind of scope, though, at which such a facility is best implemented.

Yes, ACE looks like a lot of framework. The Java design is much simpler. I don't know that either is appropriate for Boost.

Angus Leeming

8:47 a.m.

Gregory Colvin wrote:

...

On Jun 22, 2004, at 5:33 PM, Stefan Seefeld wrote:

...
Gregory Colvin wrote:

...
...
oh ? Any call to fork() On systems that have fork(). should be accompanied by a call to wait() (or one of its cousins such as 'waitpid()'). On systems that have wait().

yes, I'm only talking about posix systems.

But Boost usually cares about other systems as well.

Nice to see I triggered some feedback ;-)

I don't think that cleaning up global resources such as zombies should be left to the user if the library can clean them up automatically. Posix platforms emit SIGCHLD signals, so it is natural to use a signal handler to reap the zombies. Windows does not provide such a mechanism, so we are forced to use threads.

I'm still at the stage of building thin wrappers around the OS-specific API. These wrappers have the same public interface, but should be otherwise free to do "what is natural" on that platform. They can then be used as blocks to build more complex classes.

As an example of what I mean, consider "class process". This provides a handle to the child process and, indeed, multiple instances of the class can point to the same child. Thus, it is natural to implement the class as:

class process {
public:
    process() : child_(new process_instance) {}
    process(process_data const & data) { spawn(data); }

    void spawn(process_data const & data)
    {
        child_.reset(new process_instance);
        child_->spawn(data);
    }

    // Other functions invoking child_.
    ...

private:
    boost::shared_ptr<process_instance> child_;
};

The copy semantics of such a class are those of a pointer. Moreover, it becomes relatively easy to ensure that class process is thread safe because shared_ptr has the necessary machinery built in.

process_instance has the semantics that one instance of the class represents a single child. It cannot be copied because the underlying child cannot be duplicated. Such a design allows system resources to be freed unconditionally in the destructor. (Eg, file descriptors can be closed.)

In this design, process_instance is an implementation detail. Posix and Windows have separate implementations of this class, but these implementations have the same public interface. Thereafter, process_instance is free to do what is natural on that platform. On Posix, that means use fork(), exec() and a signal handler. On Windows, something else.

Regards, Angus
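To make the pointer-like copy semantics concrete, here is a small self-contained sketch. It is not the proposed library: std::shared_ptr stands in for boost::shared_ptr, process_instance is a dummy that records a command string instead of calling fork() and exec(), and the helpers spawned() and same_child() are hypothetical additions for illustration.

```cpp
#include <memory>
#include <string>

// Dummy stand-in for the platform-specific class: one instance, one child.
class process_instance {
public:
    process_instance() : spawned_(false) {}

    // A real implementation would fork() and exec() here.
    bool spawn(std::string const& command)
    {
        command_ = command;
        spawned_ = true;
        return true;
    }
    bool spawned() const { return spawned_; }

private:
    process_instance(process_instance const&);             // noncopyable
    process_instance& operator=(process_instance const&);  // noncopyable
    std::string command_;
    bool spawned_;
};

// Copyable handle: copies share the same underlying process_instance.
class process {
public:
    process() : child_(new process_instance) {}

    void spawn(std::string const& command)
    {
        child_.reset(new process_instance);  // detach from any previous child
        child_->spawn(command);
    }
    bool spawned() const { return child_->spawned(); }
    bool same_child(process const& other) const { return child_ == other.child_; }

private:
    std::shared_ptr<process_instance> child_;
};
```

Copying a process copies only the shared pointer, so both handles refer to one child; calling spawn() on one handle afterwards rebinds that handle alone and leaves the other pointing at the original child.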

John Maddock

10:15 a.m.

...

...
Normally you don't want to leak resources, which may or may not involve zombies. And even if zombies are involved, reclaiming them need not require a signal handler, which was my original point.

oh ? Any call to fork() should be accompanied by a call to wait() (or one of its cousins such as 'waitpid()'). The only question is whether that has to happen synchronously or asynchronously.

Calling 'wait()' synchronously basically means to block right after 'fork()' has been called until the child process terminates. One technique that uses this approach is to detach the subprocess, i.e. to fork twice, and while the grand-child carries out the actual task, the child finishes immediately.

The much more frequent situation is to call it asynchronously, i.e. the call would be either executed from inside the SIGCHLD handler or somehow triggered by it (such as a semaphore, which is reentrant).

Do I miss something ?

Surely there's no need to call wait until the parent asks for the return value: In fact I'd kind of like the library to be similar to Boost.Threads - a child process is an object that can be waited upon, the library would only need to do something "fancy" like installing a signal handler, if the child object's destructor is called, without the object ever being waited upon. John.

Angus Leeming

10:38 a.m.

John Maddock wrote:

...

Surely there's no need to call wait until the parent asks for the return value: In fact I'd kind of like the library to be similar to Boost.Threads - a child process is an object that can be waited upon, the library would only need to do something "fancy" like installing a signal handler, if the child object's destructor is called, without the object ever being waited upon.

Consider this:

int main()
{
    child::process mozilla;
    mozilla.spawn("mozilla");
    return 0;
}

Your suggestion, that child::process::~process() invokes wait(), means that this program will not exit until mozilla exits. That doesn't seem reasonable here. Even less so if the function launching mozilla was in a control loop...

child::process is a *handle* on the child. If the handle goes out of scope, then you lose the ability to communicate with the child but you shouldn't kill it.

I think that if you want to terminate the child, then you should be explicit about it. Either invoke process::terminate yourself or have a wrapper class that does so when the wrapper goes out of scope.

Regards, Angus

John Maddock

12:40 p.m.

...

...
Surely there's no need to call wait until the parent asks for the return value: In fact I'd kind of like the library to be similar to Boost.Threads - a child process is an object that can be waited upon, the library would only need to do something "fancy" like installing a signal handler, if the child object's destructor is called, without the object ever being waited upon.

Consider this:

int main() { child::process mozilla; mozilla.spawn("mozilla"); return 0; };

Your suggestion, that child::process::~process() invokes wait(), means that this program will not exit until mozilla exits. That doesn't seem reasonable here. Even less so if the function launching mozilla was in a control loop...

I didn't say that (well I hope I didn't), I'm saying that if there has been no explicit wait then the destructor has to take care of cleanup (as far as is possible in such a case), it should *not* wait as you rightly point out. John.
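One way a destructor could clean up without waiting is sketched below. This is an assumption-laden illustration, not John's design: instead of installing a signal handler from the destructor, it hands a still-running child's pid to a hypothetical orphans() list that ordinary code reaps later with reap_orphans(). The class name child_handle is also hypothetical.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <cstddef>
#include <vector>

// Hypothetical registry of children whose handles were destroyed while the
// child was still running.
std::vector<pid_t>& orphans()
{
    static std::vector<pid_t> pids;
    return pids;
}

class child_handle {
public:
    explicit child_handle(pid_t pid) : pid_(pid) {}

    ~child_handle()
    {
        if (pid_ <= 0)
            return;
        int status;
        // Non-blocking: reap the child if it is already a zombie, but never
        // wait for a child that is still running.
        if (waitpid(pid_, &status, WNOHANG) == 0)
            orphans().push_back(pid_);  // still running: defer the cleanup
    }

private:
    child_handle(child_handle const&);             // noncopyable
    child_handle& operator=(child_handle const&);  // noncopyable
    pid_t pid_;
};

// Call periodically from ordinary code. Drops every orphan that has exited
// (or can no longer be waited for) and returns how many were dropped.
std::size_t reap_orphans()
{
    std::vector<pid_t>& pids = orphans();
    std::size_t dropped = 0;
    for (std::size_t i = 0; i < pids.size(); ) {
        int status;
        if (waitpid(pids[i], &status, WNOHANG) != 0) {
            pids.erase(pids.begin() + i);
            ++dropped;
        } else {
            ++i;
        }
    }
    return dropped;
}
```

The destructor returns immediately in every case, so a program that launches a long-lived child and then leaves scope is not held up, while short-lived children that have already exited are reaped on the spot.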

Angus Leeming

2:16 p.m.

John Maddock wrote:

...

...
...
Surely there's no need to call wait until the parent asks for the return value: In fact I'd kind of like the library to be similar to Boost.Threads - a child process is an object that can be waited upon, the library would only need to do something "fancy" like installing a signal handler, if the child object's destructor is called, without the object ever being waited upon.

Consider this:

int main() { child::process mozilla; mozilla.spawn("mozilla"); return 0; };

Your suggestion, that child::process::~process() invokes wait(), means that this program will not exit until mozilla exits. That doesn't seem reasonable here. Even less so if the function launching mozilla was in a control loop...

I didn't say that (well I hope I didn't), I'm saying that if there has been no explicit wait then the destructor has to take care of cleanup (as far as is possible in such a case), it should *not* wait as you rightly point out.

Ok, John, I now see what you're talking about. Apologies for getting the wrong end of the stick first time around. See below for my interpretation of how your suggestion would look in terms of code.

Here's my take on it:

1. If I don't invoke is_running() or exit_status() then any zombie is left until the destructor is called. That seems to be bad, period.

2. The resulting code seems overly complex. If I install a signal handler then is_running() and exit_status() become a trivial lookup of the std::map<pid_t, int> that stores the exit status of reaped children.

3. What happens if a signal handler has been installed? Eg, by the destructor of a previous instance of the class. Should the signal handler reap only specific pids? Otherwise, the calls to waitpid below are going to fail because the zombie will have been reaped already.

Can you explain why you favour doing things your way? (Assuming that I've interpreted "your way" correctly this time ;-)

Regards,
Angus

/** @class unix_instance controls the interaction of the parent
 *  with a child process.
 *
 *  An instance of unix_instance maps to a single child process.
 *  It cannot be copied because the underlying child process cannot
 *  be duplicated.
 *
 *  This child process is launched through the @c spawn() member
 *  function which can be invoked successfully only once.
 *  Thereafter, an internal flag is set and an attempt to launch
 *  a second child will fail.
 */
class unix_instance : noncopyable {
public:
    /**
     *  Initializes member data but is otherwise a no-op.
     *  The @c spawn() member function must be invoked explicitly
     *  to launch a child process.
     */
    unix_instance()
        : ppid_(0), exit_status_(-1), error_(0),
          has_been_spawned_(false)
    {}

    /**
     *  Closes any open file descriptors.
     *  Can do this because of the one-to-one semantic between
     *  unix_instance and the child process.
     *
     *  If (ppid_ != 0), then the child is still running. Install
     *  a signal handler to ensure that the zombie is reaped.
     */
    ~unix_instance();

    /**
     *  @brief Report whether the process is consuming
     *  system resources.
     *  @return @c true if the child process has been launched and
     *  has not yet exited.
     */
    bool is_running()
    {
        if (ppid_ == 0)
            // The process has not yet been started or
            // has been reaped already.
            return false;

        int status;
        pid_t const pid = ::waitpid(ppid_, &status, WNOHANG);
        if (pid == 0)
            return true;

        if (pid == -1) {
            // Shouldn't happen? The process has been reaped already.
            ppid_ = 0;
            return false;
        }

        // The process has ended and the zombie was reaped by the
        // call to ::waitpid.
        ppid_ = 0;
        exit_status_ = status;
        return false;
    }

    /**
     *  @return The exit status of the child process.
     *  If the process has not completed (or, indeed, started),
     *  returns -1.
     */
    int exit_status()
    {
        // Code similar to that in is_running(), above.
    }

    /**
     *  @return The error code of the most recently failed
     *  operation, or zero.
     */
    int error() const { return error_; }

    /**
     *  @brief Launch the child process.
     *  @param data user-specified data defining the child process.
     *  @returns true if the process was started successfully.
     *
     *  This function can be called successfully only once for any
     *  one instance of the class. Thereafter, an internal flag is
     *  set and the function will no longer attempt to launch a
     *  child process. This occurs because the class has the
     *  semantics that one instance of the class maps to a single
     *  child process.
     *
     *  If an error occurs whilst launching a child process, then
     *  the error code will be set to one of the possible errors
     *  for @c pipe(), @c fork() etc. See your system's
     *  documentation for these error codes.
     *
     *  If this happens, then the user <em>can</em> invoke
     *  spawn again.
     */
    bool spawn(process_data const & data);

private:
    /// PID of process.
    pid_t ppid_;
    /// Holds the exit status of the child process.
    int exit_status_;
    /// Holds errno if, eg, @c ::pipe() or @c ::fork() fails.
    int error_;
    /**
     *  Used internally to flag whether a child process has been
     *  spawned successfully. A process can be spawned once only,
     *  so once @c spawn() returns @c true, any subsequent invocation
     *  will return @c false.
     */
    bool has_been_spawned_;
};
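The class above stores the raw value filled in by ::waitpid as its "exit status". As a rough illustration only, the sketch below shows one way a single non-blocking poll could separate "still running", "reaped by this call" and "reaped elsewhere", and decode the exit code with the standard WIFEXITED/WEXITSTATUS macros. The names poll_result and poll_child are hypothetical and do not appear anywhere in the thread.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper type: the outcome of one non-blocking poll.
struct poll_result {
    bool running;    // true if the child has not yet exited
    bool reaped;     // true if this call collected the zombie
    int  exit_code;  // decoded exit code, or -1 if unknown or killed by a signal
};

// Hypothetical helper: poll one child without blocking.
poll_result poll_child(pid_t pid)
{
    assert(pid > 0);
    poll_result r = { false, false, -1 };

    int status = 0;
    pid_t const rc = ::waitpid(pid, &status, WNOHANG);
    if (rc == 0) {
        // The child exists and has not changed state.
        r.running = true;
        return r;
    }
    if (rc == -1) {
        // Typically ECHILD: not our child, or already reaped elsewhere.
        // The exit code is not recoverable in that case.
        return r;
    }

    // The zombie was reaped by this call.
    r.reaped = true;
    if (WIFEXITED(status))
        r.exit_code = WEXITSTATUS(status);
    return r;
}
```

A caller would invoke poll_child() repeatedly (for example once per pass through an event loop) until running becomes false. Children terminated by a signal are reported here with exit_code == -1; a fuller version would probably expose WIFSIGNALED and WTERMSIG as well.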

Gregory Colvin

2:46 p.m.

On Jun 23, 2004, at 8:16 AM, Angus Leeming wrote:

...

... Can you explain why you favour doing things your way? (Assuming that I've interpreted "your way" correctly this time ;-)...

I can't speak for John, but I wouldn't want to pay the price for your signal handler if I'm going to wait for the child anyway.

Angus Leeming

3:52 p.m.

Gregory Colvin wrote:

...

On Jun 23, 2004, at 8:16 AM, Angus Leeming wrote:

...
... Can you explain why you favour doing things your way? (Assuming that I've interpreted "your way" correctly this time ;-)...

I can't speak for John, but I wouldn't want to pay the price for your signal handler if I'm going to wait for the child anyway.

Right. In light of this and the comments from both John Maddock and John Fuller, I think that it is reasonable to say that waiting for child processes is a complex undertaking. What I propose to do, therefore, is to farm it off to a separate class with an interface something like:

class unix_reaper : noncopyable {
public:
    static unix_reaper & get();

    bool is_running(pid_t);
    int exit_status(pid_t);

    // Invoked from unix_process::spawn().
    void register_child(pid_t);
    // Invoked from unix_process::~unix_process().
    void clean_up(pid_t);

private:
    unix_reaper();
    ~unix_reaper();
};

Thereafter, how this class goes about its business is an implementation detail that is independent of the rest of the business of interacting with a child process. I'll code up versions that do it using either a signal handler or using John M's approach and will get on with finishing off unix_process itself.

Many thanks for the input,
Angus
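For readers following along, here is a minimal sketch of how the polling variant of such a reaper might look, assuming that "John M's approach" means calling ::waitpid with WNOHANG on demand rather than installing a signal handler. The class name polling_reaper, the member add() and the private entry type are all hypothetical; the sketch is single-threaded and written in the C++ of the time.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <map>

// Hypothetical polling-based reaper; not the code posted to the list.
class polling_reaper {
    struct entry {
        bool running;  // false once the child is known to have gone
        int  status;   // raw wait status, or -1 if it was never obtained
        entry() : running(true), status(-1) {}
    };

public:
    static polling_reaper & get()
    {
        static polling_reaper instance;
        return instance;
    }

    // Start tracking a freshly spawned child.
    void add(pid_t pid)
    {
        assert(pid > 0);
        children_[pid] = entry();
    }

    bool is_running(pid_t pid)
    {
        entry * e = find_and_poll(pid);
        return e && e->running;
    }

    // Raw wait status, or -1 if unknown, untracked or still running.
    int exit_status(pid_t pid)
    {
        entry * e = find_and_poll(pid);
        return (e && !e->running) ? e->status : -1;
    }

    // Stop tracking. A child that is still running is simply forgotten.
    void clean_up(pid_t pid) { children_.erase(pid); }

private:
    polling_reaper() {}
    polling_reaper(polling_reaper const &);              // not copyable
    polling_reaper & operator=(polling_reaper const &);  // not assignable

    entry * find_and_poll(pid_t pid)
    {
        std::map<pid_t, entry>::iterator it = children_.find(pid);
        if (it == children_.end())
            return 0;
        entry & e = it->second;
        if (e.running) {
            int status = 0;
            pid_t const rc = ::waitpid(pid, &status, WNOHANG);
            if (rc == pid) {
                e.running = false;
                e.status = status;
            } else if (rc == -1) {
                // Reaped elsewhere (ECHILD): the status is lost.
                e.running = false;
            }
        }
        return &e;
    }

    std::map<pid_t, entry> children_;
};
```

Note the trade-off raised elsewhere in the thread: with this design a zombie lingers until somebody asks about it, and clean_up() on a still-running child leaves nobody to reap it. A signal-handler-based implementation could hide behind the same interface.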

Mathew Robertson

25 Jun

1:02 a.m.

...

...
Can you explain why you favour doing things your way? (Assuming that I've interpreted "your way" correctly this time ;-)...

I can't speak for John, but I wouldn't want to pay the price for your signal handler if I'm going to wait for the child anyway.

That would work... until you run out of system resources due to not cleaning up after your children. The price to pay for installing a signal handler is less (on most POSIX systems) than the price of leaving stale process data hanging around in the OS.

Mathew Robertson

Angus Leeming

23 Jun

8:56 a.m.

Gregory Colvin wrote:

...

...
namespace {

std::map<pid_t, int> completed_children;

int status; sig_atomic_t pid;

}

extern "C" void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code.

It's unsafe because it's not reentrant.

Really? I thought that POSIX guaranteed that a handler receiving only one type of signal (SIGCHLD in this case) would not receive multiple calls simultaneously? Is this not true in the presence of threads?

...

But why do you need to reap the zombie? I've implemented the java process control natives with no need for reaping.

Addressed elsewhere. Angus
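One common way to sidestep the reentrancy question is to keep the handler trivial and touch the std::map only from ordinary code. The sketch below is an illustration under that assumption, not something proposed in the thread: the handler merely sets a flag, and a hypothetical drain_children() function, called from the program's main loop, does the reaping and the map update. The names note_sigchld, install_sigchld_flag_handler and drain_children are invented for this example.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstring>
#include <map>

namespace {

// Set by the handler, cleared by drain_children().
volatile sig_atomic_t child_exited = 0;

// Touched only from ordinary (non-handler) code.
std::map<pid_t, int> completed_children;

}

// The handler does nothing except record that a SIGCHLD arrived.
extern "C" void note_sigchld(int) { child_exited = 1; }

// Hypothetical helper: install the handler with sigaction().
bool install_sigchld_flag_handler()
{
    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);
    sa.sa_handler = note_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return ::sigaction(SIGCHLD, &sa, 0) == 0;
}

// Hypothetical helper: reap every exited child and record the status
// of those that were registered. Returns the number of children reaped.
int drain_children()
{
    if (!child_exited)
        return 0;
    // Clear the flag first so that a signal arriving during the loop
    // is not lost.
    child_exited = 0;

    int reaped = 0;
    for (;;) {
        int status = 0;
        pid_t const pid = ::waitpid(-1, &status, WNOHANG);
        if (pid <= 0)
            break;  // no more exited children (or no children at all)
        std::map<pid_t, int>::iterator it = completed_children.find(pid);
        if (it != completed_children.end())
            it->second = status;
        ++reaped;
    }
    return reaped;
}
```

The loop matters because several children may exit before the handler runs once; pending SIGCHLD signals are not queued individually. Be aware that waitpid(-1, ...) also reaps children forked by third-party code, so a library that cares about that would wait on its own pids only.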

John Fuller

2:24 p.m.

Just some memories about UNIX child processes here:

- On UNIX systems where I have a daemon fork children, often I build into its service loop a nonblocking wait for any suspended children (WNOHANG). I agree it's good to get the return status of the children.

- As a last resort, I suppose one could ioctl the file in the proc directory for a process.

- Using SIGCHLD can be tricky if some children are forked by third party code: for example a database client "shadow process" may have its own handling, may change its own process group, and/or on termination may have an unexpected effect on the parent. Additionally, modifying the priority of children can have unexpected results in these cases as well.

- I recall that signal handling in general requires a lot of care due to the asynchronous nature of signals.

- I recall that process and group permissions are a tricky area: some capabilities require sudo authority and some children are better off running as different users, but the other side of the problem is preventing malicious use of the capabilities.

On Jun 23, 2004, at 3:56 AM, Angus Leeming wrote:

...

Gregory Colvin wrote:

...
...
namespace {

std::map<pid_t, int> completed_children;

int status; sig_atomic_t pid;

}

extern "C" void child_handler(int)
{
    pid = wait(&status);
    std::map<pid_t, int>::iterator it = completed_children.find(pid);
    if (it != completed_children.end())
        it->second = status;
}

I'm pretty sure that the above is safe code.

It's unsafe because it's not reentrant.

Really? I thought that POSIX guaranteed that a handler receiving only one type of signal (SIGCHLD in this case) would not receive multiple calls simultaneously? Is this not true in the presence of threads?

...
But why do you need to reap the zombie? I've implemented the java process control natives with no need for reaping.

Addressed elsewhere.

Angus


Jeff Garland

22 Jun

11:24 p.m.

On Tue, 22 Jun 2004 18:34:47 +0100, Angus Leeming wrote

...

I am in the process of writing a little library to control the interaction of a parent program with any child processes that it spawns. Would there be any interest in such a library here?

Sign me up as an interested party. Spawning and controlling processes is extremely handy for doing integration. I don't have time to critique your code at the moment, but you might have a look at the ACE library implementations of these same concepts. I realize it might be tough to look at the implementation, but it is a C++ cross-platform solution that I've used successfully several times now.

http://www.dre.vanderbilt.edu/Doxygen/Current/html/ace/classACE__Process.htm...
http://www.dre.vanderbilt.edu/Doxygen/Current/html/ace/classACE__Process__Ma...

Jeff

