[GSoC] Some Ideas about the Boost.Process

Hello,
I have been studying the Boost.Process project idea for a while. Some ideas and opinions popped into my mind and I would like to share them. Please comment.
IMHO, the best design for the library is not to split the APIs between Windows and POSIX. I believe that Boost should provide a higher abstraction level and hide the platform differences.
1. The current implementation of the Boost.Process library has the class boost::process::child and two classes that inherit from it, win32_child and posix_child. I don't believe that this is the best solution. There is only one concept of a child in both APIs; some resources are different, it is true, but the concept is the same. For example, the call "boost::process::win32_launch()" is defined to launch a process "the Windows API way", with the Win32 context. I don't see why we can't create a process with this call on a POSIX system, ignoring what doesn't apply to it (or finding a similar concept and using it instead, as with named pipes <-> FIFOs). After all, if you are using a Win32-only class, there is no way to run on Linux today, so nothing is lost whether the code works there or not. Doing it this way would ease a possible port to another system. This could be done with programming practices such as those suggested by Vicente Botet in a previous topic [1].
2. The wait() method, which waits for a process (a child, actually, but that is another problem) to die, is not asynchronous. Sometimes programmers want a synchronous wait() method, so there is no reason for us to remove it. We should implement a new asynchronous wait() method. This method should not be in the Asio library, because it is process specific. It may require the Asio library, or the asynchronous piece could be implemented in Boost.Process itself. I'm not sure about this; both seem good to me, so please comment.
3. About process creation, I think the following practice from the documentation should be avoided: “Another important requirement is to always pass at least one argument to the executable. This must be the name of the executable itself”. There is no reason to require a parameter that the library can deduce by itself. The first parameter, in this case, should be implicit.
4. Would it really be good for boost::process::launch() to accept any kind of string? This tends to push the programmer towards a single-platform solution. For a multiplatform library, this should be avoided. I'm not sure about this matter either, but if the library is here to provide a higher abstraction level, should it also accept API-related types? I believe it should not.
There are also other points that I would like to discuss in my GSoC proposal. They are well-known problems, so I'll just list them:
- Functionality to list and access other processes
- Named pipes and FIFOs (following item #1, would they be the same thing? Would something change? This is yet to be researched)
- Move wait from child to process (as well as the future asynchronous wait)
- Deal with process killing bugs (as well as any other known issues)
- Test platform issues
Should this be enough for a GSoC proposal? Do you have any other suggestions about what I could discuss in it?
Thank you for your time,
[]'s
[1] http://thread.gmane.org/gmane.comp.lib.boost.devel/188736
-- Felipe de Oliveira Tanus E-mail: fotanus@gmail.com Blog: http://www.itlife.com.br Site: http://www.inf.ufrgs.br/~fotanus/ ----- "All we have to decide is what to do with the time that is given us." - Gandalf

I'm in no position to answer your question. I just wanted to chime in that I'm already using Boost.Process, and really like it. I'd like to see it extended and someday become an official part of Boost.
If you're careful about what you do, you can write code that uses the library without specifying the platform you're on, but I get what you're going for here. Seems like a reasonable suggestion.
As for the first program argument: I already have my own helper function for that, so for someone like me there would certainly be a use, but I could see reasons why someone would intentionally mess with argv[0]. Maybe support the current API and add a convenience function?
I suspect that accepting any class that models a string would be quite difficult, and probably not worth it.
On Tue, Mar 23, 2010 at 10:41 PM, Felipe Tanus <fotanus@gmail.com> wrote:
[...]

Hi John, thank you for the answer. On Wed, Mar 24, 2010 at 3:34 PM, John B. Turpish <jbturp@gmail.com> wrote:
I'm in no position to answer your question. [...]
Since you use the library, I believe you are in a position to give your opinion. That is exactly why I asked on the list, and not only Boris.
[...] but I could see reasons why someone would intentionally mess with argv[0]. Maybe support the current API and add a convenience function?
I can't see why someone would mess with argv[0]. If there is a reason to do that, your suggestion would fit perfectly. Can you please give me an example?
I suspect that accepting any class that models a string would be quite difficult, and probably not worth it.
Agreed. But setting aside the difficulty and whether the patch is worth it, do you think it would be good? I'm not sure, but something tells me that this would make a later migration from one platform to another more difficult.
[]'s

Felipe Tanus wrote:
On Wed, Mar 24, 2010 at 3:34 PM, John B. Turpish <jbturp@gmail.com> wrote:
[...] but I could see reasons why someone would intentionally mess with argv[0]. Maybe support the current API and add a convenience function?
I can't see why someone would mess with argv[0]. Considering that there is a motive to do that, your suggestion would fit perfectly. Can you please give me an example?
<http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=574069>
The reasons come down to making the process look different to ps or similar tools, whether to remove path information, indicate process state, hide the real process name for (weak) security reasons, or make a process reveal its real purpose. In the last case, I mean changing an interpreter's argv[0] to something that relates to the script it is interpreting. Perl allows modifying $0 for example, on many platforms anyway.
None of the foregoing necessarily justifies allowing such manipulation in the Process library.
_____ Rob Stewart robert.stewart@sig.com Software Engineer, Core Software using std::disclaimer; Susquehanna International Group, LLP http://www.sig.com

Hi Stewart, thank you for your reply. On Wed, Mar 24, 2010 at 4:35 PM, Stewart, Robert <Robert.Stewart@sig.com> wrote:
Felipe Tanus wrote:
[...] I can't see why someone would mess with argv[0]. Considering that there is a motive to do that, your suggestion would fit perfectly. Can you please give me an example? [...] The reasons come down to making the process look different to ps or similar tools, whether to remove path information, indicate process state, hide the real process name for (weak) security reasons, or make a process reveal its real purpose. In the latter case, I mean changing an interpreter's argv[0] to something that relates to the script it is interpreting. [...]
That sounds reasonable. In this case, instead of creating a helper as Turpish suggested, I propose adding another method that creates a process assuming argv[0] = process name. Used where appropriate, it would make code more readable by reducing redundancy.
Also, thanks for the info. I haven't researched the matter yet, but maybe argv[0] can help with the problem of identifying and sending signals to other processes on both platforms.
[]'s

Felipe Tanus wrote:
Hi John, thank you for the answer.
On Wed, Mar 24, 2010 at 3:34 PM, John B. Turpish <jbturp@gmail.com> wrote:
[...] but I could see reasons why someone would intentionally mess with argv[0]. Maybe support the current API and add a convenience function?
I can't see why someone would mess with argv[0]. Considering that there is a motive to do that, your suggestion would fit perfectly. Can you please give me an example?
There are executables that behave differently depending on the name in argv[0], such as ccache (http://ccache.samba.org/) which emulates different compilers. Normally this happens by creating appropriately named symlinks, but perhaps someone might want to do it directly? Would the current interface support that? John Bytheway
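To illustrate the kind of program described above, here is a minimal sketch of a tool that picks its behaviour from the name it was started under. It is not ccache's actual code; the function names invoked_name and select_personality and the list of recognised compiler names are assumptions made up for this example.

```cpp
#include <string>

// Hypothetical helper: strip any directory part from argv[0].
// Both '/' and '\\' are treated as separators to keep the sketch
// platform-neutral.
std::string invoked_name(const std::string& argv0)
{
    const std::string::size_type pos = argv0.find_last_of("/\\");
    return pos == std::string::npos ? argv0 : argv0.substr(pos + 1);
}

// Hypothetical dispatcher in the style of a compiler wrapper: the name
// the program was started under selects which tool it stands in for.
std::string select_personality(const std::string& argv0)
{
    const std::string name = invoked_name(argv0);
    if (name == "gcc" || name == "g++" || name == "cc")
        return name;       // behave as a wrapper around that compiler
    return "wrapper";      // started under its own name
}
```

A launcher that lets the caller choose argv[0] freely can trigger either branch without creating a symlink first, which is the use case the question is about.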

Hi Bytheway, thank you for your reply.
There are executables that behave differently depending on the name in argv[0], such as ccache (http://ccache.samba.org/) which emulates different compilers. Normally this happens by creating appropriately named symlinks, but perhaps someone might want to do it directly? Would the current interface support that?
Yes, the current interface supports it, but it also requires every launched process to be given an argv[0] explicitly. I'm proposing to give the user a choice by adding a new method that sets it automatically. It's a simple addition to the library that would make code more readable by reducing redundancy.
[]'s
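A rough sketch of what such a convenience could look like, using only the standard library. The name make_arguments and its signature are hypothetical and are not part of Boost.Process; the point is only that argv[0] defaults to the executable while remaining overridable.

```cpp
#include <string>
#include <vector>

// Hypothetical convenience helper (not a Boost.Process function): builds
// the full argument vector for a new process. When argv0 is left empty,
// the executable path is used as argv[0]; otherwise the caller's value
// is used, which keeps the ccache-style use case possible.
std::vector<std::string> make_arguments(const std::string& executable,
                                        const std::vector<std::string>& args,
                                        const std::string& argv0 = std::string())
{
    std::vector<std::string> result;
    result.reserve(args.size() + 1);
    result.push_back(argv0.empty() ? executable : argv0);
    result.insert(result.end(), args.begin(), args.end());
    return result;
}
```

With this shape the common case needs no repetition of the executable name, and the explicit third argument covers callers who want a different argv[0].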

On Wed, 24 Mar 2010 03:41:42 +0100, Felipe Tanus <fotanus@gmail.com> wrote:
[...]1.The current implementation of Boost.Process library has the class boost::process::child, and two classes that inherit it, win32_child and posix_child. I don't believe that this is the best solution. There [...]This could be done with programming practices as suggested by Vicente Botet, in a previous topic [1]
I agree that Vicente's proposal looks very interesting. In fact there has been a Boost.Process draft from Ilya Sokolov in 2008 without Windows and POSIX types and functions (his drafts can be found at http://www.boostpro.com/vault/index.php?directory=Process). He later said though that he thinks that removing those types and functions was an error (see http://article.gmane.org/gmane.comp.lib.boost.devel/180309). Unfortunately I didn't ask him at that time why he thinks it was an error (probably because it didn't seem to make sense to me at that time either). Now Ilya doesn't seem to follow the Boost mailing lists anymore. :-/
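For readers who want a picture of what "without Windows and POSIX types" could mean in practice, here is a minimal sketch of a single child type whose only platform-dependent part is a typedef. It is not taken from Ilya's draft or from the current Boost.Process code; native_process_handle and the member names are assumptions for illustration.

```cpp
#if defined(_WIN32)
typedef void* native_process_handle;   // stands in for a Win32 HANDLE
#else
#include <sys/types.h>
typedef pid_t native_process_handle;   // POSIX process id
#endif

// Hypothetical single child type: one class on every platform, with the
// OS-specific handle kept as an implementation detail behind a typedef
// instead of separate win32_child and posix_child classes.
class child
{
public:
    explicit child(native_process_handle handle) : handle_(handle) {}

    // Escape hatch for code that needs the raw OS handle.
    native_process_handle native_handle() const { return handle_; }

private:
    native_process_handle handle_;
};
```

The trade-off discussed in this thread is visible here: user code names only child, but anything that is not expressible through the common interface has to go through native_handle() and the OS API directly.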
2.The wait() method, to wait a process (child actually, but this is another problem) to die, is not asynchronous. Sometimes programmers want a synchronous wait() method, so there is no motive for we to remove this. We should implement a new asynchronous wait() method.
I agree.
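One possible building block for a non-blocking wait, sketched for POSIX only: waitpid() with the WNOHANG flag returns immediately instead of blocking, so a caller (or an event loop) can poll for the child's exit. The helper names try_wait and spawn_exiting_child are hypothetical, and this is not how Boost.Process or Asio implement anything; a real asynchronous wait() would still need a notification mechanism and a Windows counterpart.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): checks, without blocking, whether the
// child has terminated. Returns true and fills exit_code once it has.
// Returns false while the child is still running and also when waitpid()
// reports an error; a fuller version would distinguish the two.
bool try_wait(pid_t pid, int& exit_code)
{
    int status = 0;
    const pid_t result = waitpid(pid, &status, WNOHANG);
    if (result != pid)
        return false;
    // -1 marks a child that did not exit normally (e.g. killed by a signal).
    exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return true;
}

// Demo helper: forks a child that exits immediately with the given code.
// Returns the child's pid in the parent, or -1 if fork() failed.
pid_t spawn_exiting_child(int code)
{
    const pid_t pid = fork();
    if (pid == 0)
        _exit(code);   // child: terminate without running parent cleanup
    return pid;
}
```

The existing synchronous wait() corresponds to calling waitpid() without WNOHANG, so both variants can coexist as proposed in the quoted item.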
[...]3.About the process creation, I think the following practice from the documentation should be avoided: “Another important requirement is to always pass at least one argument to the executable. This must be the name of the executable itself”. There is no motive to require a parameter that you can deduce by yourself. The first parameter, in this case, should be implicit.
There should be a way to set the executable name somehow (if only so that developers are not forced to use another library just because Boost.Process can't do it). However, for convenience I agree it makes sense if developers don't have to set the executable name explicitly (the filename can be used by default).
4.Would really be nice boost::process::launch() be able to receive any kind of string? This tend to let the programmer go for a single platform solution. As a multiplatform library, this may be avoided. I'm not sure about this matter as well, but if the library is here to provide a higher abstraction level, should it also accept API related types? I believe it should not.
I also think we can get rid of the Executable concept which currently must be a std::string anyway.
[...]Named pipes and FIFO (following the item #1, would be the same thing? Something would change? Something to be researched yet)
Here the problem is that on Windows only named pipes support asynchronous I/O. Currently a macro has to be defined to make Boost.Process use a named pipe internally. If we distinguish between anonymous and named pipes anyway we could provide appropriate classes which could be used by developers elsewhere. This is something which could even end up in Boost.Interprocess one day. If we want to support asynchronous I/O on Windows we have to think about all of this.
[...]Test platform issues.
The unit tests of Ilya's Boost.Process drafts (the one in the vault) are pretty good because everything works automatically if I remember correctly (and it doesn't with the original Boost.Process draft as described here: http://article.gmane.org/gmane.comp.lib.boost.devel/180285). Boris

Hi Schaeling, thanks for your answer. On Wed, Mar 24, 2010 at 6:28 PM, Boris Schaeling <boris@highscore.de> wrote:
[...] In fact there has been a Boost.Process draft from Ilya Sokolov in 2008 without Windows and POSIX types and functions [...]. He later said though that he thinks that removing those types and functions was an error [...] Unfortunately I didn't ask him at that time why he thinks it was an error (probably because it didn't seem to make sense to me at that time either). [...]
I'll try to contact him and hope he remembers. This is probably the most delicate matter, so I would not like to make any proposal involving it without knowing why removing these API-specific classes would be an error.
I also think we can get rid of the Executable concept which currently must be a std::string anyway.
I agree. Unlike item 1, it would not be hard to re-enable this concept in the future if needed.
Here the problem is that on Windows only named pipes support asynchronous I/O. [..] If we distinguish between anonymous and named pipes anyway we could provide appropriate classes which could be used by developers elsewhere. [...]. If we want to support asynchronous I/O on Windows we have to think about all of this.
I don't have a final idea about this yet, but I'm quite sure we can work around it. One way would be to define two helpers, create_named_pipe and create_anonymous_pipe. The POSIX asynchronous implementation could accept any pipe, and the Windows one only named pipes. Another helper, create_async_pipe, would also be needed; it would create a named pipe on Windows and an unnamed one on POSIX, thus covering all combinations. It is a little tricky, but it is multiplatform and the user keeps control over the names of their pipes.
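The combination logic of the three helpers can be sketched without touching any OS API. In the sketch below the helper names come from the message above, but everything else is an assumption: the functions only return a description (pipe_spec) of the pipe that would be created rather than creating one, and supports_async_io encodes the constraint Boris mentioned (on Windows only named pipes support asynchronous I/O).

```cpp
#include <string>

enum pipe_kind { anonymous_pipe, named_pipe };

// Hypothetical description of a pipe to be created; no OS object is
// created anywhere in this sketch.
struct pipe_spec
{
    pipe_kind kind;
    std::string name;   // empty for anonymous pipes
};

pipe_spec create_anonymous_pipe()
{
    pipe_spec spec;
    spec.kind = anonymous_pipe;
    return spec;
}

pipe_spec create_named_pipe(const std::string& name)
{
    pipe_spec spec;
    spec.kind = named_pipe;
    spec.name = name;
    return spec;
}

// Picks the kind that can be used asynchronously on the platform this
// code is compiled for: named on Windows, anonymous elsewhere.
pipe_spec create_async_pipe(const std::string& name_on_windows)
{
#if defined(_WIN32)
    return create_named_pipe(name_on_windows);
#else
    (void)name_on_windows;          // the name is not needed on POSIX
    return create_anonymous_pipe();
#endif
}

// Encodes the constraint from this thread: on Windows only named pipes
// support asynchronous I/O; on POSIX either kind is acceptable.
bool supports_async_io(const pipe_spec& spec)
{
#if defined(_WIN32)
    return spec.kind == named_pipe;
#else
    (void)spec;
    return true;
#endif
}
```

A caller who does not care about the pipe's name would use create_async_pipe, while create_named_pipe keeps the name under the caller's control on both platforms.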
The unit tests of Ilya's Boost.Process drafts (the one in the vault) are pretty good because everything works automatically if I remember correctly [...]
I'm not sure what issues I was talking about. Sorry for the noise.

On Wed, Mar 24, 2010 at 8:36 PM, Felipe Tanus <fotanus@gmail.com> wrote:
On Wed, Mar 24, 2010 at 6:28 PM, Boris Schaeling <boris@highscore.de> wrote:
[...] In fact there has been a Boost.Process draft from Ilya Sokolov in 2008 without Windows and POSIX types and functions [...]. He later said though that he thinks that removing those types and functions was an error [...] Unfortunately I didn't ask him at that time why he thinks it was an error (probably because it didn't seem to make sense to me at that time either). [...]
I'll try to contact he and hope to he remembers; [...]
Hello,
Ilya answered my e-mail.
[quote] At that time I thought that removing win32_* and posix_* may prevent adding some OS-specific features. I've changed my opinion since then. As I see it now, the library should concentrate on simple use cases, and more complex cases will be simpler to implement using OS API directly. [/quote]
It sure would be far more complex, but IMHO it is worth it. Having a library that is not OS specific could be the very reason someone looks at Boost.Process in the first place. What do you think about trying to remodel the library again? It would be nice if we manage to do it, but there is a risk that later we give up and go back to the first model, wasting our time.

On Thu, 25 Mar 2010 11:48:22 +0100, Felipe Tanus <fotanus@gmail.com> wrote:
[...]It sure would be far more complex, but IMHO it is worth. Having a no OS specific library could be the motive in the first place for someone look for boost.process. What you think about try to remodeling the library again? Would it be nice if we manage to do it, but in the
Sounds good to me! We could look at Ilya's draft to see how he modeled the class interfaces. And we could take the implementation from my draft as it has been thoroughly tested on several platforms. Boris

On Thu, Mar 25, 2010 at 8:30 AM, Boris Schaeling <boris@highscore.de> wrote:
Sounds good to me! We could look at Ilya's draft to see how he modeled the class interfaces. And we could take the implementation from my draft as it has been thoroughly tested on several platforms.
Boris
Great!
I have been thinking about including some basic signal IPC in the library. This would increase the possibilities when a user wants to do IPC. What do you think about using POSIX or Windows API signals depending on the machine? That way, if the programmer uses the defined constants, we could use an equivalent on the other platform. I discovered that Windows has only two signals that are defined with a different number than in POSIX, SIGBREAK and SIGABRT. If it were not for these, defining the constants would be trivial. Also, of course, the signals may be used with user-defined constants.

On Thu, 25 Mar 2010 20:42:01 +0100, Felipe Tanus <fotanus@gmail.com> wrote:
[...]I have been thinking about include some basic signal IPC for the library. This would help increase the possibilities when a user want to do IPC. What do you think about use POSIX or Windows API signals according to the machine? This way if the programmer use the definitions, we could use an equivalent in other platform. I discovery that Windows has only 2 signals which are defined with a different number than POSIX, SIGBREAK and SIGABRT. If was not by theses, would be trivial define the constants.
As far as I am concerned Boost.Process shouldn't be required to support signals. First there are already so many other things to do. Second it's true that Windows doesn't support signals at all (apart from those two you mentioned and I think SIGFPE). You are then in POSIX land anyway? Thus I wouldn't care about signals for now. Boris

Boris, Thanks for your answer On Thu, Mar 25, 2010 at 5:51 PM, Boris Schaeling <boris@highscore.de> wrote:
As far as I am concerned Boost.Process shouldn't be required to support signals. First there are already so many other things to do.
True.
Second it's true that Windows doesn't support signals at all (apart from those two you mentioned and I think SIGFPE). You are then in POSIX land anyway?
Windows supports signals. I got this from signal.h (from a Windows system):
[quote]
#define SIGINT 2 /* interrupt */
#define SIGILL 4 /* illegal instruction - invalid function image */
#define SIGFPE 8 /* floating point exception */
#define SIGSEGV 11 /* segment violation */
#define SIGTERM 15 /* Software termination signal from kill */
#define SIGBREAK 21 /* Ctrl-Break sequence */
#define SIGABRT 22 /* abnormal termination triggered by
[/quote]
More info in MSDN[1]
Thus I wouldn't care about signals for now.
I got your point. There is more important work to do than adding unfinished features to the lib. I'll not include this in my proposal, but I would like to program it someday.
[1] http://msdn.microsoft.com/en-us/library/xdkz3x12%28VS.71%29.aspx

On Thu, 25 Mar 2010 22:16:34 +0100, Felipe Tanus <fotanus@gmail.com> wrote:
[...]
Second it's true that Windows doesn't support signals at all (apart from those two you mentioned and I think SIGFPE). You are then in POSIX land anyway?
Windows support signals. I got this from signal.h (from a windows system):
[quote] #define SIGINT 2 /* interrupt */ #define SIGILL 4 /* illegal instruction - invalid function image */ #define SIGFPE 8 /* floating point exception */ #define SIGSEGV 11 /* segment violation */ #define SIGTERM 15 /* Software termination signal from kill */ #define SIGBREAK 21 /* Ctrl-Break sequence */ #define SIGABRT 22 /* abnormal termination triggered by [/quote]
According to MSDN SIGILL, SIGSEGV and SIGTERM are not generated. Thus we have support for incredible four signals - not sure if it's worth to create a platform independent library because of this. :) Boris

Hi Boris, On Thu, Mar 25, 2010 at 6:37 PM, Boris Schaeling <boris@highscore.de> wrote:
According to MSDN SIGILL, SIGSEGV and SIGTERM are not generated. Thus we have support for incredible four signals - not sure if it's worth to create a platform independent library because of this. :)
Oh, you are right. Without even SIGTERM I guess it's pointless.
[]'s

On Thu, 2010-03-25 at 18:16 -0300, Felipe Tanus wrote:
Boris, Thanks for your answer
On Thu, Mar 25, 2010 at 5:51 PM, Boris Schaeling <boris@highscore.de> wrote:
As far as I am concerned Boost.Process shouldn't be required to support signals. First there are already so many other things to do.
True.
Second it's true that Windows doesn't support signals at all (apart from those two you mentioned and I think SIGFPE). You are then in POSIX land anyway?
Windows support signals. I got this from signal.h (from a windows system):
[quote] #define SIGINT 2 /* interrupt */ #define SIGILL 4 /* illegal instruction - invalid function image */ #define SIGFPE 8 /* floating point exception */ #define SIGSEGV 11 /* segment violation */ #define SIGTERM 15 /* Software termination signal from kill */ #define SIGBREAK 21 /* Ctrl-Break sequence */ #define SIGABRT 22 /* abnormal termination triggered by [/quote]
Don't be fooled by that; that is just the C library. When you set up a handler for SIGBREAK, a background thread is created which waits for Control-C notifications (see the Win32 console API) and then runs the handler. To make this clear: the Win32 subsystem does not support signals! By the way, the POSIX subsystem uses an undocumented NT mechanism (a debug port) to properly implement signals.
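The part of signal handling that the C library does make portable is small: the C standard defines SIGABRT, SIGFPE, SIGILL, SIGINT, SIGSEGV and SIGTERM, plus signal() to install a handler and raise() to deliver a signal to the calling process itself. The sketch below exercises only that subset; the names on_sigint and raise_and_observe_sigint are made up for the example. Sending a signal to another process is not covered by these calls, which is why they do not help with inter-process signalling on Windows.

```cpp
#include <csignal>

namespace {
// Flag written by the handler; sig_atomic_t is the type the standard
// allows a handler to modify safely.
volatile std::sig_atomic_t got_sigint = 0;
}

extern "C" void on_sigint(int)
{
    got_sigint = 1;
}

// Installs a handler for SIGINT, raises SIGINT in this same process,
// restores the previous handler and reports whether the handler ran.
// Nothing here delivers a signal to a different process.
bool raise_and_observe_sigint()
{
    got_sigint = 0;
    void (*previous)(int) = std::signal(SIGINT, on_sigint);
    if (previous == SIG_ERR)
        return false;
    std::raise(SIGINT);            // handler runs before raise() returns
    std::signal(SIGINT, previous);
    return got_sigint == 1;
}
```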
More info in MSDN[1]
Thus I wouldn't care about signals for now.
I got your point. There is more work to do instead keep adding unfinished features to the lib. In my proposal, I'll not include this, but I would like to program this someday.
[1] http://msdn.microsoft.com/en-us/library/xdkz3x12%28VS.71%29.aspx

Hi Santos, On Thu, Mar 25, 2010 at 9:22 PM, Bruno Santos <bsantos@av.it.pt> wrote:
[quote] #define SIGINT 2 /* interrupt */ #define SIGILL 4 /* illegal instruction - invalid function image */ #define SIGFPE 8 /* floating point exception */ #define SIGSEGV 11 /* segment violation */ #define SIGTERM 15 /* Software termination signal from kill */ #define SIGBREAK 21 /* Ctrl-Break sequence */ #define SIGABRT 22 /* abnormal termination triggered by [/quote]
[...] To make this clear, the win32 subsystem does not support signals! By the way, the POSIX subsystem uses an undocumented NT mechanism (a debug port) to properly implement signals.
I got a good lesson here: Don't be fooled by windows header files! Thanks for the warning.
participants (7)
- Boris Schaeling
- Bruno Santos
- Daniel Trebbien
- Felipe Tanus
- John B. Turpish
- John Bytheway
- Stewart, Robert