[GSoC] Opinions and suggestions for improving CGI library

Hi everyone,

I hope to participate in this year's Google Summer of Code by working on a project for Boost, and would like to get opinions and suggestions on what I intend to do.

In GSoC 2007, Darren Garvey worked on a CGI library under the mentorship of Christopher Kohlhoff. This sounded interesting to me, as my first forays into programming were for the Web with PHP, but I never actually used CGI with C++, even though I have been aware of cgicc and Wt for some time.

I contacted Darren, who informed me that while he was still working on the library, he lacked the time to implement many features, and would welcome help. There were a number of features that he highlighted, but three of them stood out for me:

* Persistent sessions
From my experience with PHP, session handling tends to be a heavily used feature that this CGI library lacks. I intend to design an interface for session handling that would allow the use of memory mapped files (with the help of Boost.Interprocess) and relational databases as storage options. I will also implement session handling with one of the possible storage options.

* Improved multipart/form-data parsing
Currently, the parsing is done with regex, which is something of a hack in this case. The aim is to replace the use of regex with the use of Boost.Spirit. From what I see, this is also an area that needs more testing, hence developing the test suite would be an important part of this subproject.

* Windows support for FastCGI
Portability is one of the goals of Boost, and this is of interest to me personally as I use Windows quite often. However, I am unable to conceive of a solution to this (or even pinpoint the problem) right now, but I believe that with time I can figure it out, especially since Darren has some idea as to what might need to be done.

For reference:
Darren's proposal abstract: http://code.google.com/soc/2007/boost/appinfo.html?csaid=5869D5120647336D
Current documentation: http://omnisplat.com/docs/
In Boost sandbox: https://svn.boost.org/svn/boost/sandbox/SOC/2007/cgi/

Thank you,
Eugene Wee

Eugene Wee wrote:
Hi everyone,
I hope to participate in this year's Google Summer of Code by working on a project for Boost, and would like to get opinions and suggestions on what I intend to do.
In GSoC 2007, Darren Garvey worked on a CGI library under the mentorship of Christopher Kohlhoff. This sounded interesting to me, as my first forays into programming were for the Web with PHP, but I never actually used CGI with C++, even though I have been aware of cgicc and Wt for some time.
I contacted Darren, who informed me that while he was still working on the library, he lacked the time to implement many features, and would welcome help. There were a number of features that he highlighted, but three of them stood out for me:
Eugene, It would be great if you could work with Darren to get this library ready for review. Just my two cents. -Thorsten

Hi Eugene,

2009/3/26 Eugene Wee <crystalrecursion@gmail.com>
I hope to participate in this year's Google Summer of Code by working on a project for Boost, and would like to get opinions and suggestions on what I intend to do.
In GSoC 2007, Darren Garvey worked on a CGI library under the mentorship of Christopher Kohlhoff. This sounded interesting to me, as my first forays into programming were for the Web with PHP, but I never actually used CGI with C++, even though I have been aware of cgicc and Wt for some time.
I contacted Darren, who informed me that while he was still working on the library, he lacked the time to implement many features, and would welcome help. There were a number of features that he highlighted, but three of them stood out for me:
Having spoken to Eugene off-list, I'm really interested in reading his proposal. I was (another) one of those people who thought I'd be able to make time to complete the GSoC work after it ended - but here we are nearly 2 years later and there are still some glaring areas that I haven't had time to properly implement, document and write unit tests for. Completing it (or any of the other unfinished GSoC projects for that matter) would be really quite cool, IMHO. I think the features Eugene mentioned should fit into the GSoC timeline, as long as the proposal is reasonably focused. I'd love to get involved in helping, but I'm not sure how best to do that. Is there an 'approved list of mentors' (note: this question has already been asked here: http://lists.boost.org/Archives/boost/2009/03/149883.php)? Cheers, Darren

Darren Garvey wrote:
I think the features Eugene mentioned should fit into the GSoC timeline, as long as the proposal is reasonably focused.
I'm hoping to find a way for students to make their proposals public, to facilitate open discussion. This depends on the Google team working on the GSoC webapp. I'll let you know more as I get it. Meanwhile Eugene should put his proposal up someplace public... -t

2009/3/27 troy d. straszheim <troy@resophonic.com>
Darren Garvey wrote:
I think the features Eugene mentioned should fit into the GSoC timeline, as long as the proposal is reasonably focused.
I'm hoping to find a way for students to make their proposals public, to facilitate open discussion. This depends on the Google team working on the GSoC webapp. I'll let you know more as I get it. Meanwhile Eugene should put his proposal up someplace public...
Perhaps I could put myself forward to mentor this project? Clearly it's up to the discretion of others, but consider my name in the hat nonetheless. Kind regards, Darren

Darren Garvey wrote:
2009/3/27 troy d. straszheim <troy@resophonic.com>
Darren Garvey wrote:
I think the features Eugene mentioned should fit into the GSoC timeline, as long as the proposal is reasonably focused.
I'm hoping to find a way for students to make their proposals public, to facilitate open discussion. This depends on the Google team working on the GSoC webapp. I'll let you know more as I get it. Meanwhile Eugene should put his proposal up someplace public...
Perhaps I could put myself forward to mentor this project? Clearly it's up to the discretion of others, but consider my name in the hat nonetheless.
For now, why don't you apply as a mentor on the GSoC website? Of course it'd be great to have you work with Eugene and try to bring the library to review. We can dig up a 'statutory mentor' and make this happen, should all the other selection criteria fall into line. -t

Hi,

I'm hoping to find a way for students to make their proposals public, to facilitate open discussion. This depends on the Google team working on the GSoC webapp. I'll let you know more as I get it. Meanwhile Eugene should put his proposal up someplace public...

As suggested, I have uploaded my proposal draft to a web page: http://www.comp.nus.edu.sg/~weehke/gsoc/proposal2009.html

Thank you,
Eugene

Eugene Wee wrote:
As suggested, I have uploaded my proposal draft to a web page: http://www.comp.nus.edu.sg/~weehke/gsoc/proposal2009.html
Hi Eugene, Quoting from your web page:
Session handling allows data to persist across requests. Each session has a unique session identifier, and stores an arbitrary number of name/value pairs. The name will be a string, while the value will be a serializable object. The session identifier could be propagated via a cookie, the query string, or a hidden form field. To avoid session fixation, session identifiers can be regenerated.
At the moment, the CGI library lacks session handling entirely. My aim is to design a session handler interface that allows for pluggable storage formats: a (possibly memory mapped) file or a database table could be used to store the session's name/value pairs. As such, this part of the project will involve designing both the session interface and the session handler interface, and then an implementation will be made with at least one possible storage format.
Web applications have persistent data associated with sessions, but they also have persistent data associated with e.g. a username or page, etc. I would think that IFF the framework provides a good way to access cookies, form variables etc AND handle the data storage (database, Boost.Serialization, Boost.Interprocess) THEN the user should be allowed to plumb these together as they wish: there are many possible combinations with no "one size fits all".
Currently, multipart/form-data parsing is done with regular expressions. However, regular expressions are not adequate to describe the structure, hence this is difficult to read and maintain. I intend to re-implement multipart/form-data parsing by using Boost.Spirit.
multipart/form-data is basically MIME, so providing a library that can also be used more generally would be useful. It would be a shame to implement a MIME parser and then to hide it as an implementation detail inside another library. I have previously implemented a basic multipart/form-data parser using "traditional" string manipulation (i.e. find, substr etc). It's not clear to me that Boost.Spirit offers a significant benefit for this task. Anyway, this is all important stuff that we should have in Boost. Go for it. Phil.
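To make the "traditional" string-manipulation approach concrete, here is a minimal sketch of splitting a multipart/form-data body with nothing but find and substr. It is not Phil's parser and not the CGI library's code: the names Part and split_multipart are hypothetical, and the sketch assumes CRLF line endings, a well-formed body, and at least one header line per part. It leaves the header lines unparsed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One part of a multipart body (hypothetical type, for illustration only).
struct Part {
    std::string headers;  // raw header lines, before the blank line
    std::string body;     // content after the blank line, without the trailing CRLF
};

// Split a multipart body on "--" + boundary using only find/substr/compare.
// Assumes CRLF line endings and well-formed input; stops quietly on truncation.
std::vector<Part> split_multipart(const std::string& data,
                                  const std::string& boundary)
{
    assert(!boundary.empty());
    const std::string delim = "--" + boundary;
    std::vector<Part> parts;

    std::string::size_type pos = data.find(delim);
    while (pos != std::string::npos) {
        pos += delim.size();

        // "--" right after the delimiter marks the closing boundary.
        if (data.compare(pos, 2, "--") == 0) break;

        // Skip the CRLF that ends the delimiter line.
        if (data.compare(pos, 2, "\r\n") == 0) pos += 2;

        // The part runs up to the CRLF that precedes the next delimiter.
        const std::string::size_type next = data.find("\r\n" + delim, pos);
        if (next == std::string::npos) break;  // truncated input

        const std::string chunk = data.substr(pos, next - pos);
        const std::string::size_type blank = chunk.find("\r\n\r\n");

        Part part;
        if (blank == std::string::npos) {
            part.headers = chunk;  // headers only, empty body
        } else {
            part.headers = chunk.substr(0, blank);
            part.body = chunk.substr(blank + 4);
        }
        parts.push_back(part);

        pos = next + 2;  // now points at the next delimiter
    }
    return parts;
}
```

A function like this stays readable at this size. Whether it remains so once header parsing, nested multipart content, and malformed input are handled is the open question behind the Boost.Spirit suggestion.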

Hi Phil,

Thank you for your feedback :)

Web applications have persistent data associated with sessions, but they also have persistent data associated with e.g. a username or page, etc. I would think that IFF the framework provides a good way to access cookies, form variables etc AND handle the data storage (database, Boost.Serialization, Boost.Interprocess) THEN the user should be allowed to plumb these together as they wish: there are many possible combinations with no "one size fits all".

The CGI library already provides ways to access cookies and form variables. I think that, in general, data storage is separate from this library. Certainly, with say a database library and what is already implemented, a user can implement his/her own session handling, but the point is to provide a convenient interface to do so rather than require users to reinvent it. Since data storage is separate, I view data storage types as something that should be pluggable.

multipart/form-data is basically MIME, so providing a library that can also be used more generally would be useful. It would be a shame to implement a MIME parser and then to hide it as an implementation detail inside another library.

I agree, but I am not sure if the GSoC timeframe will allow me to be so general and yet work on the other two aspects that I wish to improve for the CGI library. However, this could be a precursor to such a library (perhaps for GSoC 2010, haha).

I have previously implemented a basic multipart/form-data parser using "traditional" string manipulation (i.e. find, substr etc). It's not clear to me that Boost.Spirit offers a significant benefit for this task.

Certainly, I would like to keep things simple, and the main point is that the current use of regular expressions is overly complex and difficult to read and maintain. Should I then modify my proposal to aim to investigate the best way of re-implementing this part of the CGI library, and then perform the re-implementation?

Thanks,
Eugene
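For readers following the pluggable-storage discussion, a session interface with interchangeable back ends might look roughly like the sketch below. Every name here (SessionStore, MemoryStore, Session) is hypothetical and none of it is the CGI library's API. Values are plain strings rather than serializable objects, and generating an unguessable identifier is left to the caller.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical names throughout; this is not the CGI library's interface.
using SessionData = std::map<std::string, std::string>;

// Storage back end. A file, a memory-mapped region, or a database table
// would each provide its own implementation of these three operations.
class SessionStore {
public:
    virtual ~SessionStore() = default;
    virtual bool load(const std::string& id, SessionData& out) = 0;
    virtual void save(const std::string& id, const SessionData& data) = 0;
    virtual void erase(const std::string& id) = 0;
};

// Simplest possible back end: keeps all sessions in process memory.
class MemoryStore : public SessionStore {
public:
    bool load(const std::string& id, SessionData& out) override {
        auto it = sessions_.find(id);
        if (it == sessions_.end()) return false;
        out = it->second;
        return true;
    }
    void save(const std::string& id, const SessionData& data) override {
        sessions_[id] = data;
    }
    void erase(const std::string& id) override { sessions_.erase(id); }

private:
    std::map<std::string, SessionData> sessions_;
};

// Name/value pairs bound to one session identifier.
class Session {
public:
    Session(SessionStore& store, std::string id)
        : store_(store), id_(std::move(id)) {
        assert(!id_.empty());
        store_.load(id_, data_);  // an unknown id simply starts empty
    }

    const std::string& id() const { return id_; }

    void set(const std::string& name, const std::string& value) {
        data_[name] = value;
    }

    // Returns an empty string when the name is not present.
    std::string get(const std::string& name) const {
        auto it = data_.find(name);
        return it == data_.end() ? std::string() : it->second;
    }

    // Move the data to a fresh identifier to counter session fixation.
    void regenerate(const std::string& new_id) {
        assert(!new_id.empty());
        store_.erase(id_);
        id_ = new_id;
        store_.save(id_, data_);
    }

    // Write the current name/value pairs back to the store.
    void commit() { store_.save(id_, data_); }

private:
    SessionStore& store_;
    std::string id_;
    SessionData data_;
};
```

Because Session only sees the SessionStore interface, swapping MemoryStore for a file-backed or database-backed store would not change calling code. That also leaves room for the "plumb it together yourself" usage Phil describes.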

Hi Eugene, Some time ago I proposed a new parser for program_options that would allow CGI parameters to be retrieved the program_options way. It was based on a personal implementation, and I had told Darren that it would be great if he could finalize his library so that this parser could be based on it. Since then I've had many personal things to do and was busy with the geometry library, but I still want to add this parser. However, I don't think it will handle FastCGI, at least at first. That is another reason to reimplement my parser in terms of Darren's library. So I'm interested in your proposal. Bruno
participants (6)
- Bruno Lalande
- Darren Garvey
- Eugene Wee
- Phil Endecott
- Thorsten Ottosen
- troy d. straszheim