CGI / FastCGI library update

Hi,

I have packaged a version of a CGI / FastCGI library* up to sourceforge and would very much appreciate feedback and critique from interested parties. Feedback from this list has been invaluable in the past.

You can download the library from: http://sf.net/projects/cgi/files

The documentation still needs work, which I am doing as I can find the time. It can be found online at: http://cgi.sf.net/docs and a brief feature list can be found at: http://cgi.sf.net

The library interface has reached a relatively stable point after a savage bunch of changes recently and I'm quite pleased with how it is shaping up. A brief list of the library's features:

* Out of the box support for FastCGI and CGI, tested with Apache 2.2 and mod_fcgid on Windows XP, 7 and Linux.
* Type-safe access to different request data (eg. get / post / environment / cookies)
* Access to request data in STL-like constructs, with some CGI-specific helper functions.
* Optional support for user-defined, strongly-typed sessions.
* A lightweight utility wrapper for using HTML / XML / etc. templates - dubbed "stencils".
* A traits-based design that allows library users to alter the implementation of parts of the library at compile time.

For reference, the code is also in the Boost sandbox at: https://svn.boost.org/svn/boost/sandbox/SOC/2007/cgi/trunk

Cheers, Darren

* the one started way back in the GSoC of 2007...

I have packaged a version of a CGI / FastCGI library* up to sourceforge and would very much appreciate feedback and critique from interested parties. Feedback from this list has been invaluable in the past.
I took a look at the project for about half an hour. I have **very good** experience with C++ web application and library development, as I'm the author of CppCMS. So I looked especially at the points I know tend to be weak. Let's start.

Project
-------
Please describe what your final target is.

Protocols:
----------
- Unix domain sockets - MUST
- Make request an abstract class rather than a concept, unless you want to recompile your application to work with each type of connection. You should only have to change configuration to work over Unix sockets, TCP sockets, or with SCGI instead of FastCGI. Believe me, I've been there; I know what I'm talking about.
  The way asio works is fine for the general case of implementing specific protocols, but very bad for applications that may work with different sources of data. So if I were doing a formal review I would say that this is a no-go solution.
- I'd recommend you also implement SCGI, as it is very simple and very similar to FastCGI in abilities.
- You need to handle signals: http://www.fastcgi.com/docs/faq.html#Signals That is how the web server will shut down your application.

Unicode
-------
- What is basic_cookie<wchar_t>? Have you seen anything like that in an RFC? How would you convert a cookie's octets to wide characters? What encoding? What library do you use for code-page conversion? Just use a plain string. Don't try to push wchar_t into your application or you'll get into deep trouble. Want Unicode? Use UTF-8 and stay away from "wide characters"; they will make your life much harder and will not bring you a single advantage over UTF-8. I've been there too...

Acceptors
---------
You need acceptors to be configurable to work with:
- Unix domain sockets, from an arbitrary external socket and **stdin**
- TCP/IP sockets, from an arbitrary port and **stdin**

Cookies:
--------
- Use max-age rather than an expiration date; it is much more reliable across systems with unsynchronized clocks.
- You must not URL-decode cookies (see the appropriate RFC).
- You must parse quoted cookies as well. A cookie such as foo="שלום" is actually a valid cookie.

Sessions:
---------
- Not thread safe. The code you have written is a no-go. You need to do some hard work to make it safe.
- Session management: I would strongly recommend against using boost::serialization for this purpose. Performance is terrible (from my experience). But this is up to you.

General Notes:
--------------
- Don't use boost::lexical_cast for conversion between numbers and strings - you may be surprised what happens with it when you start using localization... (bad things)

Web Servers:
------------
- Test with more than one web server: test with lighttpd and nginx, as you'll find some things differ.
- I can say from the beginning that you will have problems with IIS. I don't think the IIS FastCGI connector is good enough today. You will probably need to work over pipes. IIS FastCGI has some TCP/IP support, but I don't think it is mature enough.

Now a few additional, unrelated points:
---------------------------------------
Take a look at the two following projects:
- CppCMS (the web framework I develop): http://art-blog.no-ip.info/wikipp/en/page/main The wiki is built on CppCMS.
- CgiCC is a CGI library (that supports FastCGI as well) and supports almost everything you have in your code (maybe not sessions, but I don't think your session implementation is safe at this point anyway). I used this library in CppCMS 0.0.x but removed it from CppCMS 1.x.x for several reasons. Note it is licensed under the LGPL and not the Boost license.

Best, Artyom

Hi Artyom, Thanks for taking a look. I'm aware of CppCMS so I appreciate the comments. On 19 May 2010 10:14, Artyom <artyomtnk@yahoo.com> wrote:
Project -------
Please describe what is your final target.
Noted.
Protocols: ----------
- Unix domain sockets - MUST
As noted in the docs, this is a future goal. What makes them such a necessity?
- Make request an abstract class rather than a concept, unless you want to recompile your application to work with each type of connection. You should only have to change configuration to work over Unix sockets, TCP sockets, or with SCGI instead of FastCGI.
Believe me, I've been there; I know what I'm talking about.
As I see it, recompiling just to use Unix sockets instead of TCP is unacceptable, but recompiling to use a different protocol is ok. The way the library is currently designed makes the Protocol a strictly compile-time choice. A library user can switch between protocols at runtime using, for example:

    fcgi::service service;
    fcgi::acceptor acceptor(service);
    if (acceptor.is_cgi())
      // handle a cgi::request
    else
      // handle an fcgi::request (or more than one)

A single, templated request handler can work transparently with cgi::request or fcgi::request, so the above code is about all that is needed to support both.

CGI and FastCGI are different enough that there is non-trivial overhead in supporting both transparently. I would rather a user explicitly do something like the above than have the library impose this overhead on them... That said, I could be convinced that there is a valid use-case for needing to support both transparently. As an example, FastCGI doesn't require stdio at all, so simple FastCGI applications tend to be smaller than an equivalent CGI one, which pays the overhead of pulling in <stdio>. Complex FastCGI applications will tend to be very different to CGI ones due to FastCGI's different abilities, which may well include expensive state information.

The way asio works is fine for the general case of implementing specific protocols, but very bad for applications that may work with different sources of data.
So if I were doing a formal review I would say that this is a no-go solution.
- I'd recommend you also implement SCGI as it is very simple and very similar to fcgi in abilities.
Indeed. I started support for SCGI in the past but have removed those parts for now since they aren't complete. Adding support is just a matter of finding the time, as the design of the library should support an implementation.
- You need to handle signals: http://www.fastcgi.com/docs/faq.html#Signals That is how the web server will shut down your application.
For now users have to handle signals, although I agree that adding some handling of signals and allowing a graceful shutdown would be useful.

Unicode
-------
- What is basic_cookie<wchar_t>? Have you seen anything like that in an RFC? How would you convert a cookie's octets to wide characters? What encoding? What library do you use for code-page conversion?
Just use a plain string.
Fair point. I'll remove this.
Don't try to push wchar_t into your application or you'll get into deep trouble. Want Unicode? Use UTF-8 and stay away from "wide characters"; they will make your life much harder and will not bring you a single advantage over UTF-8.
I've been there too...
I agree that I18N support is a must, but unfortunately I'm pretty ignorant with respect to Unicode. Every char and string in the library derives from the traits of the protocol being used, so this is configurable throughout the library, but I have not done any more to support wide characters. I think basic_char<wchar_t> is the only typedef that uses wchar_t and it isn't used anywhere, fwiw. Presumably different decoding / encoding algorithms are required for wide chars too, and these are not yet included. Patches are welcome!

Acceptors
----------
You need acceptors to be configurable to work with:
- Unix domain sockets from arbitrary external socket and **stdin**
- TCP/IP sockets from arbitrary port and **stdin**
The acceptors currently work on one type of connection, based on the Protocol. Support for Unix domain sockets is pending, as mentioned in the docs. TCP sockets are supported from an arbitrary port on Linux using:

    fcgi::service service;
    fcgi::acceptor acceptor(service, 8008); // port 8008
    // ...

This was the default behaviour on both Windows and Linux until recently, when I swapped out TCP for anonymous pipes on Windows to make getting started easier. The above is still possible on Windows, but only by defining a custom protocol that uses TCP, which is a bit OTT. I will add an example to the documentation that shows how to do this for now, but supporting multiple transport types would be nicer.

Cookies:
--------
- Use max-age rather than an expiration date; it is much more reliable across systems with unsynchronized clocks.
- You must not URL-decode cookies (see the appropriate RFC).
I've found conflicting resources on this, eg. http://cephas.net/blog/2005/04/01/asp-java-cookies-and-urlencode/ Rereading the RFC, there is no mention of decoding, so I'll stop that by default (but might make it configurable at compile-time).
- You must parse quoted cookies as well. A cookie such as foo="שלום" is actually a valid cookie.
Interesting, I wasn't aware this was valid for receiving cookies on a server. Do you have a reference for this?
Sessions: --------
- Not thread safe. The code you have written is a no-go. You need to do some hard work to make it safe.
The default session support is very basic. Session support has been included in a specifically configurable way, so a library user only needs to define a session type (the base class for the session) and a session manager type, which provides three public functions in order to work. The idea is that real-world apps might want to plug their own database libraries in place of the built-in (optional) session support. This needs to be documented; added to my TODO.

- Session management: I would strongly recommend against
using boost::serialization for this purpose. Performance is terrible (from my experience).
What would you suggest as an alternative?
But this is up to you.
General Notes: --------------
- Don't use boost::lexical_cast for conversion between numbers and strings - you may be surprised what happens with it when you start using localization... (bad things)
I think supporting lexical conversion is a requirement of any CGI library, so the as<> and pick<> data access functions are provided for this use. In the absence of a better alternative, if these two functions were documented with a caveat about lexical_cast, would that be sufficient?
Web Servers: ------------
- Test with more than one web server: test with lighttpd and nginx, as you'll find some things differ.
Indeed, I am testing with lighttpd and nginx at the moment. I have found an experimental plugin for nginx that supports "proper" multiplexing too, which looks promising.

- I can say from the beginning that you will have problems with IIS. I don't think the IIS FastCGI connector is good enough today. You will probably need to work over pipes.
FastCGI works with IIS on Windows, as anonymous pipes are used on that platform. I see the part of the documentation that is misleading you, so I'll fix that. The FastCGI connector in IIS certainly seems tailored to PHP use, as it doesn't reuse connections particularly well. This may be an issue with my configuration though.

Now a few additional, unrelated points:
------------------------------------
Take a look at the two following projects:
- CppCMS (the web framework I develop) http://art-blog.no-ip.info/wikipp/en/page/main
The wiki is built on CppCMS.
Your library is higher-level than the one I am proposing and somewhat orthogonal. This library aims to provide a means of developing several different high-level web development frameworks.
- CgiCC is a CGI library (that supports FastCGI as well) and supports almost everything you have in your code (maybe not sessions, but I don't think your session implementation is safe at this point anyway). I used this library in CppCMS 0.0.x but removed it from CppCMS 1.x.x for several reasons. Note it is licensed under the LGPL and not the Boost license.
CgiCC is not compatible with the BSL so I can't use it. It also does not support lazy loading of requests or multiplexed FastCGI, which have always been goals of this library. Thanks for the feedback. Cheers, Darren

- Unix domain sockets - MUST
As noted in the docs, this is a future goal. What makes them such a necessity?
Because they are much more efficient than TCP/IP ones, so if you deploy an application on Unix you will want to use them. They also have several other advantages, like not accidentally colliding with another application listening on the same port.
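For reference, accepting on a Unix domain socket with plain Boost.Asio looks roughly like this (a sketch only, not the CGI library's API; the socket path is an arbitrary example):

    #include <boost/asio.hpp>
    #include <cstdio>

    namespace local = boost::asio::local;

    int main()
    {
        boost::asio::io_service io_service;

        std::remove("/tmp/app.sock");  // remove any stale socket file first
        local::stream_protocol::endpoint endpoint("/tmp/app.sock");
        local::stream_protocol::acceptor acceptor(io_service, endpoint);

        local::stream_protocol::socket socket(io_service);
        acceptor.accept(socket);       // the web server connects here
        // ... read the FastCGI records from `socket` as usual ...
    }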
As I see it, recompiling just to use Unix sockets instead of TCP is unacceptable,
I agree that there is no reason to do this for FastCGI and CGI (even the official libfcgi supports it), but once you implement SCGI you will want to be able to switch between FastCGI and SCGI without recompilation.
Indeed. I started support for SCGI in the past but have removed those parts for now since they aren't complete. Adding support is just a matter of finding the time, as the design of the library should support an implementation.
I fully understand this; I was just suggesting it. Note that the SCGI protocol is so simple it can be implemented in several hours. (At least, that's what it took for CppCMS.)
I agree that I18N support is a must, but unfortunately I'm pretty ignorant with respect to Unicode. Every char and string in the library derives from the traits of the protocol being used, so this is configurable throughout the library, but I have not done any more to support them. I think basic_char<wchar_t> is the only typedef that uses wchar_t and it isn't used anywhere,
Once again, don't live under the illusion that wide-character support gives you any advantage for Unicode. See: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html#myths
- You must parse quoted cookies as well. A cookie such as foo="שלום" is actually a valid cookie.
Interesting, I wasn't aware this was valid for receiving cookies on a server. Do you have a reference for this?
Take a look at: http://www.ietf.org/rfc/rfc2109.txt I would generally recommend always referring to the proper RFC.
Session management: I would strongly recommend against using boost::serialization for this purpose. Performance is terrible (from my experience).
What would you suggest as an alternative?
Use key/value pairs, as they are the most popular kind of data stored in sessions, and allow a value to be a serializable object for complex data structures.
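A minimal sketch of that idea (hypothetical types, not part of the proposed library): session data as a flat string map, with typed access going through locale-neutral streams. Serializing it is then just writing key=value pairs, with no boost::serialization involved.

    #include <locale>
    #include <map>
    #include <sstream>
    #include <stdexcept>
    #include <string>

    struct kv_session
    {
        std::map<std::string, std::string> data;

        template<typename T>
        void set(std::string const& key, T const& value)
        {
            std::ostringstream oss;
            oss.imbue(std::locale::classic());   // keep numbers locale-independent
            oss << value;
            data[key] = oss.str();
        }

        template<typename T>
        T get(std::string const& key) const
        {
            std::map<std::string, std::string>::const_iterator it = data.find(key);
            if (it == data.end())
                throw std::runtime_error("no such session key: " + key);
            std::istringstream iss(it->second);
            iss.imbue(std::locale::classic());
            T value;
            iss >> value;
            return value;
        }
    };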
I think supporting lexical conversion is a requirement of any CGI library, so the as<> and pick<> data access functions are provided for this use. In the absence of a better alternative, if these two functions were documented with a caveat about lexical_cast, would that be sufficient?
You can always use std::stringstream (which is what lexical_cast actually uses), but you must imbue a std::locale into the stream, like this:

    template<typename Number>
    Number to_number(std::string const &s)
    {
        std::stringstream ss;
        ss.imbue(std::locale::classic()); // parse independently of the global locale
        ss.str(s);
        Number r;
        ss >> r;
        if (!ss)
            throw std::bad_cast();        // the original left the exception type unspecified
        return r;
    }
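Usage of the helper above would then be, for example:

    int    port  = to_number<int>("8008");     // parsed with the classic locale
    double ratio = to_number<double>("0.25");  // unaffected by any global locale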
Indeed, I am testing with lighttpd and nginx at the moment. I have found an experimental plugin for nginx that supports "proper" multiplexing too, which looks promising.
A few words about multiplexing:

1. There is not a single web server that implements multiplexing, as it is much simpler and more efficient to just open another socket.
2. More than that, the only web server I have ever seen using Keep-Alive was Cherokee (and I think IIS over pipes, because pipes are not sockets).
3. Even the official FastCGI library does not support multiplexing.
4. There is always a way to tell the web server whether the application supports multiplexing or not (one of the commands of FastCGI).
5. There is a deep problem with multiplexing: there is no way to tell the FastCGI application that it can't send data in the meantime. For example, you have two clients downloading a big 1 GB CSV file, and one has a connection twice as slow as the other. If they share a multiplexed connection, then either both clients will receive data at the lowest speed, or the web server will have to store about 0.5 GB in its internal buffers.

So multiplexing is generally a bad idea. My suggestion - don't waste your time on it. It is a useless feature that theoretically could be useful, but nobody uses it. Also, the FastCGI specification allows you, as a library developer, not to support multiplexing (actually, I haven't seen any FastCGI client that implements multiplexing).
CgiCC is not compatible with the BSL so I can't use it. It also does not support lazy loading of requests or multiplexed FastCGI, which have always been goals of this library.
As I mentioned before, multiplexed FastCGI exists only on paper. Don't waste your time. Best regards, Artyom

Hi Artyom, On 20 May 2010 08:04, Artyom <artyomtnk@yahoo.com> wrote:
- Unix domain sockets - MUST
As noted in the docs, this is a future goal. What makes them such a necessity?
Because they are much more efficient than TCP/IP ones, so if you deploy an application on Unix you will want to use them.
I'm going to add this support soon so I can take some performance metrics. I'd be surprised if the performance difference was huge, as TCP sockets are pretty efficient these days.

They also have several other advantages, like not accidentally colliding with another application listening on the same port.
IIUC, mod_fcgid will only try to use a free port, and will carry on regardless if it finds one in its working range that is already in use. The library will simply use the port (or ports) assigned to it unless you explicitly bind to a particular port. I've considered adding support to the library for accepting on a range of ports, but it slipped my mind! Boost.Range might be a fit for this purpose, so I'll get back to it.
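For illustration, finding a free port within a range with plain Boost.Asio (a sketch only, not the library's interface) could look like this:

    #include <boost/asio.hpp>
    #include <stdexcept>

    using boost::asio::ip::tcp;

    // Try each port in the half-open range [first, last) until one binds.
    unsigned short bind_in_range(tcp::acceptor& acceptor,
                                 unsigned short first, unsigned short last)
    {
        for (unsigned short port = first; port < last; ++port)
        {
            boost::system::error_code ec;
            acceptor.open(tcp::v4(), ec);
            acceptor.bind(tcp::endpoint(tcp::v4(), port), ec);
            if (!ec)
            {
                acceptor.listen();
                return port;              // found a free port
            }
            acceptor.close(ec);           // port in use; try the next one
        }
        throw std::runtime_error("no free port in the given range");
    }

Usage would be along the lines of: tcp::acceptor a(io_service); unsigned short port = bind_in_range(a, 8000, 8010);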
As I see it, recompiling just to use Unix sockets instead of TCP is unacceptable,
I agree that there is no reason to do this for FastCGI and CGI (even the official libfcgi supports it), but once you implement SCGI you will want to be able to switch between FastCGI and SCGI without recompilation.
I don't think there's any foolproof way of doing this interrogation automatically? If it's a manual, runtime configuration option anyway, then the library user would be able to use the sort of selection I mentioned in my previous post. Again, there is non-trivial overhead in supporting both SCGI and FastCGI under the hood, so library users shouldn't have to pay for this unless they want it.
Indeed. I started support for SCGI in the past but have removed those parts for now since they aren't complete. Adding support is just a matter of finding the time, as the design of the library should support an implementation.
I fully understand this; I was just suggesting it. Note that the SCGI protocol is so simple it can be implemented in several hours. (At least, that's what it took for CppCMS.)
Ok.
Once again, don't live under the illusion that wide-character support gives you any advantage for Unicode.
Oh I'm not.
See: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html#myths
Looks interesting. You should submit this for review.
Session management: I would strongly recommend against using boost::serialization for this purpose. Performance is terrible (from my experience).
What would you suggest as an alternative?
Use key/value pairs, as they are the most popular kind of data stored in sessions, and allow a value to be a serializable object for complex data structures.
Sounds like reinventing Boost.Serialization, but with fewer features. I'd rather support things that people already know & use when there is something available. FWIW, adding a SessionManager that uses Boost.Interprocess is on my TODO list.
I think supporting lexical conversion is a requirement of any CGI library,
so the as<> and pick<> data access functions are provided for this use. In the absence of a better alternative, if these two functions were documented with a caveat about lexical_cast, would that be sufficient?
You can always use std::stringstream (which is what lexical_cast actually uses), but you must imbue a std::locale into the stream, like this:
    template<typename Number>
    Number to_number(std::string const &s)
    {
        std::stringstream ss;
        ss.imbue(std::locale::classic());
        ss.str(s);
        Number r;
        ss >> r;
        if (!ss)
            throw std::bad_cast();
        return r;
    }
If library users want this they can use it like so:

    cgi::request req;
    int first_way  = req.get.as<int>("some-get-variable");
    int second_way = atoi(req.get["some-get-variable"].c_str());
    int third_way  = to_number<int>(req.get["some-get-variable"]);
    assert(first_way == second_way && second_way == third_way);

All of the request data is available in types that are implicitly convertible to a string and have a c_str() function that returns a const char_type*. For example:

    cgi::request req;
    // You can get full access to the request data in its "native" type.
    cgi::form_part& data = req.uploads["some-file"];
    // All "native" request data types are implicitly convertible to a string.
    // In the case of CGI, the string_type is std::string so this works too:
    std::string upload_filename = req.uploads["some-file"];

Does that give you what you need?
Indeed, I am testing with lighttpd and nginx at the moment. I have found an experimental plugin for for nginx that supports "proper" multiplexing too, which looks promising.
Few words about multiplexing:
1. There is not a single web server that implements multiplexing, as it is much simpler and more efficient to just open another socket.
Zeus does and so does the nginx implementation I mentioned.
2. More than that, the only web server I have ever seen using Keep-Alive was Cherokee (and I think IIS over pipes, because pipes are not sockets).
What does this mean?
3. Even the official FastCGI library does not support multiplexing.
I know, but that's no reason not to support it.
4. There is always a way to tell the web server whether the application supports multiplexing or not (one of the commands of FastCGI).
Again, this is just a workaround.
5. There is a deep problem with multiplexing: there is no way to tell the FastCGI application that it can't send data in the meantime.
For example, you have two clients downloading a big 1 GB CSV file, and one has a connection twice as slow as the other.
So if they share a multiplexed connection, then either both clients will receive data at the lowest speed, or the web server will have to store about 0.5 GB in its internal buffers.
So multiplexing is generally a bad idea.
I've thought long and hard about this and I don't think your assertion is correct. The library doesn't support restartable form parsing right now because of the problem you're talking about. With careful design and some clear caveats for users, I believe this can be made a non-issue.

My suggestion - don't waste your time on it. It is a useless feature
that theoretically could be useful but nobody uses it.
Also, the FastCGI specification allows you, as a library developer, not to support multiplexing (actually, I haven't seen any FastCGI client that implements multiplexing).
Peter Simons' libfastcgi does support multiplexing, but isn't a complete client library, just a protocol driver. This library "potentially" supports multiplexing. There are a couple of things that need to change internally to support it but as soon as I get my hands on a server I think it's doable.
CgiCC is not compatible with the BSL so I can't use it. It
also does not support lazy loading of requests or multiplexed FastCGI, which have always been goals of this library.
As I mentioned before multiplexed FastCGI exists only on paper. Don't waste your time.
I assure you it's real, but I'm not going to defend it much until there are some performance metrics supporting it. The goal of supporting multiplexing is less about pure speed than about resources. Having 1 connection per request means you can only support N simultaneous requests, where N is not really that huge of a number on most machines. Cheers, Darren

I agree that there is no reason to do this for FastCGI
and CGI
(even official libfcgi supports it) but if you once implement SCGI you will want to be able to switch between FCGI/SCGI without recompilation.
I don't think there's any foolproof way of doing this interrogation automatically?
If it's a manual, runtime configuration option anyway, then the library user would be able to use the sort of selecting I mentioned in my previous post.
I'd rather prefer to write, at the beginning:

    boost::cgi::acceptor ac;
    if(scgi)
        ac.open(scgi, tcp::ip);
    else
        ac.open(fcgi, unix);

    boost::cgi::request r;
    ac.accept(r);

rather than checking whether it is SCGI or FastCGI at every point. As I can see from your code, you have a separate class type for each protocol, and that is bad (IMHO).
The goal of supporting multiplexing is less about pure speed than about resources. Having 1 connection per request means you can only support N simultaneous requests, where N is not really that huge of a number on most machines.
epoll/kqueue/devpoll allow you to support them efficiently, with N as big as the number of file descriptors in the process. Is 10,000 low? I don't think so (see the C10K problem). Best, Artyom

On 20 May 2010 20:54, Artyom <artyomtnk@yahoo.com> wrote:
I'd rather prefer to write, at the beginning:
boost::cgi::acceptor ac;
if(scgi) ac.open(scgi,tcp::ip); else ac.open(fcgi,unix)
Doesn't this suffer from having to link support for SCGI and FastCGI into any application that uses the library? This could be implemented as a higher-level wrapper over the components I'm proposing.
boost::cgi::request r ac.accept(r);
Rather than checking whether it is SCGI or FastCGI at every point.
As I can see from your code, you have a separate class type for each protocol, and that is bad (IMHO).
You can use templated functions and handle them all the same way. fcgi::request and cgi::request are different types, but they have the same functions, interface and semantics. A single function:

    template<typename Request>
    void handle_request(Request& req);

can work transparently with any protocol. In the linked example (as ugly as it is), you can use any request type in a uniform way with request_handler: https://svn.boost.org/trac/boost/browser/sandbox/SOC/2007/cgi/trunk/libs/cgi...
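To make that concrete, a rough sketch based on the snippets earlier in this thread (accessor names as shown there; the response side is left as a comment since its interface isn't shown here):

    #include <string>

    // One handler template, usable with cgi::request, fcgi::request, etc.
    template<typename Request>
    int handle_request(Request& req)
    {
        std::string name = req.get["name"];   // same accessors for every protocol
        // ... build and write the response for this request ...
        return 0;
    }

    // Dispatch once at startup, as in the acceptor example earlier:
    //   if (acceptor.is_cgi()) { cgi::request req;  /* load it */   handle_request(req); }
    //   else                   { fcgi::request req; /* accept it */ handle_request(req); }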
The goal of supporting multiplexing is less about pure
speed than about resources. Having 1 connection per request means you can only support N simultaneous requests, where N is not really that huge of a number on most machines.
epoll/kqueue/devpoll allow you to support them efficiently, with N as big as the number of file descriptors in the process.
10,000 is low? I don't think so (see 10K problem)
It's not a huge number is it?* 10K simultaneous connections was an issue when the C10K paper was written, and AFAICT it's still an issue. Roy Fielding and his "Waka" are worth reading about if you don't see the point in multiplexing. His arguments are very convincing. Cheers, Darren * Earth's population is increasing by about 9K people / hour, according to the CIA factbook.

Artyom wrote:
- Make request an abstract class rather than a concept, unless you want to recompile your application to work with each type of connection. You should only have to change configuration to work over Unix sockets, TCP sockets, or with SCGI instead of FastCGI.
Believe me, I've been there; I know what I'm talking about.
An abstraction layer can always be built around low-level template code. Going the other way is impossible, so please avoid virtual functions as the only choice. -Thorsten

- Make request an abstract class rather than a concept, unless you want to recompile your application to work with each type of connection.
An abstraction layer can always be built around low-level template code. Going the other way is impossible, so please avoid virtual functions as the only choice.
-Thorsten
Have you ever written a FastCGI/SCGI application? If so, can you please give me any rationale for what the difference is (from the application's point of view) between:

- FastCGI over TCP/IP
- FastCGI over Unix domain sockets
- SCGI over TCP/IP
- SCGI over Unix domain sockets

If so, give me one reason why my application should be compiled separately for each of these? Good luck :-) Artyom

Artyom wrote:
- Make request an abstract class rather than a concept, unless you want to recompile your application to work with each type of connection.
An abstraction layer can always be built around low-level template code. Going the other way is impossible, so please avoid virtual functions as the only choice.
Have you ever written a FastCGI/SCGI application?
Nope.
If so, can you please give me any rationale for what the difference is (from the application's point of view) between:
- FastCGI over TCP/IP
- FastCGI over Unix domain sockets
- SCGI over TCP/IP
- SCGI over Unix domain sockets
If so, give me one reason why my application should be compiled separately for each of these?
Good luck :-)
(I presume you want the library to include a compiled binary with the implementation hidden behind pimpls.) Why should I be forced to link with code I don't use, or rely on virtual functions, if I only need one of the above? -Thorsten

Hi Thorsten, On 20 May 2010 11:01, Thorsten Ottosen <nesotto@cs.aau.dk> wrote:
Artyom wrote:
If so, can you please give me any rationale for what the difference is (from the application's point of view) between:
- FastCGI over TCP/IP
- FastCGI over Unix domain sockets
- SCGI over TCP/IP
- SCGI over Unix domain sockets
If so, give me one reason why my application should be compiled separately for each of these?
Good luck :-)
(I presume you want the library to include a compiled binary with the implementation hidden behind pimpls.)
Why should I be forced to link with code I don't use or rely on virtual functions, if I only need one of the above?
You shouldn't be. Especially since you are free to implement a PImpl-like interface over this library. Sounds like a good idea for an example. There is a very old one in the SVN repo that shows how to support both FastCGI and CGI (I must update these): https://svn.boost.org/trac/boost/browser/sandbox/SOC/2007/cgi/trunk/libs/cgi... It's a bit ugly which is why it is not included in the documentation. Cheers, Darren

On 20 May 2010 10:58, Artyom <artyomtnk@yahoo.com> wrote:
If so, give me one reason why my application should be compiled separately for each of these?
Good luck :-)
Give me one reason why I should have to link all of those into my binary when I only deploy using one of them. And Thorsten's request is just to make the virtual optional, not to remove it altogether. You can still make it runtime-configurable if you insist.
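A minimal sketch of what "making the virtual optional" could look like (hypothetical names; the compile-time request types stay usable directly, and this layer only exists for users who ask for it):

    #include <string>

    // Optional runtime abstraction...
    struct abstract_request
    {
        virtual ~abstract_request() {}
        virtual std::string get(std::string const& name) = 0;
    };

    // ...layered over whichever compile-time request type the application instantiates.
    template<typename Request>
    class request_adapter : public abstract_request
    {
        Request& req_;
    public:
        explicit request_adapter(Request& req) : req_(req) {}
        std::string get(std::string const& name) { return req_.get[name]; }
    };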

Hi Scott, On 20 May 2010 18:56, Scott McMurray <me22.ca+boost@gmail.com> wrote:
On 20 May 2010 10:58, Artyom <artyomtnk@yahoo.com> wrote:
If so, give me one reason why my application should be compiled separately for each of these?
Good luck :-)
Give me one reason why I should have to link all of those into my binary when I only deploy using one of them.
And Thorsten's request is just to make the virtual optional, not to remove it altogether. You can still make it runtime-configurable if you insist.
I've been tinkering with a dynamic layer above the low-level templated code that would be runtime-configurable when finished. This has been the first suggestion I've had from a few people and it is not an unreasonable request by any means. Two things that have come up so far are:

1. The dynamic interface unsurprisingly lends itself to having a compiled library for the library functions. At this early stage I'm tempted to make "compiled" the default choice - i.e. using the dynamic portion of the library would simply mean including some declaration-only headers and having everything else hidden in a single library. This would mean very fast compiles and small application binaries. I don't see at the moment why the .cpp files couldn't actually be .ipp files that could optionally be included in the headers to prevent the need to link against a CGI library (see the sketch below).

2. The dynamic interface seems to lend itself to a really simple Python binding. I'm not convinced there is any point in going very far down this route as Python is awash with CGI libraries and WSGI seems to be the de-facto standard anyway. Still, Python's FastCGI support is based on the standard FastCGI library so there might be some room for gains.

Thoughts?

Cheers, Darren
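The .ipp idea from point 1 could look something like this (a sketch with hypothetical macro and file names):

    // request_service.hpp
    class request_service
    {
    public:
        void do_something();            // declaration only in the header
    };

    #if defined(CGI_LIBRARY_HEADER_ONLY)
    # include "request_service.ipp"     // header-only build: pull in the definitions
    #endif

    // request_service.ipp
    #if defined(CGI_LIBRARY_HEADER_ONLY)
    # define CGI_LIBRARY_INLINE inline
    #else
    # define CGI_LIBRARY_INLINE
    #endif

    CGI_LIBRARY_INLINE void request_service::do_something()
    {
        // ... implementation; compiled into libcgi when not header-only ...
    }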

1. The dynamic interface unsurprisingly lends itself to having a compiled library for the library functions. At this early stage I'm tempted to make "compiled" the default choice - i.e. using the dynamic portion of the library would simply mean including some declaration-only headers and having everything else hidden in a single library. This would mean very fast compiles and small application binaries.
This is a very good point to start from ;-)
I don't see at the moment why the .cpp files couldn't actually be .ipp files that could optionally be included in the headers to prevent the need to link against a CGI library.
Yes, this is a very good idea.
2. The dynamic interface seems to lend itself to a really simple Python binding. I'm not convinced there is any point in going very far down this route as Python is awash with CGI libraries and WSGI seems to be the de-facto standard anyway. Still, Python's FastCGI support is based on the standard FastCGI library so there might be some room for gains.
Thoughts?
I think that Python programmers who would like to use an asynchronous FastCGI interface would rather have it over the Twisted event loop than over Boost.Asio. Also, the standard FastCGI library is a very good and well-proven one, not to mention the fact that WSGI is available in any Python. So I don't think it would be useful for synchronous operations.

2 file requests:

1) On file upload it would be nice to be notified at the beginning, between processing the header and the multipart, and periodically, let's say every 4K (configurable), in addition to the existing notification once it has been processed. Also, FastCGI supports giving a completed response before the request is even completely processed. Allowing the request and response to be split into separate threads permits the server to give a progress indicator while the file is being uploaded.

2) Frequently, I disallow files from being saved on the server; rather, I would like to process the file while it is being uploaded. A use case would be: I may have a CSV parser that reads from a stream and notifies me by callback upon successfully reading a row, which I would then process.

On 5/18/2010 8:14 PM, Darren Garvey wrote:
Hi,
I have packaged a version of a CGI / FastCGI library* up to sourceforge and would very much appreciate feedback and critique from interested parties. Feedback from this list has been invaluable in the past. You can download the library from:
http://sf.net/projects/cgi/files
The documentation still needs work, which I am doing as I can find the time. It can be found online at:
and a brief feature list can be found from:
The library interface has reached a relatively stable point after a savage bunch of changes recently and I'm quite pleased with how it is shaping up. A brief list of features of the library are:
* Out of the box support for FastCGI and CGI, tested with Apache 2.2 and mod_fcgid on Windows XP, 7 and linux.
* Type-safe access to different request data (eg. get / post / environment / cookies)
* Access to request data in STL-like constructs, with some CGI-specific helper functions.
* Optional support for user-defined, strongly-typed sessions.
* A lightweight utility wrapper for using HTML / XML / etc. templates - dubbed "stencils".
* A traits-based design that allows library users to alter the implementation of parts of the library at compile time.
For reference, the code is also in the Boost sandbox at:
https://svn.boost.org/svn/boost/sandbox/SOC/2007/cgi/trunk
Cheers, Darren
* the one started way back in the GSoC of 2007...

Hi Jarrad, On 19 May 2010 13:48, Jarrad Waterloo <jwaterloo@dynamicquest.com> wrote:
2 file requests: 1) On file upload it would be nice to be notified at the beginning, between processing the header and the multipart, and periodically, let's say every 4K (configurable), in addition to the existing notification once it has been processed.
I'd very much like to support this, but haven't had the time to implement it. Marshall Clow has been working on a complete MIME library - it would be very cool if he got that to support incremental parsing. (*cough*)

Also, FastCGI supports giving a completed response before the request is even completely processed. Allowing the request and response to be split into separate threads permits the server to give a progress indicator while the file is being uploaded.
You can write a response before the entire request is read at the moment, but you can't interrupt the processing of reading / parsing the post data once you have started it.

2) Frequently, I disallow files from being saved on the server; rather, I would like to process the file while it is being uploaded. A use case would be: I may have a CSV parser that reads from a stream and notifies me by callback upon successfully reading a row, which I would then process.
Hmm, I might have to rethink the upload routines. Thanks for the use-case. Cheers, Darren On 5/18/2010 8:14 PM, Darren Garvey wrote:
Hi,
I have packaged a version of a CGI / FastCGI library* up to sourceforge and would very much appreciate feedback and critique from interested parties. Feedback from this list has been invaluable in the past. You can download the library from:
http://sf.net/projects/cgi/files
The documentation still needs work, which I am doing as I can find the time. It can be found online at:
and a brief feature list can be found from:
The library interface has reached a relatively stable point after a savage bunch of changes recently and I'm quite pleased with how it is shaping up. A brief list of features of the library are:
* Out of the box support for FastCGI and CGI, tested with Apache 2.2 and mod_fcgid on Windows XP, 7 and linux.
* Type-safe access to different request data (eg. get / post / environment / cookies)
* Access to request data in STL-like constructs, with some CGI-specific helper functions.
* Optional support for user-defined, strongly-typed sessions.
* A lightweight utility wrapper for using HTML / XML / etc. templates - dubbed "stencils".
* A traits-based design that allows library users to alter the implementation of parts of the library at compile time.
For reference, the code is also in the Boost sandbox at:
https://svn.boost.org/svn/boost/sandbox/SOC/2007/cgi/trunk
Cheers, Darren
* the one started way back in the GSoC of 2007...

I know you said more work needs to be done on documentation. Could you include an example of this? On 5/19/2010 7:30 PM, Darren Garvey wrote:
Also, FastCGI supports giving a completed response before the request is even completely processed. Allowing the request and response to be split into separate threads permits the server to give a progress indicator while the file is being uploaded.
You can write a response before the entire request is read at the moment, but you can't interrupt the processing of reading / parsing the post data once you have started it.

Sure, thanks for the suggestion. (also please don't top-post). Cheers, Darren On 20 May 2010 13:51, Jarrad Waterloo <jwaterloo@dynamicquest.com> wrote:
I know you said more work needs to be done on documentation. Could you include an example of this?
On 5/19/2010 7:30 PM, Darren Garvey wrote:
Also, FastCGI supports giving a completed response before the request is even completely processed. Allowing the request and response to be split into separate threads permits the server to give a progress indicator while the file is being uploaded.
You can write a response before the entire request is read at the moment, but you can't interrupt the processing of reading / parsing the post data once you have started it.

Darren Garvey wrote:
I have packaged a version of a CGI / FastCGI library* up to sourceforge and would very much appreciate feedback and critique from interested parties.
Hi Darren,

I've just had a very quick look through the docs; some comments:

- I think you must have quite a few useful algorithms hidden as implementation details (e.g. URL %-encoding, base64, multipart MIME etc) that would be more useful if they were exposed. I.e. people could use them even if they didn't want to use the rest of the library.

- Saving uploaded files on the server's filesystem doesn't seem the right approach. Ideally, the handler would start before all of the uploaded file had been received, e.g. if I upload a 100 MB video, the server would check my username and the MIME type for the content and maybe look for a magic number at the start of the data stream so that it could send an error response without waiting for all of the data.

- As I've said before, my preferred method when I don't want a process-per-request CGI is to use a stand-alone HTTP daemon and Apache's mod_proxy. It would be great if you could support this. I believe that asio has an HTTP server example; could that be coerced into doing this for you?

Regards, Phil.

Hi Phil, Thanks for the comments. On 19 May 2010 15:44, Phil Endecott <spam_from_boost_dev@chezphil.org>wrote:
- I think you must have quite a few useful algorithms hidden as implementation details (e.g. URL %-encoding, base64, multipart MIME etc) that would be more useful if they were exposed. I.e. people could use them even if they didn't want to use the rest of the library.
I don't think these really belong in a "CGI" library; it doesn't feel right. boost::url::encode() and boost::url::decode() make more sense to me than boost::cgi::url_encode()... One of my goals is to expose more of the internals of the library to allow users to, for example, have fine-grained control over I/O. As you say, there's no reason you should have to use it all.
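For reference, the kind of routine being discussed is small and self-contained; a sketch of percent-decoding (not the library's actual implementation):

    #include <cctype>
    #include <cstdlib>
    #include <string>

    std::string url_decode(std::string const& in)
    {
        std::string out;
        out.reserve(in.size());
        for (std::string::size_type i = 0; i < in.size(); ++i)
        {
            if (in[i] == '%' && i + 2 < in.size()
                && std::isxdigit(static_cast<unsigned char>(in[i+1]))
                && std::isxdigit(static_cast<unsigned char>(in[i+2])))
            {
                char hex[] = { in[i+1], in[i+2], '\0' };
                out += static_cast<char>(std::strtol(hex, 0, 16));
                i += 2;
            }
            else if (in[i] == '+')
                out += ' ';            // '+' means space in form data
            else
                out += in[i];
        }
        return out;
    }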
- Saving uploaded files on the server's filesystem doesn't seem the right approach. Ideally, the handler would start before all of the uploaded file had been received, e.g. if I upload a 100 MB video, the server would check my username and the MIME type for the content and maybe look for a magic number at the start of the data stream so that it could send an error response without waiting for all of the data.
You can do some of what you want like so:

    fcgi::request request;
    // accept the request.
    request.load(fcgi::parse_env | parse_cookies);
    // check the username / password
    if (ok)
      request.load(fcgi::parse_post);
    // all MIME data has been read and parsed now... drat.

I'll try and find some time to add finer-grained control in "parse_post", so MIME parts can be interrogated and dealt with individually.
- As I've said before, my preferred method when I don't want a process-per-request CGI is to use a stand-alone HTTP daemon and Apache's mod_proxy. It would be great if you could support this. I believe that asio has an HTTP server example; could that be coerced into doing this for you?
This would certainly be nice. I've not used mod_proxy before; I don't suppose you could send me an example configuration file to get it working? Those working on a HTTP networking library (cpp-netlib) have developed the asio example further, which is intriguing too. Cheers, Darren

Darren Garvey wrote:
I've not used mod_proxy before; I don't suppose you could send me an example configuration file to get it working?
It's quite simple, e.g.

    <Location /foo>
      ProxyPass http://localhost:1234 max=20 smax=0 ttl=60
      ProxyPassReverse http://localhost:1234
    </Location>

See the Apache docs for more info. Phil.

Thanks Phil, I'll take a look when I can find the time. It'll give a good test of the flexibility of the framework. Cheers, Darren On 20 May 2010 10:51, Phil Endecott <spam_from_boost_dev@chezphil.org>wrote:
Darren Garvey wrote:
I've not used mod_proxy before; I don't suppose you could send me an example configuration file to get it working?
It's quite simple, e.g.
    <Location /foo>
      ProxyPass http://localhost:1234 max=20 smax=0 ttl=60
      ProxyPassReverse http://localhost:1234
    </Location>
See the Apache docs for more info.
Phil.
participants (7)
- Artyom
- Darren Garvey
- Darren Garvey
- Jarrad Waterloo
- Phil Endecott
- Scott McMurray
- Thorsten Ottosen