
Greetings,

I've been working over the past 7+ months on a little library that uses Boost and ASIO to implement HTTP functionality. Originally I called this "libpion," then renamed it to the "Pion Network Library." I'm using it as a piece of a larger open source project for complex event processing, which has taken on the overall "Pion" brand (hence the change).

http://pion.atomiclabs.com/pion/net

The library includes pretty comprehensive support for HTTP versions 1.0 and 1.1, including persistent connections, pipelining, chunked encodings, etc. It supports client-side operations as well as an extensive server-side implementation, and can use either blocking or asynchronous sockets (through use of the Boost.ASIO library).

The original purpose was to create a Boost/C++ library for building lightweight HTTP interfaces for applications. Although it can certainly be used to serve files, it is not meant to replace or compete with any full-featured web server (like Apache or lighttpd). One of its main features/uses is to bind HTTP resources to code.

The library is now fairly stable and feature-complete, and I have a few other developers helping out with it. I've been thinking more lately about preparing this to submit for formal review as a Boost library. However, there are a few obstacles that I know of:

1) This is my first Boost library, and I made the mistake of using CamelCase and other non-Boost styles starting it out. So, some renaming and cleanup work would obviously be required. I don't imagine this would take too much time, though.

2) Some of the functionality overlaps with other libraries that have not yet been included or accepted into Boost (most notably, I think Boost.Extension would overlap with Pion's handling of web services: dynamic plug-ins bound to resources). I'd be happy (and would actually prefer) to swap out this code with Boost.Extension, but I do not believe it has been reviewed or accepted yet either. I could also prepare a version with the plug-in functionality removed, but since it's being used in other projects, this would create a fork requiring extra maintenance energy, etc.

3) I know that Dean Michael Berris has formed a group of developers working on a more comprehensive network protocol library for Boost called cpp-netlib. HTTP is one of the protocols they are working on. I've offered and would still be quite happy to combine efforts somehow, but that project still seems to have a long way to go, and a few people have mentioned that it may be preferable to have an independent library focused on HTTP.

So.. I decided to throw all this out there and see what people think. Should Boost have its own HTTP library, or should it be part of a more comprehensive network protocol library? Would it be better to have something available sooner in Boost that works and is reliable, and try to resolve the overlap over time as cpp-netlib matures? Or would it be better to wait and try to merge my library (or at least its functionality) into cpp-netlib?

Thanks, -Mike
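For readers unfamiliar with the idea of binding HTTP resources to code, here is a minimal sketch of what such a binding can look like. It is not Pion's actual API: the names request, response, handler and resource_table are hypothetical, and the lookup is a plain exact match on the resource path.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical types for illustration only; not taken from Pion.
struct request  { std::string method; std::string resource; };
struct response { int status = 200; std::string body; };

using handler = std::function<void(const request&, response&)>;

class resource_table {
public:
    // Bind a resource path such as "/hello" to a piece of code.
    void bind(const std::string& resource, handler h) {
        handlers_[resource] = std::move(h);
    }

    // Run the handler bound to the requested resource, or answer 404.
    response dispatch(const request& req) const {
        response res;
        auto it = handlers_.find(req.resource);
        if (it != handlers_.end()) {
            it->second(req, res);
        } else {
            res.status = 404;
            res.body = "Not Found";
        }
        return res;
    }

private:
    std::map<std::string, handler> handlers_;
};
```

A caller would bind a lambda to "/hello" and then pass each parsed request to dispatch(). A real server would also need prefix matching, method checks and error handling, all of which are left out here.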

Michael Dickey wrote:
Nice :-)
Yes, it should have an HTTP library -- it would be nice if there were other protocols, but not essential.
I think we need a first protocol library to lay the groundwork. Reviews and interactions may lead to a set of standards for handling these sorts of libraries. If http lib is closer to ready then I'd say take it thru the process as that might influence the design of cpp-netlib. Of course, you can take into account what the netlib folks are doing as well. I'd hope that somehow we could avoid redoing the http work... Jeff

Hi Jeff! On Dec 10, 2007 10:32 AM, Jeff Garland <jeff@crystalclearsoftware.com> wrote:
I have the same feeling, but then other protocols are becoming increasingly important as the web matures -- XMPP is lurking to be the next generation IM/Messaging protocol, (E)SMTP is not going away for Email anytime soon, and FTP is still very popular. Maybe having a torrent client library might not be essential, though if there's enough interest then it may just be the next generation fail-safe P2P storage protocol -- or I might be dreaming too much. ;)
I agree. If Mike already has an HTTP client library we can retro-fit to work with the cpp-netlib basic_message<> implementation, then I think we don't have to re-invent the wheel as far as an HTTP client implementation goes -- and cpp-netlib 1.0 might just be around the corner once we document it properly and get it tested up to Boost standards. Looking forward to hearing everyone's thoughts about this. Have a good day everyone! -- Dean Michael C. Berris Software Engineer, Friendster, Inc. [http://cplusplus-soup.blogspot.com/] [mikhailberis@gmail.com] [+63 928 7291459] [+1 408 4049523]

Dean Michael Berris wrote:
I didn't mean to 'dis' the importance of the other protocols. What I meant to say is that if we try to bring an entire suite as one library, in one review, it will a) take a long time, and b) be hard to manage. So I'd rather see them come as smaller contributions -- perhaps within a shared framework boost::net or whatever.
Looks to me like Mike is focused on the server side...so maybe there's not much overlap anyway. Jeff

On Dec 10, 2007 10:43 AM, Jeff Garland <jeff@crystalclearsoftware.com> wrote:
Ah, yes. Now that makes sense to me. As for doing it piecemeal (HTTP first, then SMTP next perhaps), maybe we'd get more mileage. :)
Too bad... That doesn't change though, cpp-netlib will be focused on the client side. Insights from the libpion implementation may help though, especially with HTTP 1.0/1.1 implementation details. -- Dean Michael C. Berris

I found pion the other day and used the scheduler and connection idea (stripped down, non-camel case) for an irc client lib (in the works). You can find the code here: http://skotty.coffeebuzzin.com/browser/trunk

I've been looking for some help and guidance on making a "boost.irc" similar to this http (and boost.net) idea. Currently the message parser relies on boost.spirit (it implements the EBNF almost verbatim from the RFC) and I only have client components implemented. Server will be next.

I also have some FTP stuff: http://sourceforge.net/projects/bftp but haven't had time to make it complete and generic. It's just a basic client again, no server components.

I'll try to keep up to date on this. I'm quite busy, but maybe some of this code can be of some use.

Cheers, Chris

On Dec 9, 2007 10:09 PM, Dean Michael Berris <mikhailberis@gmail.com> wrote:

Dean,

I think we could combine cpp-netlib's messaging classes with pion's HTTP parsing classes to establish a basis for HTTP processing and provide client-side support for HTTP/1.0 and 1.1.

There are two possible approaches to this that I see. One is to merge a "server" space into the Network library:

boost::net::basic_message<>
boost::net::http::request : public basic_message<>
boost::net::http::response : public basic_message<>
boost::net::server::tcp
boost::net::server::http : public tcp

Another would be to create an independent "server" library that uses the Network library classes:

boost::server::tcp
boost::server::http : public tcp

They both seem like sensible approaches to me. In either case, I think that HTTP would be a good starting focus for both, to get a working foundation incorporated into boost. We could always submit additional protocols independently for review as their implementations are finished. What do you think?

Take care, -Mike

On Dec 9, 2007, at 7:09 PM, Dean Michael Berris wrote:
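The two namespace layouts from Mike's message can be written out as compilable declarations to make the difference concrete. This is only a sketch: the top-level namespace is called sketch rather than boost, and every member (headers, body, port, last_request) is an assumption added for illustration, not something taken from cpp-netlib or Pion.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>

// "sketch" stands in for "boost"; all members below are illustrative.
namespace sketch {
namespace net {

template <class String = std::string>
struct basic_message {
    std::multimap<String, String> headers;
    String body;
};

namespace http {
struct request  : basic_message<> { std::string method, uri; };
struct response : basic_message<> { int status = 200; };
} // namespace http

// Option 1: the servers live inside the network library.
namespace server {
struct tcp  { unsigned short port = 0; };
struct http : tcp {};
} // namespace server

} // namespace net

// Option 2: an independent server library that uses the net classes.
namespace server {
struct tcp  { unsigned short port = 0; };
struct http : tcp {
    net::http::request last_request;  // reuses the message types from net
};
} // namespace server
} // namespace sketch
```

In both options the message types stay in net; the only question is whether the server classes share that namespace or depend on it from outside.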

Mike, Dean, On 10/12/2007, Michael Dickey <mike@mikedickey.com> wrote:
FWIW, I'm familiar with both libpion and cpp-netlib and I would agree that a crossover here would be of benefit to both groups.
I completely agree with this too. I think there are other individuals who want to work on other protocols using cpp-netlib, but at different speeds and more independently. I think this is the best way to make progress. Glyn

On Mon, 10 Dec 2007 13:28:42 -0700, Michael Dickey <mike@mikedickey.com> wrote: [snip]
boost::net::server::http : public tcp
(assuming that "What do you think?" was directed to all of us, not just Dean :), so below is my $0.02 CAD)

I like the idea of basic_message<> (which could be used for SMTP and gods know what else as well), but what about UDP servers or, generally speaking, multi-transport protocols? Although HTTP sits on top of TCP, there is the RTSP protocol, which claims to be derived from HTTP/1.1 in most parts (well, at least the RFC has lots of backrefs to the HTTP spec) but allows the use of connectionless transport (aka UDP). Moreover, RTSP treats clients as something that can receive asynchronous stream change notifications (if allowed).

I am not arguing that your design will not cover 95% of useful cases, but I think protocol parsing and the transport layer might need more separation. This way, if a brave person decides to implement RTSP (for example) as derived from HTTP, he/she will not be limited to TCP transport.

Thanks, Andrey

On Dec 10, 2007, at 1:00 PM, Andrey Tcherepanov wrote:
Of course, otherwise I wouldn't be filling your inbox with all these messages =)
In the current Pion code, the parsing & transport are handled independently. I've never heard of HTTP being used on top of something other than TCP. But I think we could probably accommodate this by slightly altering the tcp_server / http_server class hierarchy. Take care, -Mike

Hi Andrey, On Dec 11, 2007 5:00 AM, Andrey Tcherepanov <moyt63c02@sneakemail.com> wrote:
basic_message<> is intended to be divorced from the actual transport being used to put its contents out on the wire. The idea of basic_message is primarily as a common structure -- or the foundation -- that all lower-level operations (network transport, serialization/deserialization, etc.) and higher-level operations (protocol-specific transformations, chaining, stream operations, etc.) will deal with.

If I only had the time and talent, I'd come up with a visual representation of how basic_message<> fits into all this network-related code with a simple layered design. For now, this is how I envision it:

           Adapters/Transformations Layer
----------------------------------------------------
|         basic_message<> and derivatives          |
----------------------------------------------------
                  Transport Layer

For a client protocol 'stack', there will be protocol-specific adapters and transformations above the basic_message<> abstraction and its protocol-specific derivatives. So the idea would then look like this (for HTTP):

           HTTP Transformations/Adaptations Layer
------------------------------------------------------------
|               http_request / http_response               |  both deriving from basic_message<>
------------------------------------------------------------
                Generic/HTTP Transport Layer

The question might be what the transformations/adaptations layer would contain. There is no straight answer, but what comes to mind are the following:

- Stream Adapters (create an HTTP Session "Stream", cookie support, ETags, etc.)
- Iterator Adapters (turn a message or set of message(s) into iterator-accessible objects for STL friendliness; further examples can be imagined if there was more time to do it)
- Protocol-Protocol transformations (example, HTTP<-->SMTP, or HTTP<-->XMPP, etc.)

And the possibilities are endless. ;-)
It will be divorced -- the concept of messages has nothing to do with the actual transport and processing that happens in a specific protocol stack. I hope this helps! :-) -- Dean Michael C. Berris
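The layering Dean describes (transformations on top, messages in the middle, transport underneath) can be illustrated with a small sketch. None of this is cpp-netlib code: the members of basic_message, the to_wire() function, in_memory_transport and send_request() are all assumptions made up for the example. The point is only that the message type never mentions a socket, and the transport never mentions HTTP.

```cpp
#include <cassert>
#include <map>
#include <string>

// Middle layer: a message type that knows nothing about sockets.
template <class String = std::string>
struct basic_message {
    std::map<String, String> headers;
    String body;
};

struct http_request : basic_message<> {
    std::string method = "GET";
    std::string uri = "/";
};

// Upper layer: an HTTP-specific transformation from message to wire format.
inline std::string to_wire(const http_request& req) {
    std::string out = req.method + " " + req.uri + " HTTP/1.1\r\n";
    for (const auto& h : req.headers) {
        out += h.first + ": " + h.second + "\r\n";
    }
    out += "\r\n" + req.body;
    return out;
}

// Lower layer: any type with a write(std::string) member can act as transport.
struct in_memory_transport {
    std::string written;
    void write(const std::string& bytes) { written += bytes; }
};

template <class Transport>
void send_request(Transport& transport, const http_request& req) {
    transport.write(to_wire(req));
}
```

Because send_request() is a template over the transport, a TCP or UDP wrapper exposing the same write() member could replace in_memory_transport without touching the message or the transformation, which is the kind of separation Andrey asked about for RTSP.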

Hi Mike, On Dec 11, 2007 4:28 AM, Michael Dickey <mike@mikedickey.com> wrote:
Great!
I like the first one, where server:: is inside boost::net:: -- it at least makes more sense that way, and if boost::server would be more convenient, it can be pulled in appropriately with 'using net::server'. Would you mind committing the changes into the cpp-netlib subversion project first, then perhaps later we go ahead and put it into the boost sandbox? -- Dean Michael C. Berris

Hi Mike! On Dec 10, 2007 9:29 AM, Michael Dickey <mike@mikedickey.com> wrote:
You're right about working on a comprehensive network protocol library, but I think there are a few differences between what libpion is addressing and what cpp-netlib is trying to address. Let me try to point these out as I understand them (please correct me if I'm wrong):

1. cpp-netlib primarily aims to implement a cross-platform, header-only, standards-compliant networking client library. It aims to make using higher-level application protocols easier than it currently is on most platforms. The HTTP implementation is intended to primarily be a client-side implementation, to make HTTP 1.0/1.1, (E)SMTP, among other protocols available to client code with minimal intrusion, it being a header-only library.

2. There was an initial intent to develop a server-side implementation for HTTP in cpp-netlib, which has been dropped because there really is no single best way to implement an HTTP server: writing your own HTTP server usually means that you have different trade-offs between performance and reliability not met by those HTTP server implementations that are already out there.

3. Libpion aims to be in C++ what the Twisted framework is in Python: a way to easily create HTTP-aware services using asynchronous programming and non-blocking IO.
If you already have an HTTP 1.0 / 1.1 client implementation that is licensed under the BSL (I see that libpion is already under the BSL) and can be re-worked to use the basic_message implementation that's already in cpp-netlib, then I think that would be a good start. We can discuss this on either list (cpp-netlib-devel or this one), and it would be something worth noting.

If you look at the goals for the 1.0 release of cpp-netlib, it's to come up with a simple HTTP 1.0 client interface that's header-only and easy to use, and to come up with the base on which most of the other network protocol implementations will build. The message concept and implementation is already available and is currently being extended to support a wide array of policies/platforms (wide character support, chunking/linking, custom allocators, custom string implementations, etc.).

Yes, it's going to be a lot of work, and I still need all the help I can get -- with the day job taking most of the time and attention from me, it's going to be hard to put in time to code all the stuff that's still in my head wanting to get out someday. I plan on putting in a lot more time during the holidays here in the Philippines (that's between the 25th and the 1st of January) to implement more of the HTTP operations and the unit tests that need to cover the implementations that need to be written.

So yes, it's years of man-hours of work, and there's plenty for everyone who's interested. :D I hope this helps! -- Dean Michael C. Berris

Dean Michael Berris wrote:
Yes, absolutely. I think that the most useful approach would be to provide building-blocks for things like HTTP header and URI parsing [I have some Spirit code for this that I'd be prepared to contribute], compression [Sebastian Redl seems to have gzip in his IOChains proposal, but can it be re-used in an HTTP server without bringing lots of baggage?], encryption, authentication, ETags (essentially hashes of the content) and conditional fetches -- all of this done with attention to security. This really needs to be done in a way that will fit into thread-per-connection, thread pool, select()-based and other server designs, and perhaps also as a CGI program or Apache server module. How far can the server implementation be decoupled from the content-specific stuff, and what interfaces should there be between them? Michael Dickey wrote:
would it be better to wait and try to merge my library (or at least its functionality) into cpp-netlib?
Far from merging your library into something else, I encourage you to see how much you can break it up into smaller chunks and to make it compatible with, yet not dependent on, other libraries. Remember that there's also some GSoC CGI code pending. Regards, Phil.

On 10/12/2007, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
Indeed there is! There are still a few months' work left in it, unfortunately... I think libpion takes a different approach than the CGI library (there has been some recent discussion on the asio-users list about this). I've been keeping in mind the cpp-netlib project too, so hopefully some form of protocol-independent basic_message<> class can be integrated into any CGI library that gets into Boost. It'd be ideal if a network building-blocks/utility library was accepted into Boost, but at the same time there seems to be a lot of interest in domain-specific libraries like HTTP/IRC/CGI/etc. Hopefully these can all be built on shared generic network utilities (that exist just above Boost.Asio). As for which should come first, I have no idea. Either way, there is a lot of overlap between CGI and HTTP, so there is a good chance quite a lot of code can be shared between any two implementations. I hope so, anyway. Regards, Darren

On Dec 10, 2007, at 6:39 AM, Phil Endecott wrote:
It's definitely moving in that direction. For example, you can use the client-side part of the library without touching any server-side stuff, and you can even parse HTTP messages without any networking code. I agree with the approach of breaking pieces out and pushing them into a lower-level space like cpp-netlib over time. I guess a big question is, does _server-side_ functionality belong in Boost? I think for most cases, you're going to use an external packaged server to solve a problem. However, I think HTTP is a unique case in that it is so widely used and for such a wide variety of applications, and is simple enough that it often is embedded within applications.
Remember that there's also some GSoC CGI code pending.
Where is that project located? Take care, -Mike

Michael Dickey wrote:
I might have agreed with you a few years ago but recently I've come to the conclusion that it's better to just implement an HTTP server. In the past I have written CGI programs, which have the problem of starting one process per request, and Apache modules, where I have used Boost.Interprocess to store inter-request data. Just writing an HTTP server is simpler and better, and you can get Apache features (compression, encryption, authentication, etc. etc.) by using its mod_proxy to forward requests to your separate server. Phil.

On Dec 10, 2007, at 2:12 PM, Phil Endecott wrote:
I agree; that's why I wrote the library: to make it really easy to embed your own HTTP server =) I think what I was trying to say is just that for most servers, taking an embedded/roll-your-own path is hard to justify. But HTTP is becoming somewhat of a special case, in that it's more often easier to justify embedding a simple server in your application, versus integration with an outside server using CGI or some other API. Take care, -Mike

On Dec 9, 2007, at 6:33 PM, Dean Michael Berris wrote:
Sounds right to me. I didn't realize cpp-netlib had decided to drop server-side support. I think it makes sense though, since I agree that the two are very different problems. I started work on libpion with the server-side approach only in mind, and that is definitely its focus and strength. However, in the 0.4.0 (most recent) release, we did add support for client-side HTTP as well, mainly because we wanted something better to use for our unit testing, and I realized that once you have a reusable HTTP parser (plus ASIO), it was trivial to add client-side support.
I agree that this sounds like a good place to start. In particular, with the latest release we really cleaned up and broke out the HTTP parsing code (which I should say was largely based on the code in Christopher's ASIO http_server examples) so that it can be used for either requests or responses, and can even be used without networking code (our intention is to use it also with an HTTP sniffer). Your basic_message design is certainly cleaner and more extensible than our own. We basically just have HTTPRequest and HTTPResponse classes that extend a base HTTPMessage class (with headers, version, etc). We should be able to rework the parser so that it uses your classes instead, and be able to share that code between the two projects.
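As an illustration of a parser that serves both requests and responses and needs no networking code, here is a minimal sketch that parses only the HTTP start line from a string. It is not Pion's parser: the lower-case type names and parse_start_line() are hypothetical, and the input is assumed to be a single line with the trailing CRLF already removed.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical base class holding what requests and responses share.
struct http_message {
    std::string version;  // e.g. "HTTP/1.1"
};
struct http_request  : http_message { std::string method, resource; };
struct http_response : http_message { int status = 0; std::string reason; };

// Parse a request line such as "GET /path HTTP/1.1".
// Returns false if a field is missing or the version lacks the "HTTP/" prefix.
inline bool parse_start_line(const std::string& line, http_request& out) {
    std::istringstream in(line);
    return static_cast<bool>(in >> out.method >> out.resource >> out.version)
        && out.version.compare(0, 5, "HTTP/") == 0;
}

// Parse a status line such as "HTTP/1.1 404 Not Found".
// The reason phrase may contain spaces, so it is read to the end of the line.
inline bool parse_start_line(const std::string& line, http_response& out) {
    std::istringstream in(line);
    if (!(in >> out.version >> out.status)) return false;
    if (out.version.compare(0, 5, "HTTP/") != 0) return false;
    std::getline(in >> std::ws, out.reason);
    return true;
}
```

Since the functions only take a string, the same code could be fed from a socket, a capture file or a unit test, which matches the sniffer and unit-testing uses mentioned above. Header and body parsing are left out.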
Unfortunately, I'm really pressed for time as well and will probably not be able to do much over the next couple months. Might be able to find some time though to get the ball rolling. Take care, -Mike

Hi Mike! On Dec 11, 2007 3:54 AM, Michael Dickey <mike@mikedickey.com> wrote:
Yeah, it was some sort of consensus that pretty much said there were too many ways to do server-side HTTP, so it's not going to be really practical to try to cover the different tradeoffs between performance, reliability, and scalability in a library. Plus, with the aim of doing it 'header-only' style, the amount of effort to be put into it doesn't make much sense (yet). I don't mind working with libpion myself, but maybe not in the projects I'm currently involved in. :)
I'll try to take a look at the client code and see what insights cpp-netlib may gain especially with regards to the more involved HTTP client specific code.
Sweet! Sounds like a good plan to me. We can keep the discussion here. I plan on moving the code we currently have in the cpp-netlib project into the sandbox, so that more people who are interested can work with the code we currently have. Although it's not much, the foundation of most of the network-related stuff that I plan on building on in the future is already available -- the basic_message<> template has a very simple and extensible interface, which keeps algorithms built around it simple and extensible.
We can ask for help from people who are also interested in actually bringing this to fruition (and have the time and spare brain cycles to be able to work on it in a volunteer effort). Does moving the development to the boost sandbox make more sense? I don't mind doing this, and since I'm also not personally able to maximize the resources available from sourceforge, maybe having it in the sandbox would open it up to more people and more contributions? Insights would be most appreciated. -- Dean Michael C. Berris
participants (8)
- Andrey Tcherepanov
- Chris Fairles
- Darren Garvey
- Dean Michael Berris
- Glyn Matthews
- Jeff Garland
- Michael Dickey
- Phil Endecott