
Hi Phil and James,
On 31 January 2010 12:08, Phil Endecott <spam_from_boost_dev@chezphil.org> wrote:
James Mansion wrote:
Phil Endecott wrote:
I have an HTTP request parser using Spirit, if you're interested. It is a bit grotty as I wrote it as my first exercise using Spirit - but it does work. http://svn.chezphil.org/libpbe/trunk/src/parse_http_request.cc .
Out of interest, is the parser suitable to use as a tutorial on how to translate from RFC specs?
You're welcome to use it in that way if you wish. Most of it was translated directly from the BNF in the RFCs.
It also makes the point that almost no-one actually does implement these things by taking the BNF, because if you do that it won't work. There is at least one bug in the HTTP BNF that is documented on an errata web page somewhere (the URL is in my source) but that has never justified a new RFC after 11 years...
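As a rough illustration of what "translated directly from the BNF" can look like, here is a minimal hand-written sketch in plain C++17 (not Spirit, and not taken from the parser linked above). It follows the rule `HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT` from RFC 2616 section 3.1, with one small function per grammar element. The names `HttpVersion`, `parse_literal`, `parse_digits` and `parse_http_version` are hypothetical, and the sketch assumes case-sensitive literal matching and does no overflow checking.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical result type: the two numeric parts of the version.
struct HttpVersion {
    unsigned major_number;
    unsigned minor_number;
};

// Quoted literal in the BNF: match the text exactly at pos.
// Assumption: compared case-sensitively, which is a simplification.
bool parse_literal(std::string_view in, std::size_t& pos, std::string_view lit) {
    if (in.substr(pos, lit.size()) != lit) return false;
    pos += lit.size();
    return true;
}

// 1*DIGIT: one or more decimal digits. Leaves pos unchanged on failure.
// No overflow check; very long digit runs wrap around.
bool parse_digits(std::string_view in, std::size_t& pos, unsigned& out) {
    const std::size_t start = pos;
    unsigned value = 0;
    while (pos < in.size() &&
           std::isdigit(static_cast<unsigned char>(in[pos]))) {
        value = value * 10 + static_cast<unsigned>(in[pos] - '0');
        ++pos;
    }
    if (pos == start) return false;
    out = value;
    return true;
}

// HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
// The sequence in the rule becomes a chain of && in the code.
// Returns std::nullopt unless the whole input matches the rule.
std::optional<HttpVersion> parse_http_version(std::string_view in) {
    std::size_t pos = 0;
    HttpVersion v{};
    if (parse_literal(in, pos, "HTTP") &&
        parse_literal(in, pos, "/") &&
        parse_digits(in, pos, v.major_number) &&
        parse_literal(in, pos, ".") &&
        parse_digits(in, pos, v.minor_number) &&
        pos == in.size()) {
        return v;
    }
    return std::nullopt;
}
```

Calling `parse_http_version("HTTP/1.1")` yields a value with `major_number == 1` and `minor_number == 1`, while `parse_http_version("HTTP/1.")` yields `std::nullopt`. A parser this strict rejects anything the rule does not describe, which is where a purely BNF-driven implementation tends to run into real traffic.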
It would be nice to build a library of reference parsers that handle assorted protocols. Email is a very interesting one, since the standard (and not-standard-but-in-use) variations on sender, date, time etc. are quite tricky, and I guess there is some commonality given that the RFCs tend to reference each other. (The sort of thing I'm thinking of is the comment parts embeddable in dates, the newlines, etc.)
If you try to parse email headers per the BNF in the RFCs you will fail on a huge proportion of real mail. Somewhere I have a list of all the date formats that I have seen; it is ridiculous. I think this is a fundamental problem with text-based protocols: the "spec" is not the real spec; the real spec is the deployed implementations (see e.g. HTML parsers). Anyway, that's off-topic....
I would say that this is on-topic, as it is an issue that we face in implementing cpp-netlib. Currently, the request parser in the HTTP server is taken from the Boost.Asio HTTP example, but I'm certain that this can be improved.
Phil: thank you for your comments. I've added them to our issue tracker: http://github.com/cpp-netlib/cpp-netlib/issues
If anyone would like to comment on these or add other issues, I'd encourage you to do so on our mailing list.
Thanks,
Glyn