Re: Sockets: proposal for a library design

Hi Boris,

Thanks for all the effort and getting something posted! At first blush, my thoughts are as follows:

I would suggest that a proposed Boost.Net library should focus on networking in a more abstract manner, and not be merely a socket encapsulation. IMHO, sockets are a terrible API and should be hidden as fast as possible. A socket wrapper class is appropriate, but I would not go to any trouble to change the socket API. Simply wrap the resources and provide portable access to the underlying system. The real end-user part of the Net library should not talk about sockets (unless one provides a method like get_native to return the socket handle, and that method is defined as non-portable).

I think all desired I/O models should be available at run-time (see below). In my own work in this area I have seen that this approach does not greatly affect the implementation. In other words, there is virtually nothing to be gained by making async I/O optional at compile-time.

As others have also suggested, I would agree that the net facilities not be tied up into I/O streams. It would be better to provide the network facilities and then layer on I/O streams for those that want it.

There are really only a few abstract entities to worry about: Url, Address, Stream, Datagram, Acceptor. In my work (not related to Boost), I've also created a Network class to serve as the abstract factory and contain a thread pool for doing the async I/O.

I believe your diagrams use "Client" and "Server". I think these terms are a bit too generic. I prefer the term "Acceptor".

Usage examples:

    shared_ptr<Network> net (new SomeSpecificNetwork(some_arguments));
    Url some_url = "http://www.boost.org";
    shared_ptr<Address> addr = net->new_address(some_url);
    shared_ptr<Stream> str = addr->new_stream();
    str->connect(); // now it's connected (blocking)

    shared_ptr<Address> localhost = net->new_local_address("ntp://");
    shared_ptr<Datagram> dg = localhost->new_datagram(); // on the ntp port

    localhost = net->new_local_address("ftp://");
    shared_ptr<Acceptor> ac = localhost->new_acceptor(); // "ftp" port
    str = ac->accept(); // blocking accept

I've omitted any nonblocking/async methods for brevity.

IMHO, blocking, nonblocking and free-threaded-async-w/callback are the only models worth providing. These models can be ported anywhere and can be implemented via whatever mechanism is best for the given platform. A select() based version can be made to work just about anywhere with minimal #ifdef logic, but may be less than ideal when the system supports poll() or NT I/O completion ports.

One thing I think is essential when using a blocking model is cancellation. For example, I have a reset() method on Acceptor, Address, Datagram and Stream that can be called at any time to cause blocking methods to terminate with a cancel exception (obviously this must be from another thread). Ironically, these reset() methods meant I could not use blocking socket calls to implement my blocking methods - recv() cannot be canceled once it blocks (at least not portably)!

The big issue in nonblocking I/O is how to notify the application that things have changed and now would be a good time to try more I/O. I like the free-threaded callback for this as it fits closely with async I/O and that way the nature of the I/O mechanism can still be encapsulated.

Throughout the above I use the term "free threaded callback". By that I mean that the user supplied callback will be invoked directly by a worker thread.
The reason for this is that it is the only way to approach the efficiency of a hand-rolled select/poll loop. When select() returns, the fd_set would be examined and steps taken to prepare for the next call to select(). Since multiple operations could complete, whatever state machines exist are pumped inside a processing loop between select() calls. When the abstraction comes in and hides the loop behind callbacks, if the callbacks are made directly, we add only the overhead of making the callbacks. If, on the other hand, all callbacks were marshalled out, the loop would not have the information needed to formulate the next call to select() and would likely have to be interrupted repeatedly to get to the proper state.

The dangers of this approach should be obvious (deadlock to say the least). The user code is now forced to understand the execution model in order to properly code a callback. What I found is that a short list of allowed operations is sufficient. Any async I/O methods are OK, as are nonblocking methods, etc.

After that (long winded) explanation, what I have also seen is that very few applications need the performance that free-threaded callbacks provide and would opt out of the complexity if they could. My solution to this was an auto-marshalling callback. I'm not sure what facilities Boost provides for queued function<> calls, but the idea is to provide a callback that queues a callback. The pseudo code for me was something like this:

    str->async_read(buf, n, channel.bind_async_call(this, &This::foo));

The "channel" was my entity that could enqueue calls. The "(this, &This::foo)" would actually be a function<> object or something. The return from bind_async_call() is a function<> object that performs the actual call to channel.async_call() passing the user callback.

Whew! Slightly more than $0.02, I know. :)

Best,
Don

--- Boris <boris@gtemail.net> wrote:
I updated the Wiki pages for a net library at
http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?BoostNet.
After collecting all the information I found at the Wiki pages for the socket and multiplexing library I try to explain what a net library should basically look like. The proposal includes various I/O models and is based on I/O streams.
As usual any comments are appreciated, Boris
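
The auto-marshalling callback described in Don's message above can be illustrated with a small sketch. This is only one possible reading of the idea, not Don's actual code: the Channel class, its run_pending() method and the demo function are hypothetical names, std::function stands in for function<>, and the wrapper takes a ready-made function object rather than the (this, &This::foo) pair from the pseudo code.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical stand-in for the "channel" entity: a thread-safe queue of
// deferred calls that the application drains on a thread of its choosing.
class Channel {
public:
    using Call = std::function<void()>;
    using ReadCallback = std::function<void(std::size_t)>;

    // Enqueue a call; safe to invoke from any worker thread.
    void async_call(Call call) {
        std::lock_guard<std::mutex> lock(mutex_);
        calls_.push(std::move(call));
    }

    // Wrap a user callback so that invoking the wrapper only enqueues the
    // real call instead of running it on the calling (worker) thread.
    ReadCallback bind_async_call(ReadCallback user) {
        return [this, user](std::size_t n) {
            async_call([user, n] { user(n); });
        };
    }

    // Run everything queued so far on the calling thread; returns the count.
    std::size_t run_pending() {
        std::queue<Call> ready;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(ready, calls_);
        }
        const std::size_t count = ready.size();
        while (!ready.empty()) {
            ready.front()();
            ready.pop();
        }
        return count;
    }

private:
    std::mutex mutex_;
    std::queue<Call> calls_;
};

// Usage sketch: the wrapper is invoked where a worker thread would report a
// completed read; the user callback runs only when the channel is drained.
std::size_t demo_marshalled_read() {
    Channel channel;
    std::size_t bytes_seen = 0;
    Channel::ReadCallback on_read =
        channel.bind_async_call([&bytes_seen](std::size_t n) { bytes_seen = n; });

    on_read(128);                      // in real use: called by a worker thread
    if (bytes_seen != 0) return 0;     // the user callback must not have run yet
    channel.run_pending();             // the owning thread dispatches the call
    return bytes_seen;
}
```

The point of the indirection is that the user callback no longer needs to know about the worker thread's execution model; the cost is one queue operation per completion.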

Don G wrote:
[...] I would suggest that a proposed Boost.Net library should focus on networking in a more abstract manner, and not be merely a socket encapsulation. IMHO, sockets are a terrible API and should be hidden as fast as possible. A socket wrapper class is appropriate, but I would not go to any trouble to change the socket API. Simply wrap the resources and provide portable access to the underlying system. The real end-user part of the Net library should not talk about sockets (unless one provides a method like get_native to return the socket handle, and that method is defined as non-portable).
This reminds me of the layers requirement from http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl?BoostSocket/S.... I agree that it would be nice to have some easy-to-use high-level classes for concepts that are used in network applications again and again.
I think all desired I/O models should be available at run-time (see below). In my own work in this area I have seen that this approach does not greatly affect the implementation. In other words, there is virtually nothing to be gained by making async I/O optional at compile-time.
I agree again. There shouldn't be any of the four I/O models omitted. We will see as we proceed if supporting various I/O models is easy to implement and to use at runtime (especially switching between I/O models).
As others have also suggested, I would agree that the net facilities not be tied up into I/O streams. It would be better to provide the network facilities and then layer on I/O streams for those that want it.
Several people dislike the idea of a network library built on I/O streams. What are the reasons for this? Is this a request for a more low-level access to socket functions like read(), write(), send() and recv()? The only reason I see why I/O streams should not be used is the lack of support for an asynchronous I/O model. The network library would have to add functions for asynchronous support until one day I/O streams support an asynchronous I/O model by default (btw, anyone working on this?).
There are really only a few abstract entities to worry about: Url, Address, Stream, Datagram, Acceptor. In my work (not related to Boost), I've also created a Network class to serve as the abstract factory and contain a thread pool for doing the async I/O. I believe your diagrams use "Client" and "Server". I think these terms are a bit too generic. I prefer the term "Acceptor".
Thanks, I renamed client and server to connector and acceptor.
Usage examples:
Thanks again! When the socket class hierarchy is more stable I hope to move upwards to high-level classes representing concepts from the network world you used in your examples.
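
The factory chain in Don's usage examples (a Network creates an Address, an Address creates a Stream) could be expressed with abstract interfaces along the following lines. This is a rough sketch under several assumptions: std::string stands in for Url, connected() is a hypothetical query added only so the demo can observe something, and the Fake* classes are toy implementations that perform no real I/O.

```cpp
#include <memory>
#include <string>
#include <utility>

// Abstract entities, named after the ones listed in the message.
struct Stream {
    virtual ~Stream() = default;
    virtual void connect() = 0;            // blocking connect
    virtual bool connected() const = 0;    // hypothetical query, for the demo
};

struct Address {
    virtual ~Address() = default;
    virtual std::shared_ptr<Stream> new_stream() = 0;
};

struct Network {
    virtual ~Network() = default;
    virtual std::shared_ptr<Address> new_address(const std::string& url) = 0;
};

// Toy implementation that performs no real I/O; it only shows how a
// concrete network could hide its resources behind the interfaces.
class FakeStream : public Stream {
public:
    explicit FakeStream(std::string url) : url_(std::move(url)) {}
    void connect() override { connected_ = !url_.empty(); }
    bool connected() const override { return connected_; }
private:
    std::string url_;
    bool connected_ = false;
};

class FakeAddress : public Address {
public:
    explicit FakeAddress(std::string url) : url_(std::move(url)) {}
    std::shared_ptr<Stream> new_stream() override {
        return std::make_shared<FakeStream>(url_);
    }
private:
    std::string url_;
};

class FakeNetwork : public Network {
public:
    std::shared_ptr<Address> new_address(const std::string& url) override {
        return std::make_shared<FakeAddress>(url);
    }
};

// Mirrors the first usage example: the caller sees only the interfaces.
bool demo_factory_chain() {
    std::shared_ptr<Network> net = std::make_shared<FakeNetwork>();
    std::shared_ptr<Address> addr = net->new_address("http://www.boost.org");
    std::shared_ptr<Stream> str = addr->new_stream();
    str->connect();
    return str->connected();
}
```

Nothing in the calling code mentions sockets, which is the property the abstract-factory arrangement is meant to provide.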
[...] One thing I think is essential when using a blocking model is cancellation. For example, I have a reset() method on Acceptor, Address, Datagram and Stream that can be called at any time to cause blocking methods to terminate with a cancel exception (obviously this must be from another thread). Ironically, these reset() methods meant I could not use blocking socket calls to implement my blocking methods - recv() cannot be canceled once it blocks (at least not portably)!
I added this to the Wiki but I am not sure if a network library should support it. If you have two threads in your application and one thread is halted because of a blocking call it might have been better to use a nonblocking/asynchronous I/O model?
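
For illustration, one common way to get a cancellable blocking wait on POSIX systems is to select() on the socket together with a pipe that another thread can write to (the "self-pipe" technique). The sketch below is an assumption about how a reset()-style cancellation could be built, not a description of Don's implementation; CancelToken, wait_readable and the demo function are hypothetical names, and the approach does not carry over unchanged to Windows, where select() accepts only sockets.

```cpp
#include <stdexcept>
#include <sys/select.h>
#include <unistd.h>

// Hypothetical helper: lets another thread interrupt a blocked wait.
class CancelToken {
public:
    CancelToken() {
        if (::pipe(fds_) != 0) throw std::runtime_error("pipe() failed");
    }
    ~CancelToken() {
        ::close(fds_[0]);
        ::close(fds_[1]);
    }
    CancelToken(const CancelToken&) = delete;
    CancelToken& operator=(const CancelToken&) = delete;

    // Wake any waiter; plays the role of reset() in the text.
    bool cancel() {
        const char byte = 1;
        return ::write(fds_[1], &byte, 1) == 1;
    }
    int read_fd() const { return fds_[0]; }

private:
    int fds_[2];
};

enum class WaitResult { Readable, Cancelled, Error };

// Block until `fd` is readable or the token is cancelled. A signal that
// interrupts select() is reported as Error in this simplified sketch.
WaitResult wait_readable(int fd, const CancelToken& token) {
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    FD_SET(token.read_fd(), &readable);
    const int max_fd = fd > token.read_fd() ? fd : token.read_fd();

    if (::select(max_fd + 1, &readable, nullptr, nullptr, nullptr) < 0)
        return WaitResult::Error;
    if (FD_ISSET(token.read_fd(), &readable))
        return WaitResult::Cancelled;      // cancellation wins over data
    return WaitResult::Readable;
}

// Usage sketch: a pipe stands in for a socket, and cancel() is called up
// front where a second thread would normally call it during the wait.
bool demo_cancel() {
    int data[2];
    if (::pipe(data) != 0) return false;
    CancelToken token;
    token.cancel();
    const WaitResult result = wait_readable(data[0], token);
    ::close(data[0]);
    ::close(data[1]);
    return result == WaitResult::Cancelled;
}
```

A blocking read built this way would call wait_readable() first and only then recv(), throwing the cancel exception when the result is Cancelled.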
[...] Whew! Slightly more than $0.02, I know. :)
Thank you very much - that's much better than reading nothing at all. ;) Boris
[...]