
Hi Peter,
I like the proposal. It shows a certain polish that only comes with real-world experience with a design. ;-)
Thanks!
I have the following comments for now, based on a not-very-deep look at the document and the header:
Thanks for all the feedback.
1. network root class
In my opinion, this class is not needed and should be removed. It complicates the design, forces a global variable on the user, and is not library-friendly. For example, if lib1 and lib2 use the network, the user can't pass an address created by lib1 to lib2. The global state modeled by the network class should be an implementation detail.
In many of my uses, I have multiple network-derived types in use at the same time (serial, HTTP tunnel, TCP/IP). I agree that it can be an issue, so perhaps there should be a way to associate code with a particular kind of network and not need the network object. Having said that, though, I think this assumed context should only be a default to be used when the calling code has no network object in hand. All libraries should accept and pass along the context. Otherwise, one could not reuse protocols over different networks in the same app. While this can be seen as an issue, most of the time I find that I need to pass something along already: an address object or a stream, for example. That is sufficient to get back to the network object. It is also OK to create multiple network instances of the same type, though it should not be done with impunity<g>.
2. address
The address should be a standalone class and not tied to a particular network.
I think different networks (in my uses anyway) have different expectations (data size, textual form, etc.). This lead me to the conclusion that an address is an abstract idea. Its primary role is a factory for other objects related to the address it describes, which is again not something a concrete type would be able to handle (at least w/o indirection hidden inside).
It should be a simple entity and not a hierarchical container. Logical to physical resolution should take a source logical address and produce a container of physical addresses.
This container is often needed internally and, in my experience, seldom needed by the user. It is helpful for the logical address object to know the corresponding physical addresses and not have to repeat the resolution process. I see no fundamental problem exposing this collection as std::vector<address_ptr> (or whatever) and not providing an interface to encapsulate it, but I was trying to keep things purely abstract.
The address should not be created with an URL, that is, a TCP connection to www.example.com should be represented by the address tcp:/www.example.com:80, leaving http://www.example.com reserved for the stream obtained by the HTTP GET / query, not for the raw protocol stream. A connection to the serial port should probably be represented by "com1:/38400,n,8,1", and so on.
This scheme morphing seems to fit with the idea that network instances are behind the scenes, but creates complexity for the user as well since these transformations must be done at that level. For serial use, it is not sufficient to say "com1" (that was a concept I had early on as well). That would address an entire network, not a specific "port"-like concept. I think the "U" in URL is really fitting for this model. There is a lot of thought behind URL's and they have the right level of generality. IMHO, we do not need another textual form for addresses.
I think that the UDP broadcast address should be represented by udp:/0.0.0.0, and the TCP loopback should be tcp:/127.0.0.1, as usual. But I may be wrong.
All of these are forcing TCP/IP (IPv4 even<g>) concepts to the user. The central idea of the abstraction I proposed is that "datagram" and "stream" are the behavior-bundles to which one should write protocols, not TCP/UDP or sockets. The notion of loopback and broadcast can carry over from one network type to another, but these textual forms do not. That said, this will work given an IPv4 network object in my proposal: strm = net->new_address("127.0.0.1:80")->new_stream(); It is just not portable to IPv6 or any other network type. Hence, one is better off saying: strm = net->new_loopback_address()->new_stream();
3. Minor stylistic issues
p->get_X() should be p->X()
Personally, I go back and forth on this<g>. I suppose that std::basic_string<>::length is the right form to follow. I currently like the get_X/set_X symmetry, but will change back in the proposal.
p->new_Y() should probably be p->create_Y()
Is that a preferred verb for Abstract Factory? I should reread that chapter... :) I chose "new" because it advertised exactly what must happen: a new instance is made. Not that I am wedded to it, if there is ample precedent for "create".
although it might be better to move to net::create_Y( p ) or even to Y( p ) and reference semantics under the hood.
The approach I took (based on shared_ptr <g>) allowed for clear ownership, copy and layering semantics. In other words: net::stream_ptr stream = ...; stream = ssl::new_client_stream(stream); At this point, the new SSL stream impl drives the original stream which it can rely upon to exist until it is done with it. From the user's point of view, the SSL stream is now _the_ stream.
The destructors should be protected.
Agreed.
4. MT-centric model
The general idea of an asynchronous callback dispatcher is sound, but the threading decision should be left to the user. The library should provide
void net::poll( timeout tm );
that delivers any pending callbacks from the context of the thread that called poll, and
void net::async_poll( error_callback );
that acts "as if" it executes net::poll( infinite ) repeatedly in one or more background threads. (error_callback will be called on errors; the synchronous variant will probably throw an exception instead.)
The threading choices are certainly the most complex. Having said that, these are mostly implementation choices, not interface choices though one might need to extend the interface to support ideas like the one you propose. The complexity really shows up when one wants to write protocol libraries. For each choice presented to the network user, the protocol library probably needs to provide similar support. For example, an HTTP library would need to provide sync and async under my proposal. Adding more styles of notification to the network layer probably makes this job more difficult. Not to pass final judgment here; just a consideration. In my work on this, I attacked thread affinity outside the network layer for the most part. In particular, what you called net::poll() I had as a different object that handled a queue of boost::function<>-like objects. That way, the main thread was not forced to be only a network thread. I do see room for this approach in cases where the main thread wants to be a network-only thread and the queue approach feels too heavy. I think that this would fit in my proposal as a different concrete network-derived class as long as the abstraction is the same: sync and async (with other rules TBD).
Whether the library uses multiple threads under the hood, or how many, is implementation defined, even if async_poll is not called; this depends on which model delivers the best performance on the specific platform.
While I agree to some extent, the user must know the context in which callbacks will be made. Or at least, they need to know a set of rules to follow that will keep their code working for all implementations. Forcing a single-threaded view of the network on the user imposes its own penalty. At the top-most level, the programmer should have some (hopefully small<g>) number of concrete network objects to pick amongst. Once chosen, all mid-level libraries need to know that their expectations of behavior are still going to be met. At the bottom, we do what seems best to implement the abstraction on a given platform.
5. read_later, etc
I think that read_later should be dropped. async_read is enough, in my opinion.
Personally I have little use for non-blocking style, but there is a lot of desire for it. And it does have its merits for buffering data.
The functionality of write_later should be achievable with a call to async_write with size 0; write_later can be dropped, too.
I debated this myself, but decided to apply the axiom "don't parameterize behavior" and ended up with the x_later approach. Under the covers, they will be very much like async_x with no buffers on which to operate.
async_read should not take a buffer; instead, the callback should receive a pointer to a buffer managed by the library that is guaranteed to be valid for the duration of the callback. (Not by default, at least.)
async_write (by default) should not assume that the passed buffer stays valid after async_write returns.
Rationale: buffer management is a pain with multiple callbacks active ;-)
Others I believe have commented on this, but my take is that this level should not make assumptions about the ultimate destination of data. This approach, would in many cases, force data to be copied "one more time". I agree that this can be a painful process, but perhaps this is where higher level abstractions come into play, or possibly (as you suggest) optional additional features for this abstraction. Maybe passing (NULL,0) for I/O requests, or some sort of set_buffer() mechanism. Don't know.
That's it for now. Please don't take these as criticisms and wherever you see "should" read "in my humble opinion, the design probably could be enhanced by".
Please understand my responses in the same way. :) I (try<g>) to never take criticism of designs personally, and I do feel that all criticism, especially constructively phrased as yours, is only beneficial to everyone.
Thanks for reading.
Nope - thank you for taking the time read my proposal and then respond with so many ideas! Best regards, Don __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com