Re: [network] An RFC - updated

Hi Boris,
on what part of the network library are you working? It seems like you are in level 1 without I/O streams - whatever this part could be called. :)
If I understand the levels right: 1 - Socket API wrappers (RAII, type-safety mostly; minimum overhead) 2 - Connector, Acceptor, Connection 3 - iostream integration Then, I guess I would place the ideas I proposed at level 1.5 :)
If I look at the extremes - wrapper classes for the C API and an acceptor-connector pattern with service handlers like in ACE - where is your library? You are somewhere in between if I am right: I don't see any socket class but see classes reminding me of network functions.
The crux of what I am proposing is a layer that encapsulates sockets behind an abstraction, which would place it between sockets and ACE-like designs or whatever else. Many, but not all, implementations of that abstraction would be based on sockets, select, epoll, kqueue, etc.. At my work, where this design was born, we have a direct serial implementation and an HTTP-tunnel implementation. We have also implemented the stream abstraction for NT named pipes and SSL (using OpenSSL). By virtue of the abstract nature of things, we can use SSL over named pipe for example. Or SSL over serial line (you get the idea<g>).
Do you think we need another layer?
Personally, I think the layer I proposed is vital to higher level designs that can be reused (on different platforms or over different network implementations). Not only designs, but also protocol libraries. For example, (if it weren't for proxy servers<g>) the code needed by an HTTP client to support HTTPS would be something on this order of complexity:

    void http_get (net::network_ptr net, const net::url & url)
    {
        net::stream_ptr stream;
        stream = net->new_address(url)->new_stream();
        stream->connect();
        // handle SSL:
        if (url.get_scheme() == "https")
            stream = ssl::new_client_stream(stream);
        // now using SSL - or not!
    }

Lower level ideas (like socket and fd_set) imply an implementation that may be completely inappropriate. A bidirectional stream is an idea; it is not a socket. So, yes, I do think we need this layer or one much like it. And it must be much lower level than <iostream>. The abstraction I proposed is a direct rendering of general network concepts (influenced by TCP/IP, which might mean it needs some further generality). It imposes only the overhead necessary to adapt sockets to C++ ways of thinking.
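As a rough illustration of the layering that http_get relies on, here is a minimal sketch of an abstract stream with one stand-in transport and one stand-in "SSL" layer that wraps any other stream. All names (stream, fake_transport, fake_ssl_stream, open_stream) are hypothetical and are not taken from the proposal; no real I/O or encryption happens.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal stream abstraction; names are illustrative only.
struct stream {
    virtual ~stream() = default;
    virtual void connect() = 0;
    virtual bool connected() const = 0;
    virtual std::string describe() const = 0;  // e.g. "ssl over tcp"
};
using stream_ptr = std::shared_ptr<stream>;

// Stand-in for a transport that would be backed by a socket, pipe or serial line.
class fake_transport : public stream {
public:
    explicit fake_transport(std::string kind) : kind_(std::move(kind)) {}
    void connect() override { connected_ = true; }
    bool connected() const override { return connected_; }
    std::string describe() const override { return kind_; }
private:
    std::string kind_;
    bool connected_ = false;
};

// A layered stream: consumes any stream and presents the same interface.
// A real SSL layer would also perform its handshake; this one does not.
class fake_ssl_stream : public stream {
public:
    explicit fake_ssl_stream(stream_ptr lower) : lower_(std::move(lower)) {}
    void connect() override { if (!lower_->connected()) lower_->connect(); }
    bool connected() const override { return lower_->connected(); }
    std::string describe() const override { return "ssl over " + lower_->describe(); }
private:
    stream_ptr lower_;
};

// Connects the transport, then wraps it only when the scheme asks for it.
stream_ptr open_stream(const std::string& scheme, stream_ptr transport) {
    transport->connect();
    if (scheme == "https")
        return std::make_shared<fake_ssl_stream>(std::move(transport));
    return transport;
}
```

Calling open_stream("https", ...) with a transport described as "named pipe" would yield a stream describing itself as "ssl over named pipe", which is the kind of combination mentioned earlier in this message.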
Shall we drop the acceptor-connector pattern?
I have acceptor in my proposal, but connector didn't seem (to me) to be an entire class/object. I have an address object as the Abstract Factory for a stream object, which is similar to a connector concept except that the connection process begins with a method on the stream object. In other words, the stream is created in an unconnected state and it is that stream that we want to connect. This felt like a much better fit to the problem at hand (again, to me<g>).
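A minimal sketch of that idea, assuming a hypothetical address type that hands out streams in an unconnected state and a stream that owns its whole lifecycle. The names and the state set are assumptions made for illustration; the completion of the connect is simulated by a direct call.

```cpp
#include <string>

// Hypothetical stream whose lifecycle includes the not-yet-connected state.
class stream {
public:
    enum class state { unconnected, connecting, connected, closed };

    state current() const { return state_; }

    // Begins an (imagined) asynchronous connect; only valid once.
    bool start_connect() {
        if (state_ != state::unconnected) return false;
        state_ = state::connecting;
        return true;
    }

    // Would be called by the I/O layer when the connect completes.
    void on_connect_complete() {
        if (state_ == state::connecting) state_ = state::connected;
    }

    // One entry point for abort, whatever phase the stream is in.
    void cancel() { state_ = state::closed; }

private:
    state state_ = state::unconnected;
};

// Hypothetical address acting as the factory for unconnected streams.
struct address {
    std::string text;
    stream new_stream() const { return stream{}; }
};
```

In this sketch, a cancel issued while the stream is still connecting wins over a late completion, because the completion only acts on the connecting state.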
Is your library which belongs to level 1 a neighbour of an acceptor-connector based package with different goals?
My goal is to iron out a behavior contract that satisfies various possible and desired implementations, so that a library user can know what are safe and portable practices. One could say that the sockets API does this, but there are two central issues: one cannot create a new type of socket (to layer protocols); not all forms of I/O completion are portable. So, underneath my proposal are many implementation choices. Likewise, on top of it can be abstract ideas about services or request/response patterns and the like. The layer is conceptually complete (or should be made so by adding new abstractions). The only reasons I see for going around the layer I propose would be: use of platform-specific features or performance. I believe it is too early for performance conclusions of what I am proposing. That is a job best left for measurement, which is why I wanted to proceed with things at SourceForge. I can say that based on my previous implementation of this design, I know it can perform very well. Perhaps not as well as some direct-sockets approaches. Certainly not as well as feeding the network driver directly from the file system driver's disk cache, but life is full of trade-offs. :) I hope I have answered your questions. Best regards, Don

Don G wrote:
The crux of what I am proposing is a layer that encapsulates sockets behind an abstraction, which would place it between sockets and ACE-like designs or whatever else.
Or an even higher-level abstraction, since it could be built on top of lower-layer ACE/socket APIs ;).
Many, but not all, implementations of that abstraction would be based on sockets, select, epoll, kqueue, etc.. At my work, where this design was born, we have a direct serial implementation and an HTTP-tunnel implementation. We have also implemented the stream abstraction for NT named pipes and SSL (using OpenSSL). By virtue of the abstract nature of things, we can use SSL over named pipe for example. Or SSL over serial line (you get the idea<g>).
The abstraction also imposes a lot of choices on threading and memory management, such as creating threads behind your back, which could be a problem. E.g. calling a COM object on a callback thread in Windows needs CoInitialize to be called on that thread, and so forth. So just creating threads behind the user's back and doing callbacks to user code is a problem. Of course you could and should provide a thread init hook or something to that effect.
I have acceptor in my proposal, but connector didn't seem (to me) to be an entire class/object. I have an address object as the Abstract Factory for a stream object, which is similar to a connector concept except that the connection process begins with a method on the stream object. In other words, the stream is created in an unconnected state and it is that stream that we want to connect. This felt like a much better fit to the problem at hand (again, to me<g>).
I see no problem with this. The only thing a connector gives you is the possibility to do some handshaking or whatever before handing the stream to the handler, and this handshaking wouldn't be relevant to the protocol handler.
The layer is conceptually complete (or should be made so by adding new abstractions). The only reasons I see for going around the layer I propose would be: use of platform-specific features or performance.
How would you write a single threaded reactive server or client? /Michel

I like the proposal. It shows a certain polish that only comes with real-world experience with a design. ;-)

I have the following comments for now, based on a not-very-deep look at the document and the header:

1. network root class

In my opinion, this class is not needed and should be removed. It complicates the design, forces a global variable on the user, and is not library-friendly. For example, if lib1 and lib2 use the network, the user can't pass an address created by lib1 to lib2. The global state modeled by the network class should be an implementation detail.

2. address

The address should be a standalone class and not tied to a particular network. It should be a simple entity and not a hierarchical container. Logical to physical resolution should take a source logical address and produce a container of physical addresses.

The address should not be created with an URL, that is, a TCP connection to www.example.com should be represented by the address tcp:/www.example.com:80, leaving http://www.example.com reserved for the stream obtained by the HTTP GET / query, not for the raw protocol stream. A connection to the serial port should probably be represented by "com1:/38400,n,8,1", and so on.

I think that the UDP broadcast address should be represented by udp:/0.0.0.0, and the TCP loopback should be tcp:/127.0.0.1, as usual. But I may be wrong.

3. Minor stylistic issues

p->get_X() should be p->X(), p->new_Y() should probably be p->create_Y(), although it might be better to move to net::create_Y( p ) or even to Y( p ) and reference semantics under the hood. The destructors should be protected.

4. MT-centric model

The general idea of an asynchronous callback dispatcher is sound, but the threading decision should be left to the user. The library should provide

void net::poll( timeout tm );

that delivers any pending callbacks from the context of the thread that called poll, and

void net::async_poll( error_callback );

that acts "as if" it executes net::poll( infinite ) repeatedly in one or more background threads. (error_callback will be called on errors; the synchronous variant will probably throw an exception instead.)

Whether the library uses multiple threads under the hood, or how many, is implementation defined, even if async_poll is not called; this depends on which model delivers the best performance on the specific platform.

5. read_later, etc

I think that read_later should be dropped. async_read is enough, in my opinion. The functionality of write_later should be achievable with a call to async_write with size 0; write_later can be dropped, too.

async_read should not take a buffer; instead, the callback should receive a pointer to a buffer managed by the library that is guaranteed to be valid for the duration of the callback. (Not by default, at least.)

async_write (by default) should not assume that the passed buffer stays valid after async_write returns.

Rationale: buffer management is a pain with multiple callbacks active ;-)

That's it for now. Please don't take these as criticisms and wherever you see "should" read "in my humble opinion, the design probably could be enhanced by".

Thanks for reading.
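The address notation suggested in point 2 can be split mechanically. The following is only a sketch under the assumption that the scheme is everything before the first ":/" and the rest is passed on uninterpreted; parsed_address and parse_address are hypothetical names, not part of the proposal.

```cpp
#include <string>

// Hypothetical result type: the scheme and the uninterpreted remainder.
struct parsed_address {
    std::string scheme;
    std::string target;
};

// Splits text such as "tcp:/www.example.com:80" at the first ":/".
// Returns false, leaving `out` untouched, when there is no scheme.
bool parse_address(const std::string& text, parsed_address& out) {
    const std::string separator = ":/";
    const std::string::size_type pos = text.find(separator);
    if (pos == std::string::npos || pos == 0) return false;
    out.scheme = text.substr(0, pos);
    out.target = text.substr(pos + separator.size());
    return true;
}
```

With this split, "com1:/38400,n,8,1" yields the scheme "com1" and the target "38400,n,8,1"; interpreting the target is left to whichever implementation handles that scheme.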

Peter Dimov wrote:
4. MT-centric model
The general idea of an asynchronous callback dispatcher is sound, but the threading decision should be left to the user. The library should provide
void net::poll( timeout tm );
that delivers any pending callbacks from the context of the thread that called poll, and
void net::async_poll( error_callback );
that acts "as if" it executes net::poll( infinite ) repeatedly in one or more background threads. (error_callback will be called on errors; the synchronous variant will probably throw an exception instead.)
This means we have one global dispatcher for the library (completion queue in asynch case). Isn't there a need to have several dispatchers, to be able to group connections to several dispatchers/pollers? /Michel

Michel André wrote:
Peter Dimov wrote:
4. MT-centric model
The general idea of an asynchronous callback dispatcher is sound, but the threading decision should be left to the user. The library should provide
void net::poll( timeout tm );
that delivers any pending callbacks from the context of the thread that called poll, and
void net::async_poll( error_callback );
that acts "as if" it executes net::poll( infinite ) repeatedly in one or more background threads. (error_callback will be called on errors; the synchronous variant will probably throw an exception instead.)
This means we have one global dispatcher for the library (completion queue in asynch case). Isn't there a need to have several dispatchers to be able to group connections to several dispatchers/pollers.
You can probably guess that I think that the answer is no. ;-)

Peter Dimov wrote:
This means we have one global dispatcher for the library (completion queue in asynch case). Isn't there a need to have several dispatchers to be able to group connections to several dispatchers/pollers.
You can probably guess that I think that the answer is no. ;-)
Yep ;) Do you have specific reasons? Like interface simplicity, implementation complexity? How do you stop the asynch polling threads started with net::async_poll, or isn't that needed? Why should delivery of notifications from net::poll only be on operations started from that thread? This would divide threads into i/o and non-i/o threads, and I won't be able to post async ops from non-i/o threads. Or am I missing some parts of the puzzle ;) /Michel

Michel André wrote:
Peter Dimov wrote:
This means we have one global dispatcher for the library (completion queue in asynch case). Isn't there a need to have several dispatchers to be able to group connections to several dispatchers/pollers.
You can probably guess that I think that the answer is no. ;-)
Yep ;)
Do you have specific reasons? Like interface simplicity, implementation complexity?
Simplicity, mainly. If multiple dispatchers were present, how would you use them to your advantage? The typical application loop would still consist of polling all the dispatchers. The intent of poll is to preserve the conceptual model of the current library, but allow the client to be unaware of threading issues. Single threaded clients typically have a loop that is a natural place for the net::poll call.
How do you stop the asynch polling threads started with net::async_poll or isn't that needed?
The library already supplies several 'cancel' functions, so something similar could be used here as well (although I've never had a need to stop a background dispatcher so I'm not sure whether this is an essential feature.)
Why should delivery of notifications from net::poll only be on operations started from that thread? This would divide threads to i/o non i/o and I won't be able to post async op from non i/o threads. Or am I missing some parts of the puzzle ;)
I see that what I wrote is ambiguous. net::poll executes the pending callbacks in the notification queue, regardless of the thread that scheduled them. The actual callback invocations are made from the thread that invoked net::poll. The idea is to eliminate the need for locking and reentrancy in single-threaded clients, not to filter callbacks based on the thread that scheduled them.
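A minimal sketch of that behaviour, assuming a hypothetical notification_queue class (this is not the library's implementation): callbacks may be posted from any thread, and poll runs whatever is pending in the thread that calls it, without holding the internal lock during the calls.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical notification queue; names are illustrative only.
class notification_queue {
public:
    // May be called from any thread (e.g. an internal I/O thread).
    void post(std::function<void()> callback) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(callback));
    }

    // Runs every callback that was pending at the time of the call, in the
    // caller's thread. Returns the number of callbacks delivered.
    std::size_t poll() {
        std::deque<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);
        }
        for (auto& callback : batch) callback();  // no lock held here
        return batch.size();
    }

private:
    std::mutex mutex_;
    std::deque<std::function<void()>> pending_;
};
```

A single-threaded client would call poll from its main loop; callbacks posted while a batch is being delivered are picked up by the next call.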

Peter Dimov wrote:
Michel André wrote:
Do you have specific reasons? Like interface simplicity, implementation complexity?
Simplicity, mainly. If multiple dispatchers were present, how would you use them to your advantage? The typical application loop would still consist of polling all the dispatchers.
Or having one thread polling one dispatcher and another thread polling the other, e.g. to handle connections with different priorities, for one reason or another.
The intent of poll is to preserve the conceptual model of the current library, but allow the client to be unaware of threading issues. Single threaded clients typically have a loop that is a natural place for the net::poll call.
I think the idea of a poll or a dispatcher::dispatch is a great idea and gives the opportunity to write single threaded reactive servers or clients for that matter.
How do you stop the asynch polling threads started with net::async_poll or isn't that needed?
The library already supplies several 'cancel' functions, so something similar could be used here as well (although I've never had a need to stop a background dispatcher so I'm not sure whether this is an essential feature.)
Maybe this isn't strictly needed, but I imagine some issues could arise during shutdown if the order isn't defined.
Why should delivery of notifications from net::poll only be on operations started from that thread? This would divide threads to i/o non i/o and I won't be able to post async op from non i/o threads. Or am I missing some parts of the puzzle ;)
I see that what I wrote is ambiguous. net::poll executes the pending callbacks in the notification queue, regardless of the thread that scheduled them. The actual callback invocations are made from the thread that invoked net::poll.
Ok. So basically net::poll should be thread safe and I could use a thread pool dispatching io events by executing net::poll in a loop. Do you envision some way to control the threading policy for net::async_poll? Such as the number of worker threads or whatnot, or should it be at the library's discretion?
The idea is to eliminate the need for locking and reentrancy in single-threaded clients, not to filter callbacks based on the thread that scheduled them.
Ok. And that's a fair goal that's needed. /Michel

Michel André wrote:
Do you envision some way to control the threading policy for net::async_poll? Such as the number of worker threads or whatnot, or should it be at the library's discretion?
I think that the library should be in a good position to know which method delivers the best performance on the current platform, so I'm inclined to trust it to determine the number of worker threads automatically. If I want finer control I'd probably drop down to the native facilities anyway, because my application would be non-portable.

Peter Dimov wrote:
5. read_later, etc
...
async_read should not take a buffer; instead, the callback should receive a pointer to a buffer managed by the library that is guaranteed to be valid for the duration of the callback. (Not by default, at least.)
If we ever extend this library to support connections on X.25 libraries, at least one X.25 adapter vendor requires that transfer buffers be allocated by special API calls beforehand, presumably because they are mapped to memory on the adapter card. I guess it wouldn't be too difficult to allow either model, i.e. use whatever the caller supplied, or allocate one automatically. Or place the decision in a policy template class or something like that. Mats
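One way the policy idea could look, as a sketch only: the reader is parameterised on a buffer policy, with one policy that allocates automatically and one that uses memory supplied by the caller (for instance memory obtained from a vendor API). The names library_allocated, caller_supplied, reader and fake_receive are assumptions made for this example, and the "receive" just fills the buffer.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Policy 1: the library allocates and owns the transfer buffer.
struct library_allocated {
    explicit library_allocated(std::size_t size) : storage_(size) {}
    char* data() { return storage_.data(); }
    std::size_t size() const { return storage_.size(); }
private:
    std::vector<char> storage_;
};

// Policy 2: the caller supplies the memory and keeps ownership of it.
struct caller_supplied {
    caller_supplied(char* data, std::size_t size) : data_(data), size_(size) {}
    char* data() { return data_; }
    std::size_t size() const { return size_; }
private:
    char* data_;
    std::size_t size_;
};

// Hypothetical reader parameterised on where its buffer comes from.
template <typename BufferPolicy>
class reader {
public:
    explicit reader(BufferPolicy buffer) : buffer_(std::move(buffer)) {}

    // Pretends to receive up to `n` bytes by filling the buffer with `fill`.
    // Returns the number of bytes written, capped at the buffer size.
    std::size_t fake_receive(std::size_t n, char fill) {
        const std::size_t count = n < buffer_.size() ? n : buffer_.size();
        for (std::size_t i = 0; i < count; ++i) buffer_.data()[i] = fill;
        return count;
    }

    char* data() { return buffer_.data(); }

private:
    BufferPolicy buffer_;
};
```

The choice is made at compile time here; a run-time choice would need a common buffer interface instead of a template parameter.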

Don G wrote: Hi Don,
[...] So, yes, I do think we need this layer or one much like it. And it must be much lower level than <iostream>. The abstraction I proposed
I think socket streams are no competitor anyway. An ACE-like library and socket streams can be both built on top of level 0 so they both live in level 1. While socket streams offer blocking and maybe non-blocking I/O using the familiar stream interface the other packages in level 1 should have clear goals, too. The package in http://www.highscore.de/boost/net/packages.png I called "ace" was introduced after my discussion with Michel. We talked about acceptor-connector pattern and service handlers, and I think these concepts can help developers using the network without any idea of sockets and how they work. That said I am trying to find out what your library offers to understand if it belongs to the same package which is called "ace" right now. Or your library has different goals which means we need another package in level 1 or even worse a new level. In the end the network library can and should consist of different packages to meet different requirements but it should be very clear to library users which requirements are met by each package.
[...] I have acceptor in my proposal, but connector didn't seem (to me) to be an entire class/object. I have an address object as the Abstract Factory for a stream object, which is similar to a connector concept except that the connection process begins with a method on the stream object. In other words, the stream is created in an unconnected state and it is that stream that we want to connect. This felt like a much better fit to the problem at hand (again, to me<g>).
As far as I understand the acceptor-connector pattern the idea is that a connector behaves like an acceptor. This means both of them are factories producing connected streams. Most network developers (who know Berkeley sockets) will probably create an acceptor based on accept() and a connector based on connect(). The acceptor will work as a factory but the connector can probably be used only once to create one connected stream. This is what I did in http://www.highscore.de/boost/net/basic.png and comes close to the C API. If we think of inexperienced developers who don't know any sockets they will probably be happy if the connector behaves like the acceptor as they wouldn't understand the difference. Regarding the acceptor and connector your library seems to be close to what I made up in the class hierarchy and therefore would belong to level 0.
[...] My goal is to iron out a behavior contract that satisfies various possible and desired implementations, so that a library user can know what are safe and portable practices. One could say that the sockets API does this, but there are two central issues: one cannot create a new type of socket (to layer protocols); not all forms of I/O completion are portable.
I am surprised to read this! The socket API - the so-called level 0 - must and will be portable. It should be possible to derive classes from socket and create new types. And all four types of I/O must be supported in level 0, too. What kind of problems do you see here?
[...] The layer is conceptually complete (or should be made so by adding new abstractions). The only reasons I see for going around the layer I propose would be: use of platform-specific features or performance.
If the network library should support platform-specific features we should indeed add another package to http://www.highscore.de/boost/net/packages.png. This package would be optional. However it would be nice if classes from this package would collaborate somehow with classes from the other non-optional packages of the network library. The non-optional packages would provide a default implementation and library users could use optimized classes from the optional package depending on the platform they are using.
[...] I hope I have answered your questions.
Yes, thank you very much! Sorry to have some new questions now. :) Boris

Hi Boris, I hope all is going well.
That said I am trying to find out what your library offers to understand if it belongs to the same package which is called "ace" right now. Or your library has different goals which means we need another package in level 1 or even worse a new level. In the end the network library can and should consist of different packages to meet different requirements but it should be very clear to library users which requirements are met by each package.
I agree on the goals for each package. The library I am proposing is logically the same as sockets, but of course, it must be implemented somehow (sockets typically). Ideally, there would be nothing between it and the most primitive/best-performing library available. Various implementations of the interfaces would map to whatever mechanism was available (see below).
As far as I understand the acceptor-connector pattern the idea is that a connector behaves like an acceptor. This means both of them are factories producing connected streams. Most network developers (who know Berkeley sockets) will probably create an acceptor based on accept() and a connector based on connect(). The acceptor will work as a factory but the connector can probably be used only once to create one connected stream.
The reason I suppose that I preferred not-yet-connected as a state of stream (vs. a separate class) is that it is logically a 1-to-1 relationship (and even the same socket) whereas acceptor is a 1-to-N relationship (the acceptor socket never becomes something else). The other reason I went with the approach I have is that there is only one object to deal with for cancel. I don't need to worry about the transition from connecting to connected when I want to abort.
This is what I did in http://www.highscore.de/boost/net/basic.png and comes close to the C API. If we think of inexperienced developers who don't know any sockets they will probably be happy if the connector behaves like the acceptor as they wouldn't understand the difference.
I would find this more difficult. With an acceptor, I am doing one thing (accepting connections), but with _each_ stream connection, I am doing one thing. In this respect, the socket approach feels right: an acceptor is a thing, and a not-yet-connected stream is also its own thing (which may become connected eventually).
Regarding the acceptor and connector your library seems to be close to what I made up in the class hierarchy and therefore would belong to level 0.
I think that is consistent in that it is an attempt to hide sockets which is at level 0 (maybe they will move down to -1<g>).
My goal is to iron out a behavior contract that satisfies various possible and desired implementations, so that a library user can know what are safe and portable practices. One could say that the sockets API does this, but there are two central issues: one cannot create a new type of socket (to layer protocols); not all forms of I/O completion are portable.
I am surprised to read this! The socket API - the so-called level 0 - must and will be portable. It should be possible to derive classes from socket and create new types. And all four types of I/O must be supported in level 0, too. What kind of problems do you see here?
Some platforms have socket features that aren't available on other platforms. Unix has signal-driven I/O, while Windows does not. Windows has event object based I/O, completion ports, HWND messages, etc. which are unique to it. The common intersection is blocking, non-blocking and select. One can write 99% portable sockets code based on that subset.

Except that Windows has restrictions on select(). A single fd_set must be <= 64 sockets. You cannot mix sockets from different providers (which I understand to mean IPv4 vs. v6 vs. IPX vs. whatever). Also, select() is for sockets only, not files, not pipes, not fill_in_the_blank.

One cannot make a sockets layer that hides these issues without violating the spirit of this layer: thin. Hence, my proposal.

I believe the right answer is to write an ideal network interface. One that focuses on behaviors and not the API's used to implement them. On some platforms, this implementation can be fairly thin. Of course, it can also provide a robust thread-pooling implementation and probably has to on Windows (because of the 64 socket limit). On Linux, it could use epoll. All are wonderful areas to explore.

Which brings me to notification via callback. Whenever user code is called by a library, there must be clear rules and expectations. One good way to make things difficult would be to make callbacks to user code from a signal handler. Of course, threads can also present a similar problem.

It is all well and good to provide choices to the application author, but we cannot forget the vital role of protocol library authors. So, what I think is necessary is a description of rules and things one can expect when using async calls specifically. Without clear and unequivocal rules, protocol library authors would have to constantly wonder about these important details.

So, let's pick on an SSL stream implementation based on OpenSSL. IMHO, the ideal implementation of such a thing would consume an abstract stream (so it could be used over any arbitrary transport) and provide the same stream features. Since this is a user library, the OpenSSL stream is not a socket. It cannot create some entity that can be passed to the socket API. This is what I meant by "we cannot create new kinds of sockets". Further, because the SSL stream makes I/O requests, it must understand what is supported and how async callbacks work. It must present exactly those same behaviors to its user.
If the network library should support platform-specific features we should indeed add another package to http://www.highscore.de/boost/net/packages.png. This package would be optional. However it would be nice if classes from this package would collaborate somehow with classes from the other non-optional packages of the network library. The non-optional packages would provide a default implementation and library users could use optimized classes from the optional package depending on the platform they are using.
I agree (I think<g>). There will always be non-portable concerns and one can always address them by writing directly to the API for the platform. Once a library is interposed, things change. What I was expecting in this area was that the lowest-level API wrappers (mostly sockets) are entities that can be used in platform-specific ways because they provide their native descriptor to the user. But, this is a "choose your layer and portability goals" question for the app. To exercise such features, one will have to go around the interfaces I am proposing. Best, Don

Don G wrote:
The reason I suppose that I preferred not-yet-connected as a state of stream (vs. a separate class) is that it is logically a 1-to-1 relationship (and even the same socket) whereas acceptor is a 1-to-N relationship (the acceptor socket never becomes something else).
I don't see connector as a 1-to-1 concept. You can use a connector to establish several connections to the same address or endpoint if that is needed. The connector pattern can be used to model failover between different endpoints providing the same service, by having a service_connector implementation that wraps and coordinates several alternate connectors to different endpoints over potentially different transports. This also hides/abstracts the failover strategy and new endpoint selection in a nice way from the protocol handler and stream.
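A small sketch of what such a service_connector could look like. Only the name service_connector comes from the text above; the connector interface, the stream type and the fixed_connector test double are hypothetical, and a real connector would of course open a connection rather than consult a flag.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical connected-stream handle; only records which endpoint answered.
struct stream {
    std::string endpoint;
};

// Hypothetical connector interface: each call attempts one new connection.
struct connector {
    virtual ~connector() = default;
    virtual std::unique_ptr<stream> connect() = 0;  // null on failure
};

// Test double standing in for a real transport-specific connector.
class fixed_connector : public connector {
public:
    fixed_connector(std::string endpoint, bool reachable)
        : endpoint_(std::move(endpoint)), reachable_(reachable) {}
    std::unique_ptr<stream> connect() override {
        if (!reachable_) return nullptr;
        return std::unique_ptr<stream>(new stream{endpoint_});
    }
private:
    std::string endpoint_;
    bool reachable_;
};

// Tries each alternate in order; the caller never sees the failover policy.
class service_connector : public connector {
public:
    void add(std::shared_ptr<connector> alternate) {
        alternates_.push_back(std::move(alternate));
    }
    std::unique_ptr<stream> connect() override {
        for (auto& alternate : alternates_) {
            if (auto s = alternate->connect()) return s;
        }
        return nullptr;
    }
private:
    std::vector<std::shared_ptr<connector>> alternates_;
};
```

Because service_connector is itself a connector, a protocol handler written against the interface can be given either a single endpoint or a failover group without any change, and each call to connect yields a separate stream.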
The other reason I went with the approach I have is that there is only one object to deal with for cancel. I don't need to worry about the transition from connecting to connected when I want to abort.
The exact ownership of the socket/handle is always clear: the connector owns it during the connection phase, and once it's connected the ownership is transferred to the stream. So I don't see this as a potential problem.
I would find this more difficult. With an acceptor, I am doing one thing (accepting connections), but with _each_ stream connection, I am doing one thing. In this respect, the socket approach feels right: an acceptor is a thing, and a not-yet-connected stream is also its own thing (which may become connected eventually).
Separating the connection establishment would make the stream more stateless, with fewer concerns, and it separates responsibilities, keeping the interfaces simpler. It also doesn't imply inheritance, as is the case with a connectable stream. You also don't have to handle questions such as whether a stream can be connected again after it is closed and so forth (and I guess this could be different in different implementations).
Some platforms have socket features that aren't available on other platforms. Unix has signal-driven I/O, while Windows does not. Windows has event object based I/O, completion ports, HWND messages, etc. which are unique to it. The common intersection is blocking, non-blocking and select. One can write 99% portable sockets code based on that subset.
So basically layer 0 should support this portable subset.
Except that Windows has restrictions on select(). A single fd_set must be <= 64 sockets. You cannot mix sockets from different providers (which I understand to mean IPv4 vs. v6 vs. IPX vs. whatever). Also, select() is for sockets only, not files, not pipes, not fill_in_the_blank.
You can define FD_SETSIZE to some arbitrary number. On my FC1 including select.hpp FD_SETSIZE is 1024 so we will have to handle arbitrary limits in the interface. On Windows select could be implemented using WSAEventSelect and WaitForMultipleObjects, but this will have to be cascaded to other threads if the set is too big, and in that case handles can be mixed. So either we have to cater for arbitrary limits, letting the user implement on top of the limits, or remove arbitrary limits at the cost of complexity and probably increased thread switching.
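The cascading step could start from something like the sketch below, which only does the bookkeeping: it splits a set of handles into groups no larger than a given limit (64 would match the Windows limit discussed here), so that each group could be handed to its own waiting thread. The function name partition_for_wait is made up for this example, and no actual waiting or thread creation is shown.

```cpp
#include <cstddef>
#include <vector>

// Splits `handles` into consecutive groups of at most `limit` entries.
// Returns no groups when `limit` is zero or `handles` is empty.
template <typename Handle>
std::vector<std::vector<Handle>> partition_for_wait(const std::vector<Handle>& handles,
                                                    std::size_t limit) {
    std::vector<std::vector<Handle>> groups;
    if (limit == 0) return groups;
    for (std::size_t i = 0; i < handles.size(); ++i) {
        // Start a new group every `limit` handles.
        if (i % limit == 0) groups.push_back(std::vector<Handle>());
        groups.back().push_back(handles[i]);
    }
    return groups;
}
```

For 150 handles and a limit of 64 this gives three groups of 64, 64 and 22 handles, which is where the extra threads and the thread switching mentioned above come from.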
Which brings me to notification via callback. Whenever user code is called by a library, there must be clear rules and expectations. One good way to make things difficult would be to make callbacks to user code from a signal handler. Of course, threads can also present a similar problem.
Yes, the rules must be really clear, and the library should probably never hold any kind of lock when calling a user-defined callback, since that is generally very error prone and deadlock creating. This also makes it really hard to implement ;) but also interesting. /Michel

Michel André wrote:
Don G wrote:
The reason I suppose that I preferred not-yet-connected as a state of stream (vs. a separate class) is that it is logically a 1-to-1 relationship (and even the same socket) whereas acceptor is a 1-to-N relationship (the acceptor socket never becomes something else).
I don't see connector as a 1-to-1 concept. You can use a connector to establish several connections to the same address or endpoint if that is needed. The connector pattern can be used to model failover between different endpoints providing the same service, by having a service_connector implementation that wraps and coordinates several alternate connectors to different endpoints over potentially different transports. This also hides/abstracts the failover strategy and the selection of a new endpoint in a nice way from the protocol handler and stream.
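A rough sketch of such a failover connector, under the assumption that a connector can be modelled as any callable returning a stream or null on failure. Only the name service_connector comes from the message above; the rest is hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Placeholder stream type; a real one would wrap a connected transport.
struct stream {
    std::string peer;
};

// A connector is modelled as any callable that returns a stream,
// or a null pointer when the connection attempt fails.
using connector = std::function<std::unique_ptr<stream>()>;

class service_connector {
public:
    explicit service_connector(std::vector<connector> alternates)
        : alternates_(std::move(alternates)) {
        assert(!alternates_.empty());
    }

    // Tries each alternate in order and returns the first stream that
    // connects. The caller never sees which endpoint was chosen.
    std::unique_ptr<stream> connect() const {
        for (const auto& c : alternates_) {
            if (auto s = c()) {
                return s;
            }
        }
        return nullptr;
    }

private:
    std::vector<connector> alternates_;
};
```

The in-order strategy here is the simplest possible one; round-robin or latency-based selection could sit behind the same connect() call.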
Ah, good to see Michel jumping in! :) I think the whole confusion really arises because we talk about different parts of the network library without noticing it. In my opinion, in level 0 a connector has a 1:1 relationship with a connected socket, while in level 1 it makes sense for the connector to behave like an acceptor and have a 1:n relationship with connected service handlers. In level 0 experienced network developers would expect a 1:1 relationship (just as Don does) and in level 1 non-experienced network developers would expect the connector to behave like the acceptor (just as Michel does; that doesn't mean of course that I think Michel is a non-experienced network developer ;-). So both of you are right. What we have to do when we look at proposals like Don's library is to figure out what belongs to level 0, what belongs to level 1, what can be dropped and what is brand new that we haven't thought about yet. While Don's connector belongs in a standard network library in level 0, I'd say that e.g. address schemas like tcp:/www.example.com:80 definitely belong to level 1. My personal rule of thumb is: What Stevens wrote in "Unix Network Programming" is level 0, everything else is level 1. Boris
[...]

Don G wrote: Hi Don,
I hope all is going well.
Thanks, same to you! :)
[...]
My goal is to iron out a behavior contract that satisfies various possible and desired implementations, so that a library user can know what are safe and portable practices. One could say that the sockets API does this, but there are two central issues: one cannot create a new type of socket (to layer protocols); not all forms of I/O completion are portable.
I am surprised to read this! The socket API - the so-called level 0 - must and will be portable. It should be possible to derive classes from socket and create new types. And all four types of I/O must be supported in level 0, too. What kind of problems do you see here?
Some platforms have socket features that aren't available on other platforms. Unix has signal-driven I/O, while Windows does not. Windows has event object based I/O, completion ports, HWND messages, etc. which are unique to it. The common intersection is blocking, non-blocking and select. One can write 99% portable sockets code based on that subset.
Blocking, non-blocking and select are three of four I/O models. And I think we can support all four I/O models on all platforms: 1) Blocking/synchronous is what you called blocking. 2) Non-blocking/synchronous is what you called non-blocking. 3) Blocking/asynchronous is what you called select. 4) Non-blocking/asynchronous is an interface based on callbacks which will be implemented very differently on different platforms. However there seem to be APIs available on all platforms to support this I/O model (see http://thread.gmane.org/gmane.comp.lib.boost.devel/120188).
[...] Hence, my proposal. I believe the right answer is to write an ideal network interface. One that focuses on behaviors and not the API's used to implement them. On some platforms, this implementation can be fairly thin. Of course, it can also provide a robust thread-pooling implementation and probably has to on Windows (because of the 64 socket limit). On Linux, it could use epoll. All are wonderful areas to explore.
I absolutely agree, as so often. :) That's why I like to define the four I/O models with blocking/non-blocking and synchronous/asynchronous as this is for the library user the real difference between the I/O models. The fourth I/O model described as non-blocking/asynchronous is the most difficult one to implement as APIs are very different. This I/O model will be based on POSIX aio, kqueue, I/O completion ports etc.
Which brings me to notification via callback. Whenever user code is called by a library, there must be clear rules and expectations. One good way to make things difficult would be to make callbacks to user code from a signal handler. Of course, threads can also present a similar problem.
I agree again. :) That's why I added a package called "async" to http://www.highscore.de/boost/net/packages.png. This package has to define the rules for the interface which will provide non-blocking/asynchronous I/O.
[...] So, let's pick on an SSL stream implementation based on OpenSSL. IMHO, the ideal implementation of such a thing would consume an abstract stream (so it could be used over any arbitrary transport) and provide the same stream features. Since this is a user library, the OpenSSL stream is not a socket. It cannot create some entity that can be passed to the socket API. This is what I meant by "we cannot create new kinds of sockets". Further, because the SSL stream makes I/O requests, it must understand what is supported and how async callbacks work. It must present exactly those same behaviors to its user.
If you are talking about level 1 of the network library I agree with you. However it shouldn't be any problem either to create a ssl_socket derived from a socket class in level 0.
[...] I agree (I think<g>). There will always be non-portable concerns and one can always address them by writing directly to the API for the platform. Once a library is interposed, things change. What I was expecting in this area was that the lowest-level API wrappers (mostly sockets) are entities that can be used in platform-specific ways because they provide their native descriptor to the user.
I see. So your library is as far as I understand a mixture of level 0, level 1 and optional platform-specific classes. If we can sort this out that would be nice. ;) Boris

Hi Boris,
Blocking, non-blocking and select are three of four I/O models. And I think we can support all four I/O models on all platforms
I agree that we _can_. I still don't see any reason for a select-like mechanism that is user facing. This is bookkeeping that is best left to the brokers behind the scenes. Peter's suggestion of a net::poll() or my more general approach (just hinted at really) are more than adequate and require no add/remove/who_services_this_socket_anyway issues. I would like to hear some convincing reasons for exposing behaviors that feel like select (beyond a certain low level).
If you are talking about level 1 of the network library I agree with you. However it shouldn't be any problem either to create a ssl_socket derived from a socket class in level 0.
What is a "socket" then as it relates to "socket class"? If the assumption is that it has a C API socket fd, then an ssl stream will not be a substitute for a socket. I guess I don't see how it can be short of writing a man-in-the-middle SSL that creates a socket pair; one for it to read/write user data and the other so it can look and behave like a socket. If this is the approach, I don't think it is the right way to go. The abstraction I proposed makes this a "trivial" task and one that does not require any extra detours for the user's data (SSL is already slow enough<g>).
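To make the layering argument concrete, here is a minimal sketch assuming a hypothetical abstract stream interface with no file descriptor in it. The XOR filter only stands in for SSL and is not cryptography; memory_stream stands in for a socket, named pipe or serial line.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical abstract stream: no file descriptor is exposed.
class stream {
public:
    virtual ~stream() = default;
    virtual void write(const std::string& data) = 0;
    virtual std::string read() = 0;
};
using stream_ptr = std::shared_ptr<stream>;

// In-memory transport standing in for a socket, named pipe or serial line.
class memory_stream : public stream {
public:
    void write(const std::string& data) override { buffer_ += data; }
    std::string read() override {
        std::string out;
        out.swap(buffer_);  // hand over everything written so far
        return out;
    }
    const std::string& raw() const { return buffer_; }
private:
    std::string buffer_;
};

// Stand-in for an SSL stream: it consumes any stream and is itself a stream.
// The XOR transform is a placeholder, not real encryption.
class filter_stream : public stream {
public:
    explicit filter_stream(stream_ptr lower) : lower_(std::move(lower)) {
        assert(lower_);
    }
    void write(const std::string& data) override { lower_->write(transform(data)); }
    std::string read() override { return transform(lower_->read()); }
private:
    static std::string transform(std::string s) {
        for (char& c : s) c = static_cast<char>(c ^ 0x5A);
        return s;
    }
    stream_ptr lower_;
};
```

The filter never needs a descriptor from the lower stream, so the same class could wrap any transport that implements the interface, with no socket pair or extra copy of the user's data.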
I see. So your library is as far as I understand a mixture of level 0, level 1 and optional platform- specific classes. If we can sort this out that would be nice. ;)
I'm not sure about the platform-specific classes, could you clarify? I view my proposal as a complete abstraction, encapsulation, virtualization of layer 0, so in that sense it could be viewed as a level 1 library, but not in the same sense as the Wiki docs. It makes every effort to expose as much of the capabilities and behavior of the underlying network (not to be confused with API) as possible: datagram, stream, broadcast, loopback, INADDR_ANY, multi-home (local and remote), address resolution (async as well as sync). More capabilities could be added in a similar spirit, such as multicasting. Some facilities may be only available for some network types, but they should still be described in a general way. One could take this approach for platform-specific features as well, but that has to be carefully weighed because it could lead to useful libraries that have limited portability for no good reason. The library I propose does have features (ex: async callbacks) that might/probably require threads on some platforms, because that is a more "C++ way" to know what has completed within its appropriate object context. I don't view this as adding any new networking concepts any more than Winsock posting messages to an HWND for read-complete would go beyond the spirit of sockets. Suppose, just for sake of discussion, that some platform exists that does not use sockets as its network API (as was the case for Mac OS pre-X). Using the approach I propose this would have no impact to all layers above. It is not clear (to me<g>) exactly what would be required with the design on the Wiki. Best regards, Don

Don G wrote: Hi Don,
Blocking, non-blocking and select are three of four I/O models. And I think we can support all four I/O models on all platforms
I agree that we _can_. I still don't see any reason for a select-like mechanism that is user facing. This is bookkeeping that is best left to the brokers behind the scenes. Peter's suggestion of a net::poll() or my more general approach (just hinted at really) are more than adequate and require no add/remove/who_services_this_socket_anyway issues. I would like to hear some convincing reasons for exposing behaviors that feel like select (beyond a certain low level).
Again I agree with you. I wrote select() because you were talking about select() on Windows before, and I didn't write poll() or epoll() in order not to shift the focus to implementation details. :) I agree with you that we shouldn't imitate functions from the C API but improve them.
If you are talking about level 1 of the network library I agree with you. However it shouldn't be any problem either to create a ssl_socket derived from a socket class in level 0.
What is a "socket" then as it relates to "socket class"? If the assumption is that it has a C API socket fd, then an ssl stream will not be a substitute for a socket. I guess I don't see how it can be
In level 0 I assume a socket to encapsulate a socket fd and provide methods to support the four mentioned I/O models. It should then be possible to derive another class, e.g. ssl_socket, and override methods like accept() in order to verify the client. However I wonder about your ssl stream? Are you in level 0 or in level 1? ;)
short of writing a man-in-the-middle SSL that creates a socket pair; one for it to read/write user data and the other so it can look and behave like a socket. If this is the approach, I don't think it is
I see you distinguish between read/write and socket! If I understand you correctly you don't expect any socket to provide read/write methods but return some kind of stream which is then used for reading and writing?
I see. So your library is as far as I understand a mixture of level 0, level 1 and optional platform- specific classes. If we can sort this out that would be nice. ;)
I'm not sure about the platform-specific classes, could you clarify?
The non-blocking/asynchronous I/O can be implemented in different ways. If you know that your target platform supports eg. kqueue and you have reasons to use kqueue the network library could offer such a platform-specific class to provide non-blocking/asynchronous I/O based on kqueue. At least I thought this is one goal of your library. However now after your question I am not so sure any more. ;)
I view my proposal as a complete abstraction, encapsulation, virtualization of layer 0, so in that sense it could be viewed as a level 1 library, but not in the same sense as the Wiki docs. It makes every effort to expose as much of the capabilities and behavior of the underlying network (not to be confused with API) as possible: datagram, stream, broadcast, loopback, INADDR_ANY, multi-home (local and remote), address resolution (async as well as sync). More capabilities could be added in a similar spirit, such as multicasting. Some facilities may be only available for some network types, but they should still be described in a general way. One could take this approach for platform-specific features as well, but that has to be carefully weighed because it could lead to useful libraries that have limited portability for no good reason.
I see. I think I wrote in another thread that I view layer 0 as being very close to the C API. The idea of layer 0 is (at least in my mind) that experienced network developers who know Berkeley sockets should be able to switch over very easily to the C++ network library - the entry level should be as low as possible. Your idea of level 0 seems to be something else?
[...] Suppose, just for sake of discussion, that some platform exists that does not use sockets as its network API (as was the case for Mac OS pre-X). Using the approach I propose this would have no impact to all
How many platforms are there without Berkeley sockets? Do we really want *not* to provide a layer close to Berkeley sockets because of one (?) old platform, and make it more difficult for the majority of network developers, who know and understand Berkeley sockets, to switch over to the C++ network library? Boris

Hi Boris,
In level 0 I assume a socket to encapsulate a socket fd and provide methods to support the four mentioned I/O models. It should then be possible to derive another class, e.g. ssl_socket, and override methods like accept() in order to verify the client. However I wonder about your ssl stream? Are you in level 0 or in level 1? ;)
I had assumed level 0 was sockets. No abstraction really; just encapsulation. In my design I still need sockets :). I just don't want them being visible beyond the level where they must be visible. If you look at http://sourceforge.net/projects/netiphany you will see what I mean by sockets and level 0 (just my take, of course). Where we diverge perhaps is level 1. I think the first thing that should be done with sockets is to hide them behind a complete abstraction. The only reason for writing code to level 0 (outside of level 1) is for pure performance. Not that the library I am proposing is a slouch, just that you cannot add any code or mechanism and still be as optimal as direct sockets usage (which level 0 + inlining would be). So, in my view of the world<g>, I see something like my proposal as level 1. There should not need to be anything else at level 1. It is a complete abstraction (or should be made so). Back to SSL. It is an ideal example. While it uses (my definition of) level 1 facilities, it also provides an implementation of the abstraction defined by level 1. So it is both a producer and consumer of level 1. I suppose you could say it is level 2, but I think it doesn't fit there either, unless you defined level 2 to be exactly that kind of thing only. Beyond level 1 (and things like ssl_stream), protocol libraries and the like would live off the level 1 abstractions or some higher level framework. Anyway, that is my view. :)
I see you distinguish between read/write and socket! If I understand you correctly you don't expect any socket to provide read/write methods but return some kind of stream which is then used for reading and writing?
I am having a hard time with the limits of English (C++ is more expressive<g>). At level 0, sockets provide send and recv as they always have, just wrapped for safety, convenience, portable error code returns, etc.. Which is why I cannot see how ssl_stream could derive from the socket wrapper class. Even if these were virtual, if the socket base class provides a get_fd() method (which I think it must), a derived ssl_stream would violate the rule of substitution unless you play some expensive games.
I'm not sure about the platform-specific classes, could you clarify?
The non-blocking/asynchronous I/O can be implemented in different ways. If you know that your target platform supports eg. kqueue and you have reasons to use kqueue the network library could offer such a platform- specific class to provide non-blocking/ asynchronous I/O based on kqueue. At least I thought this is one goal of your library. However now after your question I am not so sure any more. ;)
I understand your point now. I was trying to satisfy the desire of some to choose an implementation for platforms that could have multiple choices (select + poll + kqueue). I don't think this is something most users would ever do, but there is room for the app developer to choose what class they instantiate for the network. I think we would need a simple name for the common/preferred option such as:

    network_ptr net = new network_ipv4; // default choice
    network_ptr net = new network_ipv4_kq; // not portable

Beyond this choice, the abstractions should be 100% the same to the user and other library code.
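A compilable sketch of that choice, assuming network_ptr is a std::shared_ptr to a hypothetical abstract network base class. The names network_ptr, network_ipv4 and network_ipv4_kq come from the message above; backend() and describe() are invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical abstract interface that all implementations share.
class network {
public:
    virtual ~network() = default;
    virtual std::string backend() const = 0;  // for illustration only
};
using network_ptr = std::shared_ptr<network>;

// Default, portable choice.
class network_ipv4 : public network {
public:
    std::string backend() const override { return "default"; }
};

// Platform-specific choice; would only be available where kqueue exists.
class network_ipv4_kq : public network {
public:
    std::string backend() const override { return "kqueue"; }
};

// Library and protocol code depends only on the abstract interface,
// so it is unaffected by which concrete class was instantiated.
std::string describe(const network_ptr& net) {
    assert(net);
    return "using " + net->backend() + " backend";
}
```

If network_ptr is a shared_ptr, the instantiation would be written as `network_ptr net = std::make_shared<network_ipv4>();`, and that single line is the only place where the non-portable class name would appear.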
I see. I think I wrote in another thread that I view layer 0 as being very close to the C API. The idea of layer 0 is (at least in my mind) that experienced network developers who know Berkeley sockets should be able to switch over very easily to the C++ network library - the entry level should be as low as possible. Your idea of level 0 seems to be something else?
Not at all. I just don't see encouraging anyone to work at level 0. All higher level thinking is above that. In particular, the abstraction I am proposing is level 1 (my definition again<g>). But, that level 1 abstraction is ideally very much like level 0 in terms of network semantics, just not so much like the sockets API.
How many platforms are there without Berkeley sockets?
Hopefully not many, but I would put Windows in that list. :) They have sockets, but you code with them in very non-portable ways if you want performance. Much like the fracture between poll, epoll and kqueue.
Do we really want *not* to provide a layer close to Berkeley sockets because of one (?) old platform, and make it more difficult for the majority of network developers, who know and understand Berkeley sockets, to switch over to the C++ network library?
I would say that level 0 is a direct, perhaps even inlined wrapper over sockets (see above). For legacy code, that would be useful. Even for experienced developers, I believe they would be much better off (give or take some radical optimization requirements) with the layer I propose. If they see it otherwise, level 0 is there and is a supported library for use by anyone. That said, I really wanted to address the design to someone without lots of experience in this area. There are some things that just cannot and should not be hidden from the user (bi-directionality, long delays, etc.), but we can make it very easy to do the simple things (should I use gethostbyname or gethostbyname_r or WSAAsyncGetHostByName?). And without an undue performance penalty that would push even modest needs down a level. Also, by using abstract interfaces, shared_ptr<> and Abstract Factory, one can do the most amazing things with layered objects (see posts with Peter). That was part of the goal of my design as well and indeed part of the problem that gave it birth. Best, Don

Don G wrote:
I would say that level 0 is a direct, perhaps even inlined wrapper over sockets (see above). For legacy code, that would be useful. Even for experienced developers, I believe they would be much better off (give or take some radical optimization requirements) with the layer I propose. If they see it otherwise, level 0 is there and is a supported library for use by anyone.
I would not recommend making the socket abstraction/layer inline. Some socket implementations have notoriously bad header files that you don't want to have included by accident in user code using level 0. Apart from the resource management perspective, this is one of the main reasons for having a facade. /Michel
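One way to keep platform headers out of user code is the pimpl idiom. The sketch below shows it in a single file for brevity, assuming a hypothetical socket_facade class; in practice the two halves would be a header and a source file.

```cpp
#include <cassert>
#include <memory>

// --- What would live in the public header (hypothetical socket_facade.hpp).
// --- No platform header is included here.
class socket_facade {
public:
    socket_facade();
    ~socket_facade();              // defined where impl is a complete type
    bool is_open() const;
    void close();
private:
    struct impl;                   // platform details are only declared here
    std::unique_ptr<impl> impl_;
};

// --- What would live in the source file, the only place that would include
// --- <sys/socket.h>, <winsock2.h> or similar platform headers.
struct socket_facade::impl {
    int handle = 0;                // placeholder for a native descriptor
    bool open = true;
};

socket_facade::socket_facade() : impl_(new impl) {}
socket_facade::~socket_facade() = default;

bool socket_facade::is_open() const { return impl_->open; }

void socket_facade::close() {
    assert(impl_);
    impl_->open = false;           // a real implementation would release the descriptor
}
```

The price is one allocation and one indirection per socket, which is the overhead an inlined wrapper would avoid.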

Don G wrote: Hi Don,
[...] Where we diverge perhaps is level 1. I think the first thing that should be done with sockets is to hide them behind a complete abstraction. The only reason for writing code to level 0 (outside of
Actually I agree again. I think from now on I will just assume that you talk about level 1 all the time. :)
[...] So, in my view of the world<g>, I see something like my proposal as level 1. There should not need to be anything else at level 1. It is a complete abstraction (or should be made so).
I agree again. In my view of the world level 0 is as low-level and level 1 as high-level as they can be. If anyone wants to introduce another level of abstraction he needs good reasons to do so and must make a sharp distinction between such a new level and the other two. Right now I think we have no reason to assume that we need a third layer.
[...] I am having a hard time with the limits of English (C++ is more expressive<g>). At level 0, sockets provide send and recv as they always have, just wrapped for safety, convenience, portable error code returns, etc.. Which is why I cannot see how ssl_stream could derive from the socket wrapper class. Even if these were virtual, if the socket base class provides a get_fd() method (which I think it must), a derived ssl_stream would violate the rule of substitution unless you play some expensive games.
Maybe I should have written ssl_socket and not ssl_stream. What I was trying to say is that level 0 should be as complete as Berkeley sockets are. If someone wants to create an ssl_socket class it should be possible to do so with level 0 classes. I understand that there is interest in improving the API and getting to a higher abstraction - I completely agree that we need a layer 1. However I am also very much interested in a complete layer 0 and think we need this layer (reasons below).
[...] I understand your point now. I was trying to satisfy the desire of some to choose an implementation for platforms that could have multiple choices (select + poll + kqueue). I don't think this is something most users would ever do, but there is room for the app developer to choose what class they instantiate for the network. I
I will update http://www.highscore.de/boost/net/packages.png and add a package "platform-specific" just for completeness.
[...] Not at all. I just don't see encouraging anyone to work at level 0. All higher level thinking is above that. In particular, the abstraction I am proposing is level 1 (my definition again<g>). But, that level 1 abstraction is ideally very much like level 0 in terms of network semantics, just not so much like the sockets API.
I agree with what you said about level 1.
[...] I would say that level 0 is a direct, perhaps even inlined wrapper over sockets (see above). For legacy code, that would be useful. Even for experienced developers, I believe they would be much better off (give or take some radical optimization requirements) with the layer I propose. If they see it otherwise, level 0 is there and is a supported library for use by anyone.
I think we need a complete level 0 which is as low as possible for one very important reason. It isn't so much about legacy code or that we couldn't implement level 1 without level 0. It is because the majority of network developers know Berkeley sockets and the C API and have a lot of experience with these concepts. If we follow the ongoing discussions here in this list it seems like everyone has created a network library himself. What are the reasons for this? Either developers had very specific requirements and were forced to create their own network libraries - however I can't believe this with so many network libraries out there. Or developers didn't want to spend time learning about the design of other network libraries and understanding their advantages and disadvantages, and decided to create their own network libraries based on Berkeley sockets. There have been many complaints recently in the thread "Boost to the rescue" that C++ is losing ground. If we want network developers to switch over to the C++ network library we should help them to reuse their knowledge and not force them to forget everything they know and get used to new concepts. I view level 0 as a very low entry barrier for today's network developers - without such a very low entry barrier I fear we will have a hard time convincing network developers to switch. E.g. when I try to understand the design of a network library I search for the socket class first. It helps me to see something familiar and I start exploring the network library from the socket class. Being confronted with unfamiliar concepts in level 1, network developers might decide to develop their own network libraries *again* instead of using the brand new Boost C++ network library. In the long term it would be nice if everyone was using level 1.
But even if we manage to create a level 1 which meets all requirements network developers ever have I don't think we can convince network developers that our network library really has such a great level 1. These strange network developers want to build something up themselves based on Berkeley sockets all the time. :) Boris
participants (5)
- Boris
- Don G
- Mats Nilsson
- Michel André
- Peter Dimov