Yet Another Network Library

Hello fellow boosters,
After months of lurking on this list and seeing how everybody seems to have their own network library, I've finally decided to put mine online. I will upload it to the vault as nanostream.tar.gz; it is currently Unix/Linux only. You may want to skip the library code and read the PDF in the doc directory, where the library is briefly presented and some examples are given.
These are the main features/peculiarities of my library:
- There is no socket concept, because I don't really think it is natural, at least not for a C++ programmer. I use the acceptor, connector and stream concepts.
- The user is not required to reference streams by pointer; streams are stack allocated or are simply members of another object. Internally they have a smart pointer to an implementation handle. Consider them stack-based proxies. The acceptor and the connector return the handle that is assigned to the stream.
- The preferred way to do input/output is to use standard-like algorithms (e.g. copy) with buffered stream adaptors and specialized input/output iterators. I believe that an efficient library can be written this way and be very C++-user-friendly. Classic read/write are still available, but their semantics might be surprising.
- All classes are concrete; no polymorphism is used (i.e. no virtuals). Polymorphic behaviour must currently be achieved by some external means (e.g. using the external polymorphism pattern; I think that the boost::IDL library would be great).
- Errors can be reported both with exceptions and with error codes. Exceptions are used by default unless error callbacks are passed. This seems to work quite well. Internally only error codes are used, and exceptions are thrown only at the outermost abstraction layer. Note that the library does not actually use error codes, but error types using a specialized variant object. This is just an experiment and I might remove it. I will probably add status bits a la iostreams.
- File streams. The library actually tries to be a generalized I/O framework, and file streams are provided for completeness.
- The library can be extended simply by creating new handles. In addition to TCP streams there are Unix streams (they come almost for free :-) and file streams. SSL/TLS was present, but it got broken some time ago and I haven't had the time to fix it.
- Input/output buffer. This is very similar to a std::deque<char>. It has segmented iterator support (the same interface presented in the Austern paper), and it is the preferred input/output buffer: it can easily be grown in both directions (useful if you need to add a header) and can be efficiently read/written with scatter-gather operations. While I have yet to implement it, I think that moving data from one buffer to another can be done extremely efficiently by splicing the internal pages. The buffer is also used to implement the buffered adaptor. boost::arrays, vectors and plain arrays also work fine, while other containers require a bounce buffer and are thus slower.
- Addresses are logically defined by a triplet <domain, host, service>, where the domain is implicit in the acceptor, address, or connector type, and host and service are two strings. This is inspired by the getnameinfo(3) interface.
- A multithreaded stream adaptor and an HTTP module are also included. These are just proofs of concept and not really part of the library (yet).
Missing (definitely not a complete list):
The library is fully synchronous for now. I'm still considering how to add async support. I think I will implement it in the buffered adaptor: I/O is done asynchronously to the internal buffers, which can grow as much as necessary. Timeouts are definitely a must-have.
Final notes:
I have seen that the current consensus is to encode the stream type in the address, so as to allow dynamic behaviour: the actual transport is selected only at runtime, based on the address string.
I think this is a bad decision (I considered doing it while implementing my library), and this is why:
- C++ is a static language; let's leave these niceties to more dynamic languages. I spent weeks hunting a persistent-connection bug in the HTTP code because I thought it would be cool to access HTTP properties through a map instead of proper accessor functions. It was a simple typo inside an HTTP property string that would have been caught immediately by the compiler if I had been using functions or constants [1].
- Keeping the transport out of the address mimics standard library usage. You cannot open the standard input stream instead of a file by using an "input:" file name. (Well, under Unix you might actually do it, but I don't think this was the intention of the standard library authors, and it is not portable anyway.)
- It is extremely insecure. In a network library security must be paramount. If the transport type were encoded in the address, it would be much harder to validate externally received addresses. A similar argument can be made for port numbers. It is better to keep these things separate. The library user can create their own indexed factory collection if they really need to.
Sorry for the long post, just tryin' to be useful :-).
-- Giovanni P. Deretta
[1] I wrote "Content-Lenght" instead of "Content-Length".
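The typo argument in the post above can be sketched in a few lines. This is only an illustration under stated assumptions: `http_headers`, `set`, `get` and `content_length` are hypothetical names invented here, not taken from nanostream.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical header container, invented for this sketch.
class http_headers {
public:
    // String-keyed access: any key compiles, including a misspelled one.
    void set(const std::string& name, const std::string& value) { fields_[name] = value; }
    std::string get(const std::string& name) const {
        std::map<std::string, std::string>::const_iterator it = fields_.find(name);
        return it == fields_.end() ? std::string() : it->second;
    }

    // Typed accessors: the key is spelled exactly once, here. A misspelled
    // accessor name at a call site is rejected by the compiler.
    void content_length(const std::string& value) { set("Content-Length", value); }
    std::string content_length() const { return get("Content-Length"); }

private:
    std::map<std::string, std::string> fields_;
};

// Shows the silent failure of the string-keyed interface.
inline void typo_demo() {
    http_headers h;
    h.content_length("42");
    assert(h.get("Content-Lenght").empty());  // typo: compiles, finds nothing
    assert(h.content_length() == "42");       // accessor: found
}
```

With the string interface the misspelled lookup simply returns an empty value at runtime; with the accessor the equivalent mistake would not compile.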

"Giovanni P. Deretta" <lordshoo@gmail.com> writes:
I will upload it under the vault as nanostream.tar.gz, it is unix/linux only currently.
You may want to skip the library code and read the pdf under the doc directory where the library is shortly presented and some examples are given.
I'm not picking on you; everyone seems to do this. I find it inconvenient to download a big archive just to browse the documentation. That tends to prevent me from looking at things I might examine otherwise -- your loss as well as mine. Can we start uploading the docs separately or making them available on other websites for direct reading, as a matter of practice? -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
I'm not picking on you; everyone seems to do this. I find it inconvenient to download a big archive just to browse the documentation. That tends to prevent me from looking at things I might examine otherwise -- your loss as well as mine. Can we start uploading the docs separately or making them available on other websites for direct reading, as a matter of practice?
Sorry about that, I realized only now that my archive is the largest in the vault. I was actually reluctant to upload it, but I thought that providing the PDF without the code would be mostly useless. There should be a vault policy document on the website (if there is one already, I couldn't find it). I did apply for a SourceForge project account, but it will not be ready until next week. I will put any future code there. -- Giovanni P. Deretta

"Giovanni P. Deretta" <lordshoo@gmail.com> writes:
David Abrahams wrote:
I'm not picking on you; everyone seems to do this. I find it inconvenient to download a big archive just to browse the documentation. That tends to prevent me from looking at things I might examine otherwise -- your loss as well as mine. Can we start uploading the docs separately or making them available on other websites for direct reading, as a matter of practice?
Sorry about that, I realized only now that my archive is the largest in the vault.
It's not the size; it's the accessibility. I'm _really_ lazy: I don't want to download-and-decompress the docs. -- Dave Abrahams Boost Consulting www.boost-consulting.com

I'm not picking on you; everyone seems to do this. I find it inconvenient to download a big archive just to browse the documentation. That tends to prevent me from looking at things I might examine otherwise -- your loss as well as mine. ...
It's not the size; it's the accessibility. I'm _really_ lazy: I don't want to download-and-decompress the docs.
I agree (not about Dave being lazy, about the importance of being able to follow a link directly to the first page of the docs). Especially for library reviews: I want to skim the intro and examples to see if I can justify trying to find time to review it. Darren

On Mon, 25 Apr 2005 08:30:34 +0900, Darren Cook wrote
I'm not picking on you; everyone seems to do this. I find it inconvenient to download a big archive just to browse the documentation. That tends to prevent me from looking at things I might examine otherwise -- your loss as well as mine. ...
It's not the size; it's the accessibility. I'm _really_ lazy: I don't want to download-and-decompress the docs.
I agree (not about Dave being lazy, about the importance of being able to follow a link directly to the first page of the docs). Especially for library reviews: I want to skim the intro and examples to see if I can justify trying to find time to review it.
I've wondered for a while whether we need the same sort of protocol for Boost itself, that is, a documentation-only download. Admittedly it's less important for releases, since you can just browse the website... Jeff

Darren Cook wrote:
I'm not picking on you; everyone seems to do this. I find it inconvenient to download a big archive just to browse the documentation. That tends to prevent me from looking at things I might examine otherwise -- your loss as well as mine. ...
It's not the size; it's the accessibility. I'm _really_ lazy: I don't want to download-and-decompress the docs.
I agree (not about Dave being lazy, about the importance of being able to follow a link directly to the first page of the docs). Especially
I agree, too. While I appreciate the many discussions about the network library, I'd appreciate it even more if we tried to streamline the discussions, find a consensus and make decisions. Boris
[...]

"Giovanni P. Deretta" <lordshoo@gmail.com> writes:
I've have seen that the current consens is to encode the the stream type in the address, so to allow a dynamic behaviour: the actual transport is selected only at runtime, based on the address string.
Yeah, a similar unfortunate line of thinking seems to prevail in the discussion of Unicode libraries.
I think this is a bad decision (i considered doing it while implementing my library) and this is why:
- C++ is a static language, let's leave these niceties to more dynamic languages. I have found myself weeks hunting a bug in the http code because i thought that it would be cool if i could access http properties using a map instead of proper accessor functions. I spent weeks hunting persistent connection bug. It was a simple typo inside a http property string that would have been caught immediately by the compiler if i were using functions or costants [1].
Right. If you need to do dynamic polymorphism, add a layer of type erasure in a separate component. I think that's what you meant by "external polymorphism," but that's not a great term because it doesn't distinguish the static from the dynamic.
- It mimics standard library usage. You cannot open the input stream instead of a file by using the "input:" file name. (well under Unix you might actually do it, but i don't think this was the intention of the standard library authors and it is not portable any way).
- It is extremely insecure. In a network library security must be paramount. If the transport type were encoded in the address, it would be much harder to validate externally received addresses. A similar argument can be made for the port numbers. It is better to keep these things separated. The library user can create its own indexed factory collection if it really needs to.
That's a very convincing argument. I don't know much about networking, but from what I've heard here I like your design instincts very much. I'll try to look at your docs. Is this in the same design space as a "sockets library?" -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
"Giovanni P. Deretta" <lordshoo@gmail.com> writes:
- It is extremely insecure. In a network library security must be paramount. If the transport type were encoded in the address, it would be much harder to validate externally received addresses. A similar argument can be made for the port numbers. It is better to keep these things separated. The library user can create its own indexed factory collection if it really needs to.
That's a very convincing argument.
No, it isn't. If you analyze the security of the two cases carefully you'll see that there isn't much of a difference, except that the "transport-encoded" type gives you one bit of extra information, the transport, which you can check against your expectations.

"Peter Dimov" <pdimov@mmltd.net> writes:
David Abrahams wrote:
"Giovanni P. Deretta" <lordshoo@gmail.com> writes:
- It is extremely insecure. In a network library security must be paramount. If the transport type were encoded in the address, it would be much harder to validate externally received addresses. A similar argument can be made for the port numbers. It is better to keep these things separated. The library user can create its own indexed factory collection if it really needs to.
That's a very convincing argument.
No, it isn't. If you analyze the security of the two cases carefully you'll see that there isn't much of a difference, except that the "transport-encoded" type gives you one bit of extra information, the transport, which you can check against your expectations.
Sorry, not enough sleep. I knew I wasn't quite saying what I meant. I meant that if there is truly a security problem that's very compelling. I don't know how to analyze whether that's the case or not. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams wrote:
- C++ is a static language, let's leave these niceties to more dynamic languages. I have found myself weeks hunting a bug in the http code because i thought that it would be cool if i could access http properties using a map instead of proper accessor functions. I spent weeks hunting persistent connection bug. It was a simple typo inside a http property string that would have been caught immediately by the compiler if i were using functions or costants [1].
Right. If you need to do dynamic polymorphism, add a layer of type erasure in a separate component. I think that's what you meant by "external polymorphism," but that's not a great term because it doesn't distinguish the static from the dynamic.
Yes, I meant type erasure. I used the term "external polymorphism pattern" because it is described with this name in the ACE patterns documentation.
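A minimal sketch of what "a layer of type erasure in a separate component" could look like. All names here (`memory_stream`, `counting_stream`, `any_ostream`, `send_greeting`) are assumptions made for illustration; none of them comes from nanostream or from ACE.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Two unrelated concrete stream types with no virtual functions.
// Both are stand-ins invented for this sketch.
struct memory_stream {
    std::string data;
    std::size_t write(const char* p, std::size_t n) { data.append(p, n); return n; }
};
struct counting_stream {
    std::size_t total;
    std::size_t write(const char*, std::size_t n) { total += n; return n; }
};

// Type-erasing wrapper: the virtual call lives here, in a separate
// component, and the concrete streams stay non-polymorphic.
class any_ostream {
    struct concept_t {
        virtual ~concept_t() {}
        virtual std::size_t write(const char* p, std::size_t n) = 0;
    };
    template <class Stream>
    struct model_t : concept_t {
        explicit model_t(Stream& s) : stream(s) {}
        std::size_t write(const char* p, std::size_t n) override { return stream.write(p, n); }
        Stream& stream;  // non-owning: the caller keeps the concrete stream alive
    };
    std::unique_ptr<concept_t> impl_;
public:
    template <class Stream>
    explicit any_ostream(Stream& s) : impl_(new model_t<Stream>(s)) {}
    std::size_t write(const char* p, std::size_t n) { return impl_->write(p, n); }
};

// Written against any_ostream, so it is compiled once for every stream type.
inline std::size_t send_greeting(any_ostream& out) { return out.write("hello", 5); }

inline void erasure_demo() {
    memory_stream m;
    counting_stream c = {0};
    any_ostream a(m), b(c);
    assert(send_greeting(a) == 5 && m.data == "hello");
    assert(send_greeting(b) == 5 && c.total == 5);
}
```

Code that knows the concrete type keeps calling `memory_stream::write` directly; only code that needs runtime selection pays for the indirection.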
That's a very convincing argument.
I don't know much about networking, but from what I've heard here I like your design instincts very much. I'll try to look at your docs. Is this in the same design space as a "sockets library?"
Yes, it is, mostly. It doesn't really have a socket concept and it also deals with disk I/O, thus it is not completely network-centric. -- Giovanni P. Deretta

Giovanni P. Deretta wrote:
[...] I've have seen that the current consens is to encode the the stream type in the address, so to allow a dynamic behaviour: the actual transport is selected only at runtime, based on the address string. I think
Michel Andre asked a similar question about template based or inheritance/interface based approach (see http://article.gmane.org/gmane.comp.lib.boost.devel/121983 and scroll down) and got only one answer (that was mine). I don't see much of a consensus but lots of ideas. There is no general consensus so far about the package structure of the whole network library. So there can't be any consensus about how different packages look inside. So let us do this step first before we do the second please. Boris
[...]

Boris wrote:
Giovanni P. Deretta wrote:
[...] I've have seen that the current consens is to encode the the stream type in the address, so to allow a dynamic behaviour: the actual transport is selected only at runtime, based on the address string. I think
Michel Andre asked a similar question about template based or inheritance/interface based approach (see http://article.gmane.org/gmane.comp.lib.boost.devel/121983 and scroll down) and got only one answer (that was mine). I don't see much of a consensus but lots of ideas. There is no general consensus so far about the package structure of the whole network library. So there can't be any consensus about how different packages look inside. So let us do this step first before we do the second please.
Even if we go for an interface/inheritance based approach, I would like to expose tcp_stream/tcp_connector/tcp_acceptor/udp_stream and the like, so the user could instantiate a concrete implementation and not go through the address/dynamic-factory thingy. There might be specifics in these classes that you care about if, e.g., you only support TCP. /Michel

Hi Giovanni, First off, thanks for the posting. I downloaded and will try to understand your library as time permits. After a brief look, my head is still spinning trying to understand the central concepts (my limitation, not your code<g>). If I understand correctly, the library is service/server side and not client-side at present, right?
These are the main features/peculiarities of my library:
- There is no socket concept because i don't really think it is natural, at least not for a C++ programmer. I use the acceptor, connector and stream concepts.
I think we have similar opinions about sockets ;) just followed different paths from there.
- User is not required to reference streams by pointer, streams are stack allocated or are simply members of another object. Internally they have a smart pointer to an implementaion handle. Consider them stack-based proxies. The acceptor and the connector return the handle that is asigned to the stream.
So is this choice just for user simplification? Internally, the user is still holding a pointer, right? What are the copy semantics of the objects held by the user? This is where things can be tricky any way you go. Either the object is copyable and confusion can come via aliasing, or they aren't which is probably better in this case, but could possibly cause some idioms to not work (like "stream s = my_clever_stream_creator()"). My preference was to use shared_ptr<> as the semantics are well understood and objects can layer easily in obvious ways, but the cost is "->" vs "." syntax.
- The preferred way to do input output is to use standard-like algorithms (i.e. copy) with buffered stream adaptors and specialized input/output iterators. I believe that an efficient library can be written this way and be very C++-user-friendly. Classic read/write are still available, but their semantics might be surprising.
I agree that this is the right approach for many users and protocols, but most network programmers (including myself<g>) need access to the primitive behaviors. They won't find them surprising unless the wrapping violates expectations coming from sockets-like programming.
- All classes are concrete, no polymorphism is used (i.e. no virtuals). Polimorphic behaviour must currently be achieved with some external mean (i.e using the external polymorphism pattern. I think that the boost::IDL library would be great).
Here is where we are at different ends of the spectrum :). I didn't see any SSL code, so I can only imagine how the http code would handle SSL vs. non-SSL stream underneath. Ideally, this should not require two template instantiations like http<stream> and http<ssl_stream>, for example. In the end, I thought templates had little to offer at this level. Parameterizing protocols by stream type seems (IMHO) to buy nothing in particular except the removal of virtual at the expense of the user having to specify <kind_of_stream> and _lots_ of extra code generation. The app should be able to layer objects as it sees fit and run-time polymorphism is (again, IMHO) the right solution to that problem.
- Errors can be reported both with exceptions and with error codes. Exceptions are used by default unless error callbacks are passed. This seems to work quite well. Internally only error codes are used and exceptions are thrown only at the most external abstracion layer.
This is a good idea, and very similar to what I have done as well. At least for async. What is the behavior of blocking read in the face of error? Is the user callback made inside read? If so, what does read() return?
I will probably add status bits a-la iostreams.
What kind of bits? I can see eof and fail and those cannot be cleared. Others?
- File streams. The library actually try to be a generalized i/o framework, and file streams are provided for completeness.
At an abstract level, they are very similar and should behave in a similar way. I haven't tried to tackle that part because it is an area where there is already something in place, albeit not async, and I didn't want to try to integrate into iostream (not my cup of tea).
- The library can be extended simply by creating new handles. In addition to TCP streams there are Unix streams (come almost for free :-) and file streams. SSL/TLS was present but did get broken some time ago and didn't have the time to fix it.
I would be most curious to know how SSL fit in your library and how other layers interact with or are shielded from it.
- Input/Output buffer. [some good stuff was here<g>]
From the little I've read through the code, it looks like this is a layer above the raw stream impl. I think that is exactly the right way to go. :)
Missing (definitelly not complete list):
The library is fully sinchronous for now. I'm still considering how to add asynch support. I think i will implement it in the buffered adaptor I/O is done asynchronously to the internal buffers that can grow as much as it is necessary. Timeouts are definitelly a must-have.
Agreed on async and timeout. Can sync calls be manually/explicitly canceled? In my experience (and opinion<g>), a reader/writer MT design needs cancel semantics. Without it, such an app cannot be responsive to outside stimuli.
Final notes:
I've have seen that the current consens is to encode the the stream type in the address, so to allow a dynamic behaviour: the actual transport is selected only at runtime, based on the address string. I think this is a bad decision (i considered doing it while implementing my library) and this is why:
I am more and more convinced that this is not the right approach for the library core, but for different reasons (see other posts). It could be offered as a stand-alone library for an app that has the need for this, but I think it is most likely a trivial map problem (plus a little text manipulation).
- C++ is a static language, let's leave these niceties to more dynamic languages.
I think C++ is quite dynamic (not in the JavaScript way<g>) and should exercise that power where appropriate :) It pains me to see Java servers everywhere. C++ can and should have all the HTTP, SSL and server stuff and make it as easy to develop servlet-like things. One does not need reflection, dynamic loading and whatnot to play well in that space. One does need standard (or at least de facto) libraries. Without them, effort is fragmented and disjoint. Which is why I joined Boost. :)
I have found myself weeks hunting a bug in the http code because i thought that it would be cool if i could access http properties using a map instead of proper accessor functions. I spent weeks hunting persistent connection bug. It was a simple typo inside a http property string that would have been caught immediately by the compiler if i were using functions or costants [1].
I agree that string literals are not the right way to interact with a library in general. Addresses seem near the border, though, because they are typically user-facing. So while I am not in favor of this approach at this layer, I can see some merit in the general idea.
- It mimics standard library usage. You cannot open the input stream instead of a file by using the "input:" file name. (well under Unix you might actually do it, but i don't think this was the intention of the standard library authors and it is not portable any way).
Devil's advocate: but in the same sense, the std lib supports mounts, chroot, NFS, Samba file shares, etc.: anything that can be mapped to an OS file entity. Things are more murky in network programming: credentials, auth, keys, security in general, as you suggest next.
- It is extremely insecure. In a network library security must be paramount. If the transport type were encoded in the address, it would be much harder to validate externally received addresses.
Good point. Validation is one thing, but meeting expectations of the software is another. In some cases, just any transport may not be appropriate and hence should be validated. This can be done from the string form, of course, but it presents a wider interface.
A similar argument can be made for the port numbers. It is better to keep these things separated. The library user can create its own indexed factory collection if it really needs to.
Not sure I see the connection to ports, but I agree with the rest here. This is something that can easily be done at a higher level. The trade-off is that libraries won't accept the same indirection in the address, which leaves all the mapping up to the app (not just the configuration of the mapping).
Sorry for the long post, just tryin' to be usefull :-).
Don't be sorry. I am sure I've written longer posts and it was helpful. Best regards, Don

Don G wrote:
- User is not required to reference streams by pointer, streams are stack allocated or are simply members of another object. Internally they have a smart pointer to an implementaion handle. Consider them stack-based proxies. The acceptor and the connector return the handle that is asigned to the stream.
So is this choice just for user simplification? Internally, the user is still holding a pointer, right? What are the copy semantics of the objects held by the user? This is where things can be tricky any way you go. Either the object is copyable and confusion can come via aliasing, or they aren't which is probably better in this case, but could possibly cause some idioms to not work (like "stream s = my_clever_stream_creator()"). My preference was to use shared_ptr<> as the semantics are well understood and objects can layer easily in obvious ways, but the cost is "->" vs "." syntax.
Yes, it holds a pointer, a shared_ptr actually. This makes it possible for some part of the library to temporarily hold a (potentially weak) reference to the handle without fear that it might be destroyed/closed. I think this will come in handy with asynchronous I/O. [1] I think that stack semantics are much more intuitive for non-polymorphic objects (e.g. iostreams versus streambuffers). Your example could be rewritten as 'my_clever_stream_creator(s)' without really losing expressivity in the non-polymorphic case. This is exactly how connectors and acceptors work in my library. Currently the wrapper is copyable, but I will probably correct this unless I find very good reasons not to (the only one I can find currently is two threads wanting to do parallel I/O on the same file: the stream classes are not thread safe, so each thread might want to have a copy).
[1] Note that the internal file descriptor is closed when and only when the owner handle is closed. There is no close() call, although shutdown() is available. Thus there is no risk that the operating system might reuse the same file descriptor number while there are stale FDs around. Useful if you need a 'FD->handle' map.
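The proxy-plus-handle arrangement described in this reply might look roughly like the sketch below. Everything in it is an assumption for illustration: `handle`, `stream`, `connector`, `assign` and `observe` are invented names, and the handle flips a flag instead of owning a real file descriptor so that its lifetime can be observed.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for the implementation handle. A real one would own a file
// descriptor and close it in its destructor; this one flips a flag.
struct handle {
    explicit handle(bool& open_flag) : open(open_flag) { open = true; }
    ~handle() { open = false; }
    handle(const handle&) = delete;
    handle& operator=(const handle&) = delete;
    bool& open;
};

// Value-like proxy: lives on the stack or as a member of another object.
class stream {
    std::shared_ptr<handle> h_;
public:
    void assign(std::shared_ptr<handle> h) { h_ = std::move(h); }
    bool connected() const { return static_cast<bool>(h_); }
    // Lets another component watch the handle without owning it.
    std::weak_ptr<handle> observe() const { return h_; }
};

// The 'my_clever_stream_creator(s)' form: the stream is filled in by reference.
struct connector {
    bool& flag;
    void connect(stream& s) const { s.assign(std::make_shared<handle>(flag)); }
};

inline void proxy_demo() {
    bool open = false;
    connector c = {open};
    std::shared_ptr<handle> pinned;
    {
        stream s;
        assert(!s.connected());
        c.connect(s);
        assert(s.connected() && open);
        pinned = s.observe().lock();  // temporarily pin the handle
    }
    assert(open);    // the stream is gone, but the pinned handle is still alive
    pinned.reset();
    assert(!open);   // last owner released: the handle is destroyed
}
```

The resource is released only when the last owner of the handle goes away, which is the property the footnote above relies on.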
- The preferred way to do input output is to use standard-like algorithms (i.e. copy) with buffered stream adaptors and specialized input/output iterators. I believe that an efficient library can be written this way and be very C++-user-friendly. Classic read/write are still available, but their semantics might be surprising.
I agree that this is the right approach for many users and protocols, but most network programmers (including myself<g>) need access to the primitive behaviors. They won't find them surprising unless the wrapping violates expectations coming from sockets-like programming.
They are available (in fact they are necessary to implement the rest of the library ;-), but they do not try to be user-friendly: they have many parameters, complex return values, and nontrivial preconditions and postconditions. For example, there is no guarantee that a write always writes the whole buffer in the absence of errors; it might do a partial write for no reason at all (obviously, minimizing the number of calls is a quality-of-implementation issue).
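Because a primitive write may stop short, callers of such an interface usually wrap it in a loop. The following is a generic sketch of that loop; `short_write_stream` and `write_all` are hypothetical names, and the stand-in stream deliberately accepts only a few bytes per call to imitate partial writes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in stream whose write() accepts at most `chunk` bytes per call.
struct short_write_stream {
    std::string sink;
    std::size_t chunk;
    std::size_t calls;
    std::size_t write(const char* p, std::size_t n) {
        ++calls;
        std::size_t k = n < chunk ? n : chunk;
        sink.append(p, k);
        return k;
    }
};

// Calls the primitive until the whole buffer is consumed. Returns the number
// of bytes written, which is less than n only if a call made no progress.
template <class Stream>
std::size_t write_all(Stream& s, const char* p, std::size_t n) {
    std::size_t done = 0;
    while (done < n) {
        std::size_t k = s.write(p + done, n - done);
        if (k == 0) break;  // no progress: stop rather than spin
        done += k;
    }
    return done;
}

inline void write_all_demo() {
    short_write_stream s = {std::string(), 3, 0};
    assert(write_all(s, "abcdefgh", 8) == 8);
    assert(s.sink == "abcdefgh");
    assert(s.calls == 3);  // 3 + 3 + 2 bytes
}
```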
- All classes are concrete, no polymorphism is used (i.e. no virtuals). Polimorphic behaviour must currently be achieved with some external mean (i.e using the external polymorphism pattern. I think that the boost::IDL library would be great).
Here is where we are at different ends of the spectrum :). I didn't see any SSL code, so I can only imagine how the http code would handle SSL vs. non-SSL stream underneath. Ideally, this should not require two template instantiations like http<stream> and http<ssl_stream>, for example.
In the end, I thought templates had little to offer at this level. Parameterizing protocols by stream type seems (IMHO) to buy nothing in particular except the removal of virtual at the expense of the user having to specify <kind_of_stream> and _lots_ of extra code generation. The app should be able to layer objects as it sees fit and run-time polymorphism is (again, IMHO) the right solution to that problem.
I did start with a design based on virtual interfaces (part of it is still visible: for example, the address object is far too complex for my current needs, and the domain object is a relic of a factory-based structure). It took me a long time to settle on a general stream interface that was at least partially satisfying. When I started implementing the concrete objects I found that many methods looked almost the same, so I refactored the code and put the common code in an implementation class. Then I thought that the library user might find it useful to deal with the actual stream type, and promoted the implementation class to a public object, with a virtual interface adapter optionally applicable (i.e. the external polymorphism pattern, or type erasure). Even the virtual adapter could be generated with templates. I was happy with this design until I realized that I was just duplicating what could be done better with a dynamic_any or with boost::IDL, and I scrapped it. Only the concrete objects were left, and I have yet to find the need to put the interfaces back.
Dynamic polymorphism can certainly increase flexibility without template bloat, but would there really be that much code generated? An http<tcp_stream> and an http<ssl_stream> can certainly share 99% of the code; what you really need is a parametrized function that fetches the data from the stream and puts it in a buffer. Keep most of the code in a base class or, better, put the parametrized function (or functor) in a boost::function and store it in the non-templated http protocol object. Easy. I didn't remove virtuals just for the sake of it; it is just an accident of design. I might consider putting them back in the internal handle, at least to give reads/writes polymorphic behaviour (this would mimic the iostream and streambuffer pair). BTW, I do not understand exactly what you mean by 'being free to layer objects'.
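The suggestion of storing a fetch function in a non-templated protocol object can be sketched as follows. This uses std::function in place of boost::function, and every name (`plain_stub`, `upper_stub`, `http_reader`, `first_line`) is hypothetical; the second stub upper-cases its data merely to stand in for a stream that transforms bytes, as an SSL layer would.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Stand-ins for two concrete stream types; only read() matters here.
struct plain_stub {
    std::string pending;
    std::size_t read(char* buf, std::size_t n) {
        std::size_t k = pending.copy(buf, n);
        pending.erase(0, k);
        return k;
    }
};
struct upper_stub {
    std::string pending;
    std::size_t read(char* buf, std::size_t n) {
        std::size_t k = pending.copy(buf, n);
        pending.erase(0, k);
        for (std::size_t i = 0; i < k; ++i)
            buf[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(buf[i])));
        return k;
    }
};

// Non-template protocol object: its member functions are compiled once.
// Only the small constructor is instantiated per stream type.
class http_reader {
    std::function<std::size_t(char*, std::size_t)> fetch_;
public:
    template <class Stream>
    explicit http_reader(Stream& s)
        : fetch_([&s](char* b, std::size_t n) { return s.read(b, n); }) {}

    // Drains the source and returns everything before the first CRLF.
    std::string first_line() {
        std::string text;
        char buf[8];
        for (std::size_t k; (k = fetch_(buf, sizeof buf)) != 0; ) text.append(buf, k);
        return text.substr(0, text.find("\r\n"));
    }
};

inline void fetch_demo() {
    plain_stub p = {"GET / HTTP/1.1\r\nHost: x\r\n"};
    upper_stub u = {"get / http/1.1\r\n"};
    http_reader a(p), b(u);
    assert(a.first_line() == "GET / HTTP/1.1");
    assert(b.first_line() == "GET / HTTP/1.1");
}
```

The trade-off is one indirect call per fetch, which is small next to the cost of the underlying I/O.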
- Errors can be reported both with exceptions and with error codes. Exceptions are used by default unless error callbacks are passed. This seems to work quite well. Internally only error codes are used and exceptions are thrown only at the most external abstracion layer.
This is a good idea, and very similar to what I have done as well. At least for async. What is the behavior of blocking read in the face of error? Is the user callback made inside read? If so, what does read() return?
An error is thrown, unless a callback is provided. If so, the error code is passed to the callback. A throwing read returns the amount of data read; a 'callback augmented' read returns the callback itself (callbacks are passed by value, as with standard algorithms). The amount of data read is passed to the callback along with the error code.
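The two read flavours described above could look roughly like this. It is only a sketch under stated assumptions: `fake_stream`, `io_error`, `stream_error` and `recorder` are invented for illustration and are not nanostream's types, and the callback is assumed to be invoked on every read (with `io_error::none` on success), which the description above does not spell out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical error code and exception type.
enum class io_error { none, connection_reset };

struct stream_error : std::runtime_error {
    explicit stream_error(io_error e)
        : std::runtime_error("stream error"), code(e) {}
    io_error code;
};

// Hypothetical in-memory stream used only to exercise both overloads.
struct fake_stream {
    std::string data;     // bytes still available
    bool broken = false;  // if true, a short read reports an error

    // Internal layer: error codes only, never throws.
    std::size_t read_impl(char* out, std::size_t n, io_error& ec) {
        std::size_t k = data.copy(out, n);
        data.erase(0, k);
        ec = (k < n && broken) ? io_error::connection_reset : io_error::none;
        return k;
    }

    // Throwing form: returns the number of bytes read.
    std::size_t read(char* out, std::size_t n) {
        io_error ec;
        std::size_t k = read_impl(out, n, ec);
        if (ec != io_error::none) throw stream_error(ec);
        return k;
    }

    // Callback form: never throws. The byte count and the error code go
    // to the callback, which is taken and returned by value, the way
    // std::for_each returns its functor.
    template <class Callback>
    Callback read(char* out, std::size_t n, Callback cb) {
        io_error ec;
        std::size_t k = read_impl(out, n, ec);
        cb(k, ec);
        return cb;
    }
};

// Example callback that accumulates what it was told.
struct recorder {
    std::size_t bytes = 0;
    io_error last = io_error::none;
    void operator()(std::size_t n, io_error e) { bytes += n; last = e; }
};
```

Because the callback travels by value, any state it collects is visible to the caller only through the returned copy, e.g. `recorder r = s.read(buf, sizeof buf, recorder{});`.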
I will probably add status bits a-la iostreams.
What kind of bits? I can see eof and fail and those cannot be cleared. Others?
Currently the only bits that I plan to have are 'input buffer grown' and 'output buffer flushed', useful for asynchronous I/O and buffered streams. I do not have eof and fail (yet) because initially the stream was supposed to be thread safe, so it had to be stateless. I will certainly add state to keep track of closed connections, and obviously it will only be reset if the internal handle is reinitialized.
- File streams. The library actually tries to be a generalized I/O framework, and file streams are provided for completeness.
At an abstract level, they are very similar and should behave in a similar way. I haven't tried to tackle that part because it is an area where there is already something in place, albeit not async, and I didn't want to try to integrate into iostream (not my cup of tea).
I think that file I/O is as important in network programming as network I/O itself, so it is useful to have a unified framework. BTW, polling for I/O readiness (i.e. the select model) does not make sense with files; I believe that the asynchronous I/O model is the only non-blocking I/O model that fits all stream types.
- The library can be extended simply by creating new handles. In addition to TCP streams there are Unix streams (they come almost for free :-) and file streams. SSL/TLS was present but got broken some time ago and I didn't have the time to fix it.
I would be most curious to know how SSL fit in your library and how other layers interact with or are shielded from it.
Nothing very complex, really. I wrote a thin wrapper over OpenSSL. I only took advantage of the ability to initialize a context with an already connected file descriptor, then wrapped the context in a handle. The read/write methods simply forwarded the call to SSL_read/SSL_write. I didn't really take advantage of the BIO infrastructure, which will probably be necessary to make an ssl_stream an adapter over any kind of stream.
- Input/Output buffer.
[some good stuff was here<g>]
From the little I've read through the code, it looks like this is a layer above the raw stream impl. I think that is exactly the right way to go. :)
Yes, the buffered stream is just a layer above the standard stream. It should also be very easy to implement a streambuffer on top of the buffered stream adaptor. I think that the buffered adaptor will greatly simplify asynchronous buffer management: asynchronous reads put data in the internal input buffer, which can be grown efficiently as much as needed (it is a deque); user code copies data from this buffer to its own buffer, or takes ownership of it. Asynchronous writes take data from the output buffer; user code copies data from its internal buffers to this buffer, or relinquishes ownership of its buffer or, if it wants to keep ownership of the buffer and still avoid the extra copy, it must use a special buffer that is guaranteed to be immutable (i.e. once created it can never be changed, and copies share the internal data using shared_ptrs. I do not have it yet; I will add it when I attack the asynchronous I/O problem). If the user does not want automatic buffer management, he can still use the unbuffered functions, but then it is his job to guarantee that buffers stay valid and unchanged until the operation is completed (i.e. a mess!!).
Missing (definitely not a complete list):
The library is fully synchronous for now. I'm still considering how to add async support. I think I will implement it in the buffered adaptor: I/O is done asynchronously to the internal buffers, which can grow as much as necessary. Timeouts are definitely a must-have.
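The immutable buffer mentioned above does not exist in the library yet, so the following is only a guess at its shape. `immutable_buffer` and `pending_write` are hypothetical names, and `std::shared_ptr` stands in for the Boost smart pointer.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical immutable buffer: the bytes are fixed at construction and
// copies share them through a shared_ptr instead of duplicating them.
class immutable_buffer {
public:
    explicit immutable_buffer(std::string bytes)
        : data_(std::make_shared<const std::string>(std::move(bytes))) {}

    const char* data() const { return data_->data(); }
    std::size_t size() const { return data_->size(); }

    // Number of handles currently sharing the bytes (for illustration only).
    long owners() const { return data_.use_count(); }

private:
    std::shared_ptr<const std::string> data_;  // no mutating access exposed
};

// Stand-in for an in-flight asynchronous write. It keeps its own handle,
// so the bytes stay alive and unchanged even if the caller drops its copy.
struct pending_write {
    explicit pending_write(immutable_buffer b) : buf(std::move(b)) {}
    immutable_buffer buf;
    std::size_t sent = 0;  // bytes already handed to the OS
};
```

Copying the handle into the pending operation costs one reference-count increment rather than a copy of the payload, and since no interface exists to modify the bytes, the "valid and unchanged until completion" requirement holds by construction.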
Agreed on async and timeout. Can sync calls be manually/explicitly canceled? In my experience (and opinion<g>), a reader/writer MT design needs cancel semantics. Without it, such an app cannot be responsive to outside stimuli.
No, not yet, still on my todo list. Well, you can obviously cancel a pending operation by shutting down the stream, but a much more gentle solution is needed :-).
Final notes:
I have seen that the current consensus is to encode the stream type in the address, to allow dynamic behaviour: the actual transport is selected only at runtime, based on the address string. I think this is a bad decision (I considered doing it while implementing my library) and this is why:
I am more and more convinced that this is not the right approach for the library core, but for different reasons (see other posts). It could be offered as a stand-alone library for an app that has the need for this, but I think it is most likely a trivial map problem (plus a little text manipulation).
Yes, as an add-on would be fine, but the transport-encoded address should not be a central concept.
- C++ is a static language, let's leave these niceties to more dynamic languages.
I think C++ is quite dynamic (not in the JavaScript way<g>) and should exercise that power where appropriate :) It pains me to see Java servers everywhere. C++ can and should have all the HTTP, SSL and server stuff and make it just as easy to develop servlet-like things. One does not need reflection, dynamic loading and whatnot to play well in that space. One does need standard (or at least de facto) libraries. Without them, effort is fragmented and disjoint.
Which is why I joined boost. :)
Well, by static I meant "do as much work at compile time as possible", which translates to "catch as many errors as early as possible" ;-). C++ is certainly dynamic, but I kind of like the way everything is NOT always an object. [Really off-topic] BTW, I would *love* complete, standard, compile-time reflection facilities.
[...]
- It is extremely insecure. In a network library security must be paramount. If the transport type were encoded in the address, it would be much harder to validate externally received addresses.
Good point. Validation is one thing, but meeting expectations of the software is another. In some cases, just any transport may not be appropriate and hence should be validated. This can be done from the string form, of course, but it presents a wider interface.
Just to give an example: Stevens, in Unix Network Programming Vol. 1, shows an example of getaddrinfo that, as an extension, could return unix domain sockets in addition to ipv4 and ipv6 sockets. Glibc actually implemented the extension. It was removed later because of security concerns; see this post for details: http://sources.redhat.com/ml/libc-hacker/2001-05/msg00044.html You might want to treat streams polymorphically once created, but at creation the type should be statically known by the user code, because it needs to be aware that not all streams have the same semantics. You might say that not all streams are 'created' equal :-).
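One way to read "statically known at creation, polymorphic afterwards" is a type-erasing wrapper in the external polymorphism style mentioned earlier in this post. The sketch below is illustrative only: `tcp_stream`, `unix_stream`, `any_stream` and `send_greeting` are hypothetical, and the streams do no real I/O.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical concrete streams with no common base class.
struct tcp_stream {
    std::size_t write(const char*, std::size_t n) { return n; }
    const char* transport() const { return "tcp"; }
};

struct unix_stream {
    std::size_t write(const char*, std::size_t n) { return n; }
    const char* transport() const { return "unix"; }
};

// Type-erasing wrapper. It can only be built from a concrete stream
// object, so the transport is chosen in user code at compile time and
// never derived from an address string.
class any_stream {
    struct concept_t {
        virtual ~concept_t() = default;
        virtual std::size_t write(const char* p, std::size_t n) = 0;
        virtual std::string transport() const = 0;
    };

    template <class Stream>
    struct model final : concept_t {
        explicit model(Stream s) : stream(std::move(s)) {}
        std::size_t write(const char* p, std::size_t n) override {
            return stream.write(p, n);
        }
        std::string transport() const override { return stream.transport(); }
        Stream stream;
    };

    std::unique_ptr<concept_t> self_;

public:
    template <class Stream>
    explicit any_stream(Stream s)
        : self_(std::make_unique<model<Stream>>(std::move(s))) {}

    std::size_t write(const char* p, std::size_t n) { return self_->write(p, n); }
    std::string transport() const { return self_->transport(); }
};

// Code past the creation point works with any transport.
inline std::size_t send_greeting(any_stream& s) {
    const std::string msg = "hello";
    return s.write(msg.data(), msg.size());
}
```

The caller writes `any_stream a{tcp_stream{}};` and so always knows which semantics it asked for, while everything downstream sees a single type.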
[...]
Sorry for the long post, just tryin' to be useful :-).
Don't be sorry. I am sure I've written longer posts and it was helpful.
Well, this *certainly* was a long post. I hope I've cleared up some details of my library. Now, let's get back to code. -- Giovanni P. Deretta

On 4/25/05, Giovanni P. Deretta <lordshoo@gmail.com> wrote:
BTW, polling for I/O readiness (i.e. the select model) does not make sense with files, i believe that the asynchronous I/O model is the only non blocking io model that fits all stream types.
What if that file is a socket or pipe?
[Really off-topic] BTW, I would *love* complete, standard, compile-time reflection facilities.
Who wouldn't? I think this is pretty much covered by the Boost.Langbinding project (of which I believe there is only a design and precursor implementation in Boost.Python) -- Caleb Epstein caleb dot epstein at gmail dot com