asio networking proposal 0.3.4

Hello all,
I have just released asio 0.3.4. You can download it, or view the documentation online, at http://asio.sourceforge.net.
The major changes in this version include support for scatter-gather operations, an initial SSL implementation, and documentation improvements including an HTTP 1.0 server example. Code that uses asio will need to be updated.
It is my feeling that there are no more major interface changes to be made, and my next goal is to convert asio into "Boost format" in preparation for a submission. Feedback is appreciated.
In detail, the changes since asio 0.3.3 include:
- Added support for scatter-gather on all read and write operations. These operations now take a list of one or more "buffers" for the operation. These buffers may be created to represent raw memory, POD arrays, boost::array or std::vector. This change breaks existing code, which needs to be fixed as follows. Where one used to write:
  sock.read(data, length);
  one now needs to write:
  sock.read(asio::buffers(data, length));
  Similar changes are needed on all read, write, send and receive operations, both synchronous and asynchronous.
- Added initial SSL support using OpenSSL. This takes the form of a template, called asio::ssl::stream, which can be used with any class which supports the Sync_*_Stream and Async_*_Stream concepts, e.g.:
  typedef asio::ssl::stream<asio::stream_socket> my_ssl_socket;
  Please note that for now this only works when used with single-threaded demuxers. A big thanks to Indrek Juhani of Voipster for developing the core implementation, and to Dirk Griffioen for organising this contribution.
- Exceptions in handlers are no longer suppressed using catch (...). Instead they are now allowed to propagate out of the basic_demuxer::run function so they may be caught by application code. After an exception is caught, basic_demuxer::run may be called again immediately to continue processing.
- Replaced the basic_demuxer::work_started/work_finished classes with an RAII class called basic_demuxer::work.
- Added iostream output operators for ipv4::address, ipv4::tcp::endpoint and ipv4::udp::endpoint.
- Added an HTTP 1.0 server example.
- Added a kqueue reactor implementation for Mac OS X. Thanks to Stefan Arentz for providing the initial implementation.
- Improved error handling in the epoll_reactor.
- TSS slot usage has been minimised to use one slot for all demuxers of a given type, rather than one slot per demuxer object.
- The calls to WSAStartup and WSACleanup have been made safer with respect to global demuxer objects.
- Automatically link in ws2_32.lib when using asio on Win32 with Visual C++ or Borland C++.
- Added back missing implementation of get_remote_endpoint.
- Visual C++ 6 and gcc 2.95.x are not supported in this version, due to lack of testing resources.
- Many documentation improvements.
Note that I chose not to rename the send/receive family of functions to write and read, as had been discussed. After further thought I came to the conclusion that functions should only have the same name if they provide the same semantics. I think it's fair to say that read and write are the ideal names for the stream concepts' operations. However, stream_socket's send and receive functions take additional parameters (flags) that can significantly alter the semantics (e.g. peek). A similar thing applies to not renaming datagram_socket's send/send_to/receive/receive_from functions, which are datagram-oriented, as opposed to the stream concepts' read and write which are (obviously) stream-oriented.
Cheers,
Chris
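The exception change in the announcement (handlers may throw out of basic_demuxer::run, and run may then be called again) can be pictured with a small sketch. This is not asio code: toy_demuxer, post and run_with_restart are hypothetical names invented for illustration, and the only assumption carried over from the announcement is that run() does not swallow exceptions and stays usable afterwards.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy stand-in for basic_demuxer; NOT asio's implementation.
class toy_demuxer {
public:
  void post(std::function<void()> handler) {
    queue_.push_back(std::move(handler));
  }

  // Runs queued handlers until the queue is empty. Each handler is removed
  // from the queue before it is invoked, so if it throws, the queue is still
  // consistent and run() can simply be called again.
  void run() {
    while (!queue_.empty()) {
      std::function<void()> handler = std::move(queue_.front());
      queue_.pop_front();
      handler();  // no catch (...) here: exceptions reach the caller
    }
  }

private:
  std::deque<std::function<void()>> queue_;
};

// Drives the toy demuxer, restarting run() after each exception.
// Returns a log of what happened, in order.
std::vector<std::string> run_with_restart() {
  std::vector<std::string> log;
  toy_demuxer d;
  d.post([&log] { log.push_back("first"); });
  d.post([] { throw std::runtime_error("handler failed"); });
  d.post([&log] { log.push_back("third"); });

  for (;;) {
    try {
      d.run();
      break;  // run() returned normally: no work left
    } catch (const std::exception& e) {
      log.push_back(std::string("caught: ") + e.what());
    }
  }
  return log;
}
```

The application-level loop around run() is where a program would decide whether to log, retry or shut down; the sketch just records the exception and carries on.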

I love asio, a great network library! Yet I still have a little complaint: being "header files only" makes compilation slower and slower as asio's functionality grows. Furthermore, I guess that it uses static class member initialization for the Win32 socket WSAStartup call, which seems to be a potential source of bugs for users (just a guess, not sure) when more than one C++ source file includes the asio headers. Would you please consider a header+source version with clean and simple header files? Looking forward to "Boost format" :) Thank you again for the amazing library.
"Christopher Kohlhoff" <chris@kohlhoff.com>
It is my feeling that there are no more major interface changes to be made, and my next goal is to convert asio into "Boost format" in preparation for a submission. Feedback is appreciated.

Hi,
--- RocWood <rocwood@21cn.com> wrote:
Yet I still have a little complaint: being "header files only" makes compilation slower and slower as asio's functionality grows.
In the longer term I plan to support a separate library (possibly an optional one) that keeps system headers away from application code (i.e. to prevent the namespace pollution that system headers sometimes introduce). However this is not currently a high priority for me as it doesn't affect asio's API. In terms of compile times, my testing indicates that, on Windows with MSVC, windows.h and friends are responsible for at most 30% of the compile time. The rest comes from asio + standard library headers, and the fact that it uses templates extensively. In 0.3.4 I do now test that every public asio header file can compile on its own, so you can probably gain some improvement in compile time by only including the headers that you need, rather than the catch-all asio.hpp.
Furthermore, I guess that it uses static class member initialization for the Win32 socket WSAStartup call, which seems to be a potential source of bugs for users (just a guess, not sure) when more than one C++ source file includes the asio headers.
This should be ok, as it's a static member of a class *template*. In cases where multiple definitions cannot be avoided (e.g. multiple DLLs using asio) the multiple initialisations of winsock should still work just fine (although there may be a memory leak per definition).
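A minimal sketch of why a static member of a class *template* is safe to define in a header: the definition may appear in every translation unit that includes it, and the program still ends up with a single object. The names winsock_init_sketch, do_init, touch and startup_calls are hypothetical and are not asio's actual internals; the counter stands in for the real WSAStartup call.

```cpp
#include <cassert>

// Hypothetical counter standing in for calls to WSAStartup.
inline int& startup_calls() {
  static int n = 0;
  return n;
}

template <int Dummy = 0>
struct winsock_init_sketch {
  struct do_init {
    do_init() { ++startup_calls(); }  // real code would call WSAStartup here
    ~do_init() {}                     // real code would call WSACleanup here
  };

  static do_init instance_;

  // Odr-uses instance_ so that its definition below gets instantiated.
  static void touch() { (void)&instance_; }
};

// Definition of the static member. Because it belongs to a class template,
// it can live in a header included by many source files; within one program
// image the definitions are merged into a single object.
template <int Dummy>
typename winsock_init_sketch<Dummy>::do_init
    winsock_init_sketch<Dummy>::instance_;
```

As the reply notes, separate DLLs each get their own copy of such an object, so the merging only applies within one executable or library image.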
Would you please consider a header+source version with clean and simple header files?
I do understand the desire for short, clean header files :) However the main problem is that to do that I would have to abandon any hope of supporting older compilers like MSVC6. At the moment I believe that a large and still useful subset of asio will work for MSVC6, but MSVC6 does require that you put member template function definitions inside the class definition. I'm not ready to give up on it completely... yet ;)
Cheers,
Chris

"Christopher Kohlhoff" <chris@kohlhoff.com> wrote in message news:20051015131343.27549.qmail@web32609.mail.mud.yahoo.com...
Hello all,
I have just released asio 0.3.4. You can download it, or view the documentation online, at http://asio.sourceforge.net.
The major changes in this version include support for scatter-gather operations, an initial SSL implementation, and documentation improvements including an HTTP 1.0 server example. Code that uses asio will need to be updated.
It is my feeling that there are no more major interface changes to be made, and my next goal is to convert asio into "Boost format" in preparation for a submission. Feedback is appreciated. ...
Looks like you have made a lot of forward progress!
There are still some void *'s in the public interface. The io_control helpers have data() members returning void *'s, for example.
Any chance of wringing out the non-memory management void *'s before a submission? I know it sounds picky, but a lot of C++ programmers object strongly to void * in public interfaces for anything except the rawest of raw memory management.
I also wonder if the interface could be thinned without reducing functionality. For example, could buffer() and buffers() be folded into one function, or at least overloads with the same name?
Likewise, could the 12 free read/write functions be reduced to 2 names (presumably read/write) or 4 names (presumably read, async_read, write, async_write) via folding and/or overloading? Am I the only one who feels the large number of names makes the interface appear more complex than it really is?
Keep up the good work,
--Beman

Hi Beman,
--- Beman Dawes <bdawes@acm.org> wrote:
There are still some void *'s in the public interface. The io_control helpers have data() members returning void *'s, for example.
Any chance of wringing out the non-memory management void *'s before a submission? I know it sounds picky, but a lot of C++ programmers object strongly to void * in public interfaces for anything except the rawest of raw memory management.
Hmm, I'm not sure. Here are the current uses of void* and their rationales:
* buffer() and buffers() - so that arbitrary application data structures can be sent without an additional buffer copy.
* const_buffer::data() and mutable_data::data() - to get a pointer to pass to OS functions like send and recv.
* IO_Control_Command::data() - to get a pointer to pass to OS functions like ioctl or WSAIoctl.
* Socket_Option::data() - to get a pointer to be passed to OS functions like setsockopt and getsockopt.
The last 3 are similar, in that they involve getting a pointer to be passed to a low-level OS function. (Note that the Endpoint concept has a similar thing, except that it returns an implementation-defined type rather than void*.)
I don't want to preclude user-defined IO_Control_Command or Socket_Option types, so there has to be *some* way of getting a pointer to the data in the public interface. However I'd be very happy to change it if there's a better way of accomplishing the same thing.
I also wonder if the interface could be thinned without reducing functionality. For example, could buffer() and buffers() be folded into one function, or at least overloads with the same name?
I had considered making the mutable_buffer/const_buffer classes also implement the Mutable_Buffers/Const_Buffers concepts, but I found it was confusing as to whether the begin()/end() functions applied to the underlying memory.
What about this for an idea:
- Remove the buffer() functions.
- Rename buffers() to buffer().
- Have a specialisation for const_buffer<1> that supports conversion to const_buffer.
- Have a specialisation for mutable_buffer<1> that supports conversion to mutable_buffer and const_buffer.
Then to send a single buffer you could write:
sock.write(buffer(data, size));
and still use the chaining to send multiple buffers:
sock.write(buffer(data1, size1)(data2, size2));
The conversion to the individual buffer classes should still let you write:
std::vector<const_buffer> bufs;
bufs.push_back(buffer(data1, size1));
bufs.push_back(buffer(data2, size2));
sock.write(bufs);
or:
const_buffers<2> bufs = { buffer(data1, size1), buffer(data2, size2) };
sock.write(bufs);
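To make the chaining idea concrete, here is a rough sketch of how buffer(a, n)(b, m) could be made to work. It is only an illustration under stated assumptions: the types live in a made-up namespace sketch, total_size is a hypothetical stand-in for sock.write, and instead of the <1> specialisation proposed above it uses a conversion operator guarded by static_assert, which is a simplification of this sketch rather than anything asio does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// A non-owning (pointer, size) view of read-only memory.
struct const_buffer {
  const void* data;
  std::size_t size;
};

// A fixed-size list of N buffers. operator() appends one more buffer and
// returns a list of N + 1 entries, which is what enables chaining.
template <std::size_t N>
struct const_buffers {
  std::array<const_buffer, N> items;

  const_buffers<N + 1> operator()(const void* data, std::size_t size) const {
    const_buffers<N + 1> next;
    for (std::size_t i = 0; i < N; ++i) next.items[i] = items[i];
    next.items[N] = const_buffer{data, size};
    return next;
  }

  // Only a one-element list may decay to a single buffer.
  operator const_buffer() const {
    static_assert(N == 1, "only a single-buffer list converts to const_buffer");
    return items[0];
  }

  const const_buffer* begin() const { return items.data(); }
  const const_buffer* end() const { return items.data() + N; }
};

inline const_buffers<1> buffer(const void* data, std::size_t size) {
  const_buffers<1> one;
  one.items[0] = const_buffer{data, size};
  return one;
}

// Hypothetical stand-in for sock.write(): accepts anything iterable as a
// sequence of const_buffer and reports how many bytes it describes.
template <typename Buffers>
std::size_t total_size(const Buffers& bufs) {
  std::size_t n = 0;
  for (const const_buffer& b : bufs) n += b.size;
  return n;
}

// Exercises both the chained form and the std::vector form.
inline std::size_t demo() {
  static const char a[] = "hello";
  static const char b[] = "world!";
  std::vector<const_buffer> bufs;
  bufs.push_back(buffer(a, 5));  // uses the conversion to const_buffer
  bufs.push_back(buffer(b, 6));
  return total_size(buffer(a, 5)(b, 6)) + total_size(bufs);
}

}  // namespace sketch
```

Each call to operator() copies the existing entries into a larger list, which is cheap for the handful of buffers a typical scatter-gather call uses.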
Likewise, could the 12 free read/write functions be reduced to 2 names (presumably read/write) or 4 names (presumably read, async_read, write, async_write) via folding and/or overloading? Am I the only one who feels the large number of names makes the interface appear more complex than it really is?
No you're not, especially after I've had to type async_read_at_least_n many times ;)
Reducing to 4 names might be feasible. One thing I realised recently is that, in terms of behaviour, read() and read_n() are just special cases of read_at_least_n(). The same obviously applies to write*() and the async equivalents.
Could something be done by extending the Mutable_Buffers/Const_Buffers concepts to have a desired_minimum_transfer() member function? Bad name, I know, but it would return the minimum number of bytes to read or write from the list of buffers.
It would have to be optional so that containers like std::vector can still meet the concept requirements. It would probably have a default of 0, i.e. same semantics as current asio::read() or asio::write().
You could then write:
read(sock, bufs); // Same as existing read()
read(sock, all_of(bufs)); // Same as read_n()
read(sock, at_least(bufs, 42)); // Same as read_at_least_n().
I'm ready to take guidance here :)
Cheers,
Chris
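The observation that read() and read_n() are special cases of read_at_least_n() can be sketched with a single loop driven by a minimum-transfer value. Everything below is an assumption made for illustration: fake_stream imitates a socket that returns short reads, read_request carries the "desired minimum transfer", and all_of/at_least are written as free functions over a single buffer rather than over the Mutable_Buffers concept.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stream that hands out at most `chunk` bytes per call,
// imitating the short reads a real socket produces.
class fake_stream {
public:
  fake_stream(std::string data, std::size_t chunk)
      : data_(std::move(data)), chunk_(chunk), pos_(0) {}

  std::size_t read_some(char* out, std::size_t size) {
    std::size_t n = std::min(std::min(size, chunk_), data_.size() - pos_);
    if (n > 0) std::memcpy(out, data_.data() + pos_, n);
    pos_ += n;
    return n;  // 0 means end of stream (or a zero-sized buffer)
  }

private:
  std::string data_;
  std::size_t chunk_;
  std::size_t pos_;
};

struct mutable_buffer {
  char* data;
  std::size_t size;
};

// A buffer plus the minimum number of bytes the caller wants transferred.
struct read_request {
  mutable_buffer buf;
  std::size_t minimum;
};

inline read_request all_of(mutable_buffer b) {
  return read_request{b, b.size};
}

inline read_request at_least(mutable_buffer b, std::size_t n) {
  return read_request{b, std::min(n, b.size)};
}

// The one general operation: always performs one read, then keeps reading
// until the requested minimum has arrived or the stream ends.
inline std::size_t read(fake_stream& s, read_request r) {
  std::size_t total = 0;
  do {
    std::size_t n = s.read_some(r.buf.data + total, r.buf.size - total);
    if (n == 0) break;
    total += n;
  } while (total < r.minimum);
  return total;
}

// Plain buffer: minimum of 0, i.e. a single read.
inline std::size_t read(fake_stream& s, mutable_buffer b) {
  return read(s, read_request{b, 0});
}

}  // namespace sketch
```

With this shape the three named functions collapse into one name, and the wrapper chosen at the call site states the caller's intent.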

"Christopher Kohlhoff" <chris@kohlhoff.com> wrote in message news:20051017052938.33920.qmail@web32604.mail.mud.yahoo.com...
Hi Beman,
--- Beman Dawes <bdawes@acm.org> wrote:
There are still some void *'s in the public interface. The io_control helpers have data() members returning void *'s, for example.
Any chance of wringing out the non-memory management void *'s before a submission? I know it sounds picky, but a lot of C++ programmers object strongly to void * in public interfaces for anything except the rawest of raw memory management.
Hmm, I'm not sure. Here are the current uses of void* and their rationales:
* buffer() and buffers() - so that arbitrary application data structures can be sent without an additional buffer copy.
Where did this idea come from that the only way to avoid an additional copy is to use a void *?
Think about the iterator interfaces to Standard Library algorithms. They traffic nicely in pointers, yet not a buffer copy or void * to be seen.
So instead of:
... foo( void * data, size_t size);
something like:
template <class RandomAccessIterator> ... foo( RandomAccessIterator first, RandomAccessIterator last );
* const_buffer::data() and mutable_data::data() - to get a pointer to pass to OS functions like send and recv.
Return a pointer to the actual type. If the user ends up casting that to void* to use it with an old C interface, that's the user's choice. Or use cast-style syntax:
template <class T> T data_cast(...);
* IO_Control_Command::data() - to get a pointer to pass to OS functions like ioctl or WSAIoctl.
* Socket_Option::data() - to get a pointer to be passed to OS functions like setsockopt and getsockopt.
Same comment as above.
The last 3 are similar, in that they involve getting a pointer to be passed to a low-level OS function. (Note that the Endpoint concept has a similar thing, except that it returns an implementation-defined type rather than void*.)
Have you considered defining one or more concepts for the low-level OS functions, and including a trivial wrapper to supply a model of the concept for the most common implementation of the OS functions?
I don't want to preclude user-defined IO_Control_Command or Socket_Option types, so there has to be *some* way of getting a pointer to the data in the public interface. However I'd be very happy to change it if there's a better way of accomplishing the same thing.
There isn't anything wrong with functions that return pointers (or more generally, iterators). The objection is to void * pointers breaking type safety. If the actual type is known (or even implementation-defined) then use it. If the actual type can't be known ahead of usage, supply the type as a template parameter. The above is very mom-and-apple-pie, so I expect you have already gone over it in your mind. And there are better interface designers than me reading this list, so others may have better ideas. But there is certainly some way to get rid of the void *'s, and the interface will be better for doing so. All IMO, of course.
I also wonder if the interface could be thinned without reducing functionality. For example, could buffer() and buffers() be folded into one function, or at least overloads with the same name?
I had considered making the mutable_buffer/const_buffer classes also implement the Mutable_Buffers/Const_Buffers concepts, but I found it was confusing as to whether the begin()/end() functions applied to the underlying memory.
What about this for an idea:
- Remove the buffer() functions.
- Rename buffers() to buffer().
- Have a specialisation for const_buffer<1> that supports conversion to const_buffer.
- Have a specialisation for mutable_buffer<1> that supports conversion to mutable_buffer and const_buffer.
Then to send a single buffer you could write:
sock.write(buffer(data, size));
and still use the chaining to send multiple buffers:
sock.write(buffer(data1, size1)(data2, size2));
The conversion to the individual buffer classes should still let you write:
std::vector<const_buffer> bufs;
bufs.push_back(buffer(data1, size1));
bufs.push_back(buffer(data2, size2));
sock.write(bufs);
or:
const_buffers<2> bufs = { buffer(data1, size1), buffer(data2, size2) };
sock.write(bufs);
Likewise, could the 12 free read/write functions be reduced to 2 names (presumably read/write) or 4 names (presumably read, async_read, write, async_write) via folding and/or overloading? Am I the only one who feels the large number of names makes the interface appear more complex than it really is?
No you're not, especially after I've had to type async_read_at_least_n many times ;)
Reducing to 4 names might be feasible. One thing I realised recently is that, in terms of behaviour, read() and read_n() are just special cases of read_at_least_n(). The same obviously applies to write*() and the async equivalents.
That's the kind of thought process I'm suggesting you apply. Since you know the use cases far better than I do, you should be the one to make the decision as to how to fold these special cases into one more general case. All I can do is encourage you to make the effort.
Could something be done by extending the Mutable_Buffers/Const_Buffers concepts to have a desired_minimum_transfer() member function? Bad name, I know, but it would return the minimum number of bytes to read or write from the list of buffers.
It would have to be optional so that containers like std::vector can still meet the concept requirements. It would probably have a default of 0, i.e. same semantics as current asio::read() or asio::write().
You could then write:
read(sock, bufs); // Same as existing read()
read(sock, all_of(bufs)); // Same as read_n()
read(sock, at_least(bufs, 42)); // Same as read_at_least_n().
I'm ready to take guidance here :)
I've become convinced that the STL half-open range is the best way to represent a sequence, rather than the address of first element and a count. Thus your three forms might look like:
read( sock, first ); // read one only
read( sock, first, last ); // read n == last-first
read( sock, first, min, end ); // read at least min-first, at most end-first
I would probably eliminate the first overload unless the case of wanting to read only one is very, very common.
Take the above with a grain of salt since I'm not familiar with the problem domain and only commented in the first place because void *'s are such a red flag, and because there seemed to be a lot of names for very similar functionality.
--Beman

Hi Beman,
Just addressing the void* buffers issue for now...
--- Beman Dawes <bdawes@acm.org> wrote:
Where did this idea come from that the only way to avoid an additional copy is to use a void *?
It's not just avoiding a copy, it's avoiding a copy *and* allowing arbitrary data structures to be sent. IMHO an important use case for C++ and networking is being able to do things like:
struct message
{
  int a;
  double b;
  int c_size;
  char c[0];
};
message* m = ...;
sock.write(buffers(m, sizeof(message) + m->c_size));
Additionally, we must consider the fact that socket operations are byte-oriented and that they can, and usually do, result in fewer bytes being transferred than was requested. If we do not allow an interface that can address arbitrary memory, what happens in a situation like this:
double values[100];
size_t bytes_read = read(sock, buffers(values));
// bytes_read == 17, what now?
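The "bytes_read == 17, what now?" situation can be sketched as follows. A read that stops 17 bytes in has filled two doubles plus one byte of a third, so the only way to resume is to address the array as bytes. The fake_stream and read_exactly names are hypothetical, chosen for this illustration, and the 17-byte chunk size is simply borrowed from the example above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

namespace sketch {

// Hypothetical byte source that delivers at most `chunk` bytes per call.
class fake_stream {
public:
  fake_stream(const void* src, std::size_t size, std::size_t chunk)
      : data_(static_cast<const unsigned char*>(src),
              static_cast<const unsigned char*>(src) + size),
        chunk_(chunk),
        pos_(0) {}

  std::size_t read_some(void* out, std::size_t size) {
    std::size_t n = std::min(std::min(size, chunk_), data_.size() - pos_);
    if (n > 0) std::memcpy(out, data_.data() + pos_, n);
    pos_ += n;
    return n;  // 0 means end of stream
  }

private:
  std::vector<unsigned char> data_;
  std::size_t chunk_;
  std::size_t pos_;
};

// Keeps reading until `size` bytes have arrived or the stream ends.
// The destination is treated as raw bytes so that a read can resume at any
// byte offset, including the middle of an element.
inline std::size_t read_exactly(fake_stream& s, void* data, std::size_t size) {
  unsigned char* bytes = static_cast<unsigned char*>(data);
  std::size_t total = 0;
  while (total < size) {
    std::size_t n = s.read_some(bytes + total, size - total);
    if (n == 0) break;
    total += n;
  }
  return total;
}

}  // namespace sketch
```

An interface expressed only in terms of iterators over double would have no way to name the position "17 bytes in", which is the point being made.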
Think about the iterator interfaces to Standard Library algorithms. They traffic nicely in pointers, yet not a buffer copy or void * to be seen.
So instead of:
... foo( void * data, size_t size);
something like:
template <class RandomAccessIterator> ... foo( RandomAccessIterator first, RandomAccessIterator last );
The RandomAccessIterator concept, as an example, is not strict enough for a networking use case. Unlike STL algorithms, we must consider that the underlying OS primitives deal in contiguous memory with a granularity of bytes.
So, I firmly believe that a representation of raw memory is essential. The mutable_buffer and const_buffer classes are trying to provide a "safe" representation of this concept. They already address the issue of buffer overrun protection. And I do feel that the buffers classes (i.e. address & size) represent the restricted concept of a contiguous memory range better than iterators.
If the issue here is preventing violations of type safety then how about I make conversion into a mutable_buffer/const_buffer a (for the most part) one-way operation. Or put another way, the buffer is in a sense typeless. That is, it is relatively easy to create a buffer object that represents some memory, but harder to go back the other way.
In practice this means removing the void* data() member function from the buffer classes, and replacing it with a template <class T> T data_cast(...) as you suggest (although I might call it buffer_cast). The asio internals will use buffer_cast<void*>(buf) to obtain the pointer to be passed to send() or recv().
The buffer(void*, size_t) overload would be retained however, but since this is a conversion *to* a "typeless" buffer, I don't see this as a type safety violation in itself. You would have to use buffer_cast to violate type safety now.
Cheers,
Chris
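A minimal sketch of the "one-way" buffer described above, assuming a read-only buffer class with no data() member and a buffer_cast function template as the only way back to a typed pointer. The class layout and the array overload of buffer() are assumptions made for this illustration; only the buffer_cast name and the buffer(void*, size_t) shape come from the message.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

class const_buffer {
public:
  const_buffer(const void* data, std::size_t size)
      : data_(data), size_(size) {}

  std::size_t size() const { return size_; }

private:
  const void* data_;  // deliberately no public data() accessor
  std::size_t size_;

  template <typename PointerToConst>
  friend PointerToConst buffer_cast(const const_buffer& b);
};

// The explicit, greppable way back from a typeless buffer to a pointer.
// Any loss of type safety now happens at a visible cast in the caller.
template <typename PointerToConst>
PointerToConst buffer_cast(const const_buffer& b) {
  return static_cast<PointerToConst>(b.data_);
}

// Easy direction: memory -> typeless buffer.
inline const_buffer buffer(const void* data, std::size_t size) {
  return const_buffer(data, size);
}

// Hypothetical convenience overload: the array size is deduced, so the
// caller cannot pass a byte count that disagrees with the array.
template <typename T, std::size_t N>
const_buffer buffer(const T (&array)[N]) {
  return const_buffer(array, N * sizeof(T));
}

}  // namespace sketch
```

Library internals would call sketch::buffer_cast<const void*>(buf) before handing the pointer to the OS, while application code that wants a typed pointer has to spell out the target type itself.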

Beman Dawes wrote:
"Christopher Kohlhoff" <chris@kohlhoff.com> wrote in message news:20051017052938.33920.qmail@web32604.mail.mud.yahoo.com...
Hmm, I'm not sure. Here are the current uses of void* and their rationales:
* buffer() and buffers() - so that arbitrary application data structures can be sent without an additional buffer copy.
Where did this idea come from that the only way to avoid an additional copy is to use a void *?
Think about the iterator interfaces to Standard Library algorithms. They traffic nicely in pointers, yet not a buffer copy or void * to be seen.
But the algorithms operate on typed data, and asio/sockets operate on untyped data. They gain nothing from an iterator interface, because they simply can't provide type safety. When you send a type T over the wire, what comes out at the other end is an untyped sequence of bytes that may have been a T on the source machine, but is not necessarily a T on the target; so even if the target could somehow deduce the type - and it cannot - it wouldn't be very useful to cast the byte sequence to an invalid T. It's better to not provide an illusion of type safety when there is none; the buffer is a raw sequence of bytes and (void*, size_t), (unsigned char*, size_t) and (unsigned char*, unsigned char*) are its natural representations. It really is of type "raw sequence of bytes", and using another type to describe it decreases type safety rather than increasing it.
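One way to read the point above is that whatever crosses the wire is defined by its byte layout, not by the C++ type on either side. The sketch below makes that layout explicit for a 32-bit integer. The big-endian order and the encode_u32/decode_u32 names are assumptions chosen for the example and are not part of the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Writes v into out[0..3], most significant byte first, regardless of the
// byte order of the machine running this code.
inline void encode_u32(std::uint32_t v, unsigned char* out) {
  out[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
  out[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
  out[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
  out[3] = static_cast<unsigned char>(v & 0xFF);
}

// Rebuilds the value from four bytes in the same agreed order.
inline std::uint32_t decode_u32(const unsigned char* in) {
  return (static_cast<std::uint32_t>(in[0]) << 24) |
         (static_cast<std::uint32_t>(in[1]) << 16) |
         (static_cast<std::uint32_t>(in[2]) << 8) |
         static_cast<std::uint32_t>(in[3]);
}

}  // namespace sketch
```

The networking layer only ever sees the (unsigned char*, size_t) pair; giving those bytes a type again is the job of code like decode_u32 on the receiving side.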
participants (4): Beman Dawes, Christopher Kohlhoff, Peter Dimov, RocWood