
Hi,
I'd like to offer the following comments on asio based on 15+ years of network programming in various languages and various situations, but as somebody who is pretty much a boost novice. I wish I could become a boost expert instantly; my commentary comes from many previous experiences (some successful, and some not) trying to wrap the network programming monster up with higher-level abstractions.
There are definitely areas that felt "weird" to me. Unfortunately, I don't know if that is because I'm not in tune with "the boost way", or if it's something in either the design or the implementation. In general, I tried to ignore those sorts of things. Regardless, these are the things that really stick out to me:
1. too abstract in places
Specifically, I'm talking about the whole buffer thing and the hiding of the demuxer's implementation. I'll address each of these separately:
(a) buffers: from a certain perspective, this sounds nice, but it seems that there are already so many abstractions over a (void* + length)--do we REALLY need another? Could this be simplified to something like std::pair<void*, std::size_t> ?
(b) hiding of demux impl: first, I don't like the fact that it chooses for me at compile time. Abstraction is nice, but sometimes I, as the programmer, really DO know better or have good reason to choose one thing over another. Anytime libraries completely force my hand, I rebel.
In general, I feel that this library throws in enough abstraction to be in the way, but not enough to completely abstract away the features that it's really using. I don't see how one can get hold of the raw socket descriptor to twiddle options that the library hasn't abstracted.
I also know that after 15+ years of network programming, KNOWING the details all the way down to the wire is important for how one writes code. Purists may say that one should never make such assumptions, but the reality is that when your "wire" is gigabit ethernet you get to write code with a certain set of assumptions that is completely different from the assumptions in place when the "wire" is high-bandwidth/high-latency satellites in geosynch orbit. The abstractions that are in place in this library seem to insulate me too much. I would prefer to see a "you don't have to know if you don't want to" philosophy rather than a "you don't need to know because I don't think you need to know" philosophy.
2. unnecessary features
Is hostname resolution really so much of an overhead that we need asynch hostname resolution? If the feature were basically "free", I would say okay, but the fact that it requires firing up an extra thread behind the scenes (another one of my pet peeves) makes it seem like a wholly unnecessary feature.
3. header-only implementation
Besides the fact that this style is "weird" to me (that's a personal thing), this implementation choice forces users' hands down the line. First, it forces all the code in the "library" to be replicated in every executable. This is not efficient and increases footprint.
4. How would I implement a timed write with the current async API?
By "timed write", I mean that I want to write some hunk of data, but I need to know if the write hasn't completed within a certain window of time. This is a common need in performance-sensitive systems, and writing it synchronously can be done, but how do I write it asynch?
5. TSS and threads
Shouldn't these rely on boost.thread? I'm especially thinking of TSS in this case, because in ACE/TAO we've found that TSS is, at the very least, inconsistent across platforms. See changelog entry in ACE_wrappers/Changelogs/ChangeLog-05a for one of the more egregious examples.
Wed Feb 16 16:18:45 2005 Dale Wilson <wilson_d@ociweb.com>
I look forward to seeing Jeff's consolidation of all the review material and what Chris comes up with afterwards!
-- Chris Cleeland, cleeland_c @ ociweb.com, http://www.milodesigns.com/~chris Principal Software Engineer, Object Computing, Inc., +1 314 579 0066 Support Me Supporting Cancer Survivors in Ride for the Roses 2005 >>>>>>>>> Donate at http://www.milodesigns.com/donate <<<<<<<<<

Hi Chris, --- Chris Cleeland <cleeland@ociweb.com> wrote: <snip>
(a) buffers: from a certain perspective, this sounds nice, but it seems that there are already so many abstractions over a (void* + length)
There are? :)
--do we REALLY need another? Could this be simplified to something like std::pair<void*, std::size_t> ?
Conceptually, mutable_buffer does behave like std::pair<void*,size_t>, and const_buffer is like std::pair<const void*,size_t>. However, unlike just using std::pair, the buffer classes also:
- Make type-safety violations explicit (although not impossible) by requiring you to use buffer_cast.
- Provide some protection against buffer overruns by only allowing smaller buffers to be created from larger ones.
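To make the two properties listed above concrete, here is a minimal sketch. It is not asio's mutable_buffer: the names simple_buffer, cast_to and shrink are hypothetical and exist only for illustration. It shows a (void*, length) pair where type-unsafe access has to be spelled out, and where a derived buffer can never be larger than the one it came from.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for a (void*, length) buffer; NOT asio's mutable_buffer.
class simple_buffer {
public:
    simple_buffer(void* data, std::size_t size) : data_(data), size_(size) {}

    std::size_t size() const { return size_; }

    // Type-unsafe access must be requested explicitly at the call site,
    // e.g. buf.cast_to<char*>(). T is expected to be a pointer type.
    template <typename T>
    T cast_to() const { return static_cast<T>(data_); }

    // A derived buffer is the same size or smaller, never larger, so it
    // cannot describe memory beyond the original range.
    simple_buffer shrink(std::size_t max_size) const {
        return simple_buffer(data_, std::min(size_, max_size));
    }

private:
    void* data_;
    std::size_t size_;
};
```

With a plain std::pair<void*, std::size_t> nothing stops a caller from writing a larger length into the pair; here the only way to get a new buffer from an old one clamps the size.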
I don't see how one can get hold of the raw socket descriptor to twiddle options that the library hasn't abstracted.
Two methods:
- Use the impl() member function to get access to the underlying platform-specific implementation. E.g.:
  ::setsockopt(sock.impl(), ...);
- Implement the Socket_Option concept for the option.
<snip>
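As an illustration of the first method, the following sketch sets and reads back SO_KEEPALIVE using the POSIX setsockopt/getsockopt calls on a raw descriptor. The assumption is that on a POSIX platform the value returned by impl() is such a descriptor; here it is passed in as a plain int so the sketch stands alone, and the helper names enable_keepalive and keepalive_enabled are hypothetical.

```cpp
#include <sys/socket.h>
#include <unistd.h>

// Enable SO_KEEPALIVE on a native POSIX socket descriptor.
// Returns true on success. In the reviewed asio the descriptor would be
// obtained from the socket object (assumption: via sock.impl()).
bool enable_keepalive(int native_socket) {
    int on = 1;
    return ::setsockopt(native_socket, SOL_SOCKET, SO_KEEPALIVE,
                        &on, sizeof(on)) == 0;
}

// Read the option back; returns true if keep-alive is currently enabled.
bool keepalive_enabled(int native_socket) {
    int value = 0;
    socklen_t len = sizeof(value);
    if (::getsockopt(native_socket, SOL_SOCKET, SO_KEEPALIVE,
                     &value, &len) != 0)
        return false;
    return value != 0;
}
```

The same pattern applies to any option the library has not wrapped, at the cost of writing platform-specific code at that one call site.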
I also know that after 15+ years of network programming, KNOWING the details all the way down to the wire is important for how one writes code.
If an application requires that level of control, then asio is probably not the appropriate tool.
Purists may say that one should never make such assumptions, but the reality is that when your "wire" is gigabit ethernet you get to write code with a certain set of assumptions that is completely different from the assumptions in place when the "wire" is high-bandwidth/high-latency satellites in geosynch orbit.
I'm genuinely curious to know what assumptions are different between those two specific scenarios, with respect to demultiplexing. <snip>
Is hostname resolution really so much of an overhead that we need asynch hostname resolution? If the feature were basically "free", I would say okay, but the fact that it requires firing up an extra thread behind the scenes (another one of my pet peeves) makes it seem like a wholly unnecessary feature.
The issue here is that hostname resolution is a potentially lengthy operation. If resolution is only performed at program startup, then that might be ok. However, if a server needs to resolve hostnames on a regular basis, you would not want to block the flow of control and delay other clients. If the synchronous operations were all that was on offer, the only course open to the application developer would be to use threads. The asio philosophy is to offer application developers support for concurrency without the need to use threads directly. As a bonus, some platforms do provide asynchronous host resolution APIs, and I hope to support these over time. This would be transparent to users of the asynchronous host resolution.
Besides the fact that this style is "weird" to me (that's a personal thing), this implementation choice forces users' hands down the line. First, it forces all the code in the "library" to be replicated in every executable. This is not efficient and increases footprint.
It is on my to-do list to add support for a library implementation. However the goal of this is to prevent system headers from polluting the application's namespace. Since most of the library is template-based, I don't expect it to make much difference to the footprint (although you never know).
4. How would I implement a timed write with the current async API? By "timed write", I mean that I want to write some hunk of data, but I need to know if the write hasn't completed within a certain window of time. This is a common need in performance-sensitive systems, and writing it synchronously can be done, but how do I write it asynch?
The timing bit is easy: just set the expiry on a deadline_timer and perform an asynchronous wait. However the real question is what to do once the timer fires :) Can you describe the wider use case in more detail? In particular, what you intend to do with the socket if the "timed write" times out. At the moment the way to cancel operations in asio is to close the socket. All pending asynchronous operations complete with the operation_aborted error.
5. TSS and threads Shouldn't these rely on boost.thread? I'm especially thinking of TSS in this case, because in ACE/TAO we've found that TSS is, at the very least, inconsistent across platforms. See changelog entry in ACE_wrappers/Changelogs/ChangeLog-05a for one of the more egregious examples.
Wed Feb 16 16:18:45 2005 Dale Wilson <wilson_d@ociweb.com>
Unlike both Boost.Thread's and ACE's TSS implementation, the TSS wrapper in asio provides absolutely no ownership or cleanup semantics, thus avoiding painful issues like the ones mentioned in that ChangeLog entry. It's used only to store a pointer to a variable on the stack. Cheers, Chris

Hi, Chris, Thanks for the reply. On Fri, 13 Jan 2006, Christopher Kohlhoff wrote:
--- Chris Cleeland <cleeland@ociweb.com> wrote: <snip>
(a) buffers: from a certain perspective, this sounds nice, but it seems that there are already so many abstractions over a (void* + length)
There are? :)
So it seems. I know I've written a bunch, all with a slightly different twist. Maybe it's an opportunity for another boost feature? :-)
I don't see how one can get hold of the raw socket descriptor to twiddle options that the library hasn't abstracted.
Two methods:
- Use the impl() member function to get access to the underlying platform-specific implementation. E.g.:
::setsockopt(sock.impl(), ...);
Okay.
- Implement the Socket_Option concept for the option.
Maybe you could show an example in the docs for this?
I also know that after 15+ years of network programming, KNOWING the details all the way down to the wire is important for how one writes code.
If an application requires that level of control, then asio is probably not the appropriate tool.
I disagree. First, I stated that KNOWING the details is important, not that one needs to manipulate the details all the way down the line. But even if I do need to manipulate details at different layers (all the way down to the wire), that need doesn't necessarily diminish the utility of a feature like asio. Why should I be prevented from using an otherwise-useful feature simply because I need to tweak things at a lower level?
Purists may say that one should never make such assumptions, but the reality is that when your "wire" is gigabit ethernet you get to write code with a certain set of assumptions that is completely different from the assumptions in place when the "wire" is high-bandwidth/high-latency satellites in geosynch orbit.
I'm genuinely curious to know what assumptions are different between those two specific scenarios, with respect to demultiplexing.
Is hostname resolution really so much of an overhead that we need asynch hostname resolution? If the feature were basically "free", I would say okay, but the fact that it requires firing up an extra thread behind the scenes (another one of my pet peeves) makes it seem like a wholly unnecessary feature.
The issue here is that hostname resolution is a potentially lengthy operation. If resolution is only performed at program startup, then that might be ok. However, if a server needs to resolve hostnames on a regular basis, you would not want to block the flow of control and delay other clients.
Okay, I can understand the motivation, but are there not other ways to implement besides firing off a thread? I admit that I have a very strong negative bias against libraries that fire off threads behind my back...
If the synchronous operations were all that was on offer, the only course open to the application developer would be to use threads.
Via boost.thread?
The asio philosophy is to offer application developers support for concurrency without the need to use threads directly.
I guess this might be a philosophical difference--I simply do not like libraries that fire off threads behind my back or use threads that are out of my control, including system libs. Older versions of Solaris used this same trick to implement asynch IO, and performance was absolutely horrendous. However, there was no way you would know this going in; you had to figure it out for yourself.
Don't get me wrong--I have nothing against threads per se, and I realize that libraries sometimes must use threads. But, as a library writer, I also realize that developers that use my lib get a "budget", if you will, of resources that can be used, and must use them judiciously. If my library is going to use threads I must (a) communicate that clearly to the library's users and (b) give the user opportunities to shape how the library uses threads. If asio is going to use threads in its implementation it must be up-front about it and give users of asio the opportunity to tune and control those threads, and (ideally) provide ways to substitute a non-thread-based strategy for those who might want it.
Besides the fact that this style is "weird" to me (that's a personal thing), this implementation choice forces users' hands down the line. First, it forces all the code in the "library" to be replicated in every executable. This is not efficient and increases footprint.
It is on my to-do list to add support for a library implementation. However the goal of this is to prevent system headers from polluting the application's namespace.
Since most of the library is template-based, I don't expect it to make much difference to the footprint (although you never know).
I think it might make a difference on embedded systems where code gets shared.
4. How would I implement a timed write with the current async API? By "timed write", I mean that I want to write some hunk of data, but I need to know if the write hasn't completed within a certain window of time. This is a common need in performance-sensitive systems, and writing it synchronously can be done, but how do I write it asynch?
The timing bit is easy: just set the expiry on a deadline_timer and perform an asynchronous wait. However the real question is what to do once the timer fires :)
What to do would be application-specific, but an example might be to send a notice on an out-of-band channel, or to raise an exception, or simply to tweak a flag.
Can you describe the wider use case in more detail? In particular, what you intend to do with the socket if the "timed write" times out. At the moment the way to cancel operations in asio is to close the socket. All pending asynchronous operations complete with the operation_aborted error.
I wasn't necessarily thinking that cancellation would be the common case. After all, once data gets copied into the system buffers, you can't really grab it back. You can close the socket, but even that may not completely prevent transmission of data if, in the case of TCP, you have the SO_LINGER set accordingly. My thinking was more like that above: the write hasn't completed within a specified window of time and the application needs to know that.
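The notify-without-cancel behaviour described above can be sketched without any asio types. The following is a toy simulation only: toy_scheduler is a hypothetical single-threaded scheduler with a virtual clock standing in for the demuxer, and timed_write is a hypothetical helper. It shows the handler logic, where the timer handler checks a flag set by the write's completion handler and, if the write is still pending, only informs the application.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy single-threaded scheduler with a virtual clock (hypothetical; it only
// stands in for a demuxer so that the handler ordering can be shown).
class toy_scheduler {
public:
    // Schedule handler to run at virtual time time_ms.
    void at(int time_ms, std::function<void()> handler) {
        pending_.emplace(time_ms, std::move(handler));
    }

    // Run handlers in time order until none remain.
    void run() {
        while (!pending_.empty()) {
            auto first = pending_.begin();
            std::function<void()> handler = std::move(first->second);
            pending_.erase(first);
            handler();
        }
    }

private:
    std::multimap<int, std::function<void()>> pending_;
};

// Simulate a write that completes after write_ms, watched by a timer that
// expires after deadline_ms. Returns the events the application observed.
std::vector<std::string> timed_write(int write_ms, int deadline_ms) {
    std::vector<std::string> events;
    bool write_done = false;
    toy_scheduler scheduler;

    // Completion handler of the (simulated) asynchronous write.
    scheduler.at(write_ms, [&] {
        write_done = true;
        events.push_back("write completed");
    });

    // Timer handler: the write is NOT cancelled; the application is only
    // told that the deadline passed while the write was still pending.
    scheduler.at(deadline_ms, [&] {
        if (!write_done)
            events.push_back("deadline passed, write still pending");
    });

    scheduler.run();
    return events;
}
```

In a real program the two handlers would be attached to an asynchronous write and to a deadline_timer wait; what the "deadline passed" branch does (out-of-band notice, flag, and so on) stays application-specific.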
The TSS wrapper in asio provides absolutely no ownership or cleanup semantics [...] It's used only to store a pointer to a variable on the stack.
Can you expand on this a little more? I'm having a hard time understanding what it means to have tss hold a pointer to something on the stack. I'll try to check out the code itself, too.
-- Chris Cleeland, cleeland_c @ ociweb.com, http://www.milodesigns.com/~chris Principal Software Engineer, Object Computing, Inc., +1 314 579 0066

Hi Chris, --- Chris Cleeland <cleeland@ociweb.com> wrote: <snip>
- Implement the Socket_Option concept for the option.
Maybe you could show an example in the docs for this?
Yep, no problem. <snip>
I disagree. First, I stated that KNOWING the details is important, not that one needs to manipulate the details all the way down the line. But even if I do need to manipulate details at different layers (all the way down to the wire), that need doesn't necessarily diminish the utility of a feature like asio. Why should I be prevented from using an otherwise-useful feature simply because I need to tweak things at a lower level?
If it's just knowing the details and having the ability to tweak the implementation, then what I'm planning is:
- To document the implementation strategies for each supported platform. There's a little bit in the design docs already.
- To provide compile-time (i.e. #define), and in the long term possibly runtime, ways to choose the implementation strategy. However these mechanisms will be outside the library's public interface, since they're tied to the implementation, and that can change between releases.
<snip>
The issue here is that hostname resolution is a potentially lengthy operation. If resolution is only performed at program startup, then that might be ok. However, if a server needs to resolve hostnames on a regular basis, you would not want to block the flow of control and delay other clients.
Okay, I can understand the motivation, but are there not other ways to implement besides firing off a thread?
The alternatives that I'm aware of are:
- To use a platform's native asynchronous host resolution functions, if available.
- To implement DNS lookup using sockets. This might be a really good solution for some platforms, but unfortunately not for all. Using gethostbyname on Windows, for example, also resolves NetBIOS names, and asio should provide identical behaviour to the platform's own host resolution functions.
<snip>
Can you expand on this a little more? I'm having a hard time understanding what it means to have tss hold a pointer to something on the stack. I'll try to check out the code itself, too.
The only place TSS is used in asio is to determine whether or not the current thread is inside a call to demuxer::run() for a specific demuxer object. It works as follows:
- The TSS value holds a pointer to the top of a stack implemented as a linked list.
- When a demuxer::run() call starts it creates a linked list node on the stack, and pushes it on to the top of the stack.
- When a demuxer::run() call exits it pops the node from the stack.
- To determine whether the current thread is inside demuxer::run(), the TSS pointer is accessed and the stack traversed to see if the demuxer object is present.
See asio/detail/demuxer_run_call_stack.hpp for the implementation of this.
Cheers, Chris
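The scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in demuxer_run_call_stack.hpp: it uses C++11 thread_local in place of a platform TSS wrapper, and the names demuxer_like, run_call_stack, context and run_and_check are hypothetical.

```cpp
struct demuxer_like {};  // hypothetical stand-in for a demuxer object

class run_call_stack {
public:
    // One node per active run() call. The node lives on the calling thread's
    // stack, so nothing is heap-allocated and nothing needs cleanup.
    class context {
    public:
        explicit context(const demuxer_like* owner)
            : owner_(owner), next_(top_) {
            top_ = this;   // push on construction
        }
        ~context() { top_ = next_; }  // pop; the last pop leaves a null pointer

        context(const context&) = delete;
        context& operator=(const context&) = delete;

    private:
        friend class run_call_stack;
        const demuxer_like* owner_;
        context* next_;
    };

    // Walk the current thread's list looking for the given object.
    static bool contains(const demuxer_like* d) {
        for (context* c = top_; c != nullptr; c = c->next_)
            if (c->owner_ == d) return true;
        return false;
    }

private:
    // Assumption: thread_local substitutes for the TSS slot.
    static thread_local context* top_;
};

thread_local run_call_stack::context* run_call_stack::top_ = nullptr;

// Stand-in for a run() call: registers itself, then asks the question a
// handler might ask while it is being dispatched.
bool run_and_check(const demuxer_like& running, const demuxer_like& queried) {
    run_call_stack::context ctx(&running);
    return run_call_stack::contains(&queried);
}
```

Because each node is an automatic variable, unwinding the thread's stack (normally or through an exception) restores the pointer without any ownership or cleanup semantics in the thread-specific slot itself.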

Christopher Kohlhoff <chris <at> kohlhoff.com> writes:
Using gethostbyname on Windows, for example, also resolves NetBIOS names, and asio should provide identical behaviour to the platform's own host resolution functions.
Please take note that the gethostbyname function has been deprecated by the introduction of the getaddrinfo function on the Windows platform! I have not checked into your code, but what is Asio's default mechanism for resolving hostnames - will it spawn and asynchronously use a new thread or will it resolve the hostname synchronously in the main thread?
/Tompa

On 1/13/06, Tompa <tompa1969@yahoo.com> wrote:
Christopher Kohlhoff <chris <at> kohlhoff.com> writes:
Using gethostbyname on Windows, for example, also resolves NetBIOS names, and asio should provide identical behaviour to the platform's own host resolution functions.
Please take note that the gethostbyname function has been deprecated by the introduction of the getaddrinfo function on the Windows platform!
I believe getaddrinfo is the preferred method on all IPv6-capable platforms. The code in the boost-review version of asio is using gethostby{name|addr} but I think this part of the code may be re-worked in light of various threads I've seen on this list.
I have not checked into your code, but what is Asio's default mechanism for resolving hostnames - will it spawn and asynchronously use a new thread or will it resolve the hostname synchronously in the main thread?
The asio host_resolver can be used either synchronously or asynchronously. In the latter case, resolver calls are handled by a background thread. -- Caleb Epstein caleb dot epstein at gmail dot com
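The background-thread approach can be sketched with the standard library and POSIX getaddrinfo. This is not asio's host_resolver; resolve_blocking and resolve_in_background are hypothetical helpers, and std::async is used here as the simplest way to push a blocking lookup onto another thread.

```cpp
#include <future>
#include <string>
#include <utility>
#include <vector>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Blocking lookup: returns the numeric addresses getaddrinfo reports for
// host, or an empty vector if the lookup fails.
std::vector<std::string> resolve_blocking(const std::string& host) {
    std::vector<std::string> addresses;
    addrinfo hints = {};
    hints.ai_family = AF_UNSPEC;      // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // one entry per address, not per socket type

    addrinfo* results = nullptr;
    if (::getaddrinfo(host.c_str(), nullptr, &hints, &results) != 0)
        return addresses;

    for (addrinfo* p = results; p != nullptr; p = p->ai_next) {
        char text[NI_MAXHOST];
        if (::getnameinfo(p->ai_addr, p->ai_addrlen, text, sizeof(text),
                          nullptr, 0, NI_NUMERICHOST) == 0)
            addresses.push_back(text);
    }
    ::freeaddrinfo(results);
    return addresses;
}

// Run the blocking lookup on a separate thread. The caller keeps going and
// collects the result from the future when it is convenient.
std::future<std::vector<std::string>> resolve_in_background(std::string host) {
    return std::async(std::launch::async, resolve_blocking, std::move(host));
}
```

A caller would hold on to the returned future and call get() (or poll with wait_for) later; the thread that issued the request is never stuck inside the resolver, which is the property the asynchronous interface is after.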

On Sat, 14 Jan 2006, Christopher Kohlhoff wrote:
--- Chris Cleeland <cleeland@ociweb.com> wrote: <snip>
- Implement the Socket_Option concept for the option.
Maybe you could show an example in the docs for this?
Yep, no problem.
<snip>
I disagree. First, I stated that KNOWING the details is important, not that one needs to manipulate the details all the way down the line. But even if I do need to manipulate details at different layers (all the way down to the wire), that need doesn't necessarily diminish the utility of a feature like asio. Why should I be prevented from using an otherwise-useful feature simply because I need to tweak things at a lower level?
If it's just knowing the details and having the ability to tweak the implementation, then what I'm planning is:
When you say "tweak the implementation", are you talking about modifying the distributed code? Or, are you talking about using language features to substitute alternative implementations for stuff normally provided by the lib?
- To provide compile-time (i.e. #define), and in the long term possibly runtime, ways to choose the implementation strategy. However these mechanisms will be outside the library's public interface, since they're tied to the implementation, and that can change between releases.
This is quite different from what I'm talking about. What this sounds like to me is that you'll provide a way for the user of the lib to select from whatever choices you, as lib implementer, decide to provide. I'm talking about opening things up such that if somebody wants to provide an alternate implementation they can code behind an interface you dictate and hook it into your lib.
- To implement DNS lookup using sockets. This might be a really good solution for some platforms, but unfortunately not for all. Using gethostbyname on Windows, for example, also resolves NetBIOS names, and asio should provide identical behaviour to the platform's own host resolution functions.
This is what I was thinking, but forgot about the platform-specific ways in which resolving might occur. I suspect something similar would also be true under OS X so that it can find listings in NetInfo, Rendezvous, etc.
Can you expand on this a little more? I'm having a hard time understanding what it means to have tss hold a pointer to something on the stack. I'll try to check out the code itself, too.
The only place TSS is used in asio is to determine whether or not the current thread is inside a call to demuxer::run() for a specific demuxer object. It works as follows:
- The TSS value holds a pointer to the top of a stack implemented as a linked list.
- When a demuxer::run() call starts it creates a linked list node on the stack, and pushes it on to the top of the stack.
- When a demuxer::run() call exits it pops the node from the stack.
How does the stack get destroyed? Who allocates and deallocates the tss slot?
See asio/detail/demuxer_run_call_stack.hpp for the implementation of this.
I will; should I fetch the version from CVS, or is the review code for that area still valid?
-- Chris Cleeland, cleeland_c @ ociweb.com, http://www.milodesigns.com/~chris Principal Software Engineer, Object Computing, Inc., +1 314 579 0066

Hi Chris, Apologies for the tardy reply... --- Chris Cleeland <cleeland@ociweb.com> wrote:
When you say "tweak the implementation", are you talking about modifying the distributed code? Or, are you talking about using language features to substitute alternative implementations for stuff normally provided by the lib?
The latter.
This is quite different from what I'm talking about. What this sounds like to me is that you'll provide a way for the user of the lib to select from whatever choices you, as lib implementer, decide to provide.
Yep, this is what I'm planning for now.
I'm talking about opening things up such that if somebody wants to provide an alternate implementation they can code behind an interface you dictate and hook it into your lib.
If it's a widely useful implementation, then I'd rather see it added to the library. In the longer term I may provide some support for substituting implementations (possibly at runtime) but it's not high on my list of priorities. <snip> [ ... TSS demuxer call stack stuff ... ]
How does the stack get destroyed?
There's nothing to be destroyed, since the stack is just a linked list through objects on the thread's stack. As the objects' destructors are called they pop themselves and update the TSS pointer to point to the new top of the stack. The last object to be destroyed sets the TSS pointer to 0.
Who allocates and deallocates the tss slot?
It's a static class member.
See asio/detail/demuxer_run_call_stack.hpp for the implementation of this.
I will; should I fetch the version from CVS, or is the review code for that area still valid?
The review version is fine. The class names have since changed in CVS, but the implementation is the same. Cheers, Chris

On 1/10/06, Chris Cleeland <cleeland@ociweb.com> wrote:
2. unnecessary features
Is hostname resolution really so much of an overhead that we need asynch hostname resolution?
Yes, it can be. In any application I've ever seen that does potentially large numbers of Internet hostname lookups (e.g. web browsers, spiders, and the like), hostname resolution is delegated to one or more background threads. There is simply no predicting how long a query might or might not take.
If the feature were basically "free", I would say okay, but the fact that it requires firing up an extra thread behind the scenes (another one of my pet peeves) makes it seem like a wholly unnecessary feature.
I think that very few, if any, platforms ship with an asynchronous resolver API at present (OSX 10.4 maybe?) but I'm sure Chris will make every effort to make use of these when they come into being. -- Caleb Epstein caleb dot epstein at gmail dot com

I think that very few, if any, platforms ship with an asynchronous resolver API at present (OSX 10.4 maybe?) but I'm sure Chris will make every effort to make use of these when they come into being.
Most recent linux distros come with getaddrinfo_a (available in glibc since version 2.2.4). Unfortunately, it looks like it is not documented in the man pages (at least not in the latest slack release). A description is available here: http://people.redhat.com/~drepper/asynchnl.pdf I think it is currently implemented using a thread pool.

Chris Cleeland <cleeland@ociweb.com>:
2. unnecessary features
Is hostname resolution really so much of an overhead that we need asynch hostname resolution?
Since I've been working on a GUI program with an "unfixable" bug about a hang of 2 to 5 minutes, when the program disappeared into gethostbyname(), I'd tend to say yes...