
----- Original Message -----
From: "Maxim Yegorushkin" <e-maxim@yandex.ru>
To: <boost@lists.boost.org>
Sent: Monday, June 13, 2005 11:59 PM
Subject: Re: [boost] [Ann] socketstream library 0.7

[snip]
... And too slow. You have one data copy from the kernel into the streambuf, and another one from the streambuf to the message object. The same goes for output: message -> streambuf -> socket. This is unacceptable, at least for [snip] 30% of user time was spent in guess where? In zeroing out memory in std::vector<char>::resize(). And you are talking about data copying here...
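[To make that cost concrete: resize() value-initializes the new bytes just before read() overwrites them. A minimal sketch of the pattern that triggers it, and one common way to keep it off the hot path; the function names and the plain POSIX read() call are my own illustration, not code from the socketstream library:]

#include <sys/types.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Costly pattern: every resize() zeroes the new bytes, only for read()
// to overwrite them immediately afterwards. (Error handling omitted.)
std::vector<char> read_message_zeroing(int fd, std::size_t len)
{
    std::vector<char> buf;
    buf.resize(len);                     // value-initializes len bytes to 0
    ::read(fd, &buf[0], buf.size());     // then overwrites them
    return buf;
}

// One workaround: keep a buffer per connection, size it once to the largest
// expected message, and track the useful length separately, so resize() is
// never called per message.
struct recv_buffer
{
    std::vector<char> storage;
    std::size_t used;

    explicit recv_buffer(std::size_t max_len) : storage(max_len), used(0) {}

    ssize_t fill(int fd, std::size_t len)    // precondition: len <= storage.size()
    {
        ssize_t n = ::read(fd, &storage[0], len);
        used = n > 0 ? static_cast<std::size_t>(n) : 0;
        return n;
    }
};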
Given that the protocol of your application has a built-in way of announcing the length of the payload, your requirement is met by the streambuf::sgetn(char_type*, streamsize) method, backed by a properly specialized implementation of the protected virtual xsgetn method. [snip]
So you get operator semantics for free. :-) And perhaps even a putback area, if there's one provided by this particular streambuf implementation.
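[For readers who have not written one: the shape of that specialization is roughly as follows. This is purely a sketch with invented names, not the socketstream library's code; it only shows how an overridden xsgetn() can read straight from the socket into the caller's buffer, so that sgetn() on a known payload length avoids an extra copy through the streambuf's internal array.]

#include <streambuf>
#include <sys/types.h>
#include <unistd.h>

class socket_streambuf : public std::streambuf
{
public:
    explicit socket_streambuf(int fd) : fd_(fd) {}

protected:
    virtual std::streamsize xsgetn(char* s, std::streamsize n)
    {
        std::streamsize total = 0;
        while (total < n)
        {
            ssize_t got = ::read(fd_, s + total,
                                 static_cast<std::size_t>(n - total));
            if (got <= 0)        // EOF or error: return what we have
                break;
            total += got;
        }
        return total;
    }

private:
    int fd_;
};

// Usage: after parsing a length header, drain exactly that many bytes.
//     socket_streambuf sb(fd);
//     std::vector<char> payload(len);
//     std::streamsize got = sb.sgetn(&payload[0], len);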
Sounds interesting, but I don't see how this can work well with nonblocking sockets. You have to store how many bytes have already been read/sent somewhere. [snip]
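[That bookkeeping can be as small as an offset kept next to the buffer. A sketch, with invented names, of what "storing how many bytes have been read" looks like in practice for a nonblocking socket:]

#include <sys/types.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>
#include <vector>

// The reader remembers how much of the current message has arrived and
// resumes from that offset on the next readiness notification.
struct partial_read
{
    std::vector<char> buf;   // sized to the announced payload length
    std::size_t done;        // bytes received so far

    explicit partial_read(std::size_t len) : buf(len), done(0) {}

    // Returns true once the whole message is in buf; call again when the
    // socket becomes readable if it returned false.
    bool resume(int fd)
    {
        while (done < buf.size())
        {
            ssize_t got = ::read(fd, &buf[0] + done, buf.size() - done);
            if (got > 0)
                done += static_cast<std::size_t>(got);
            else if (got < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
                return false;            // no more data yet; state is kept
            else
                return false;            // EOF/error handling omitted
        }
        return true;
    }
};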
There is really interesting material here. There is also other stuff that I feel obliged to comment on :-)

1. Some of the contortions suggested to effectively read messages off an iostream socket are not specific to the fundamental network/socket goals, i.e. they are just difficulties associated with iostream-ing.

2. Some of those same contortions are in response to the different media (not sure if that's the best term) that the stream is running over, i.e. a network transport. This is more ammunition for anyone trying to shoot the sync-iostream-model-over-sockets down, or at least to suggest that the model is a constraint on those writing iostream-based network apps.

3. Lastly, some of the observations (while excellent) seem a bit "macro", when a wider view might lead to different thinking. What I am trying to suggest here is that the time spent in vector<>::resize is truly surprising, but it's also very low-level. Having been through 4 recent refactorings of a network framework, I have been surprised at the gains made in other areas by conceding, say, byte-level processing in another area.

To make more of a case around the last point, consider the packetizing, parsing and copying that has recently been discussed. This has been related to the successful recognition of a message on the stream. Is it acknowledged that a message is an arbitrarily complex data object? By the time an application is making use of the "fields" within a message, that's probably a reasonable assumption. So at some point these fields must be "broken out" of the message. Or parsed. Or marshalled. Or serialized. Is the low-level packet (with the length header and body) being scanned again? What copying is being done? This seems like multi-pass to me.

To get to the point: I am currently reading blocks off network connections and presenting them to byte-by-byte lexer/parser routines. These form the structured network messages directly, i.e. the fields are already plucked out (see the sketch below).

So which is better: direct byte-by-byte conversion to a structured network message, or multi-pass?

Cheers.
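P.S. To make "byte-by-byte conversion to a structured network message" concrete, here is a rough sketch of the style I mean. The message layout ('|'-separated fields terminated by '\n') and all names are invented for illustration; the real routines are more involved. The point is the shape of the code: whatever blocks come off the socket are fed straight to consume(), which builds the structured message field by field, so there is no second pass over a raw packet buffer.

#include <cstddef>
#include <string>
#include <vector>

struct message
{
    std::vector<std::string> fields;
};

class message_lexer
{
public:
    // Feed a block; returns the number of complete messages now available.
    std::size_t consume(const char* data, std::size_t len)
    {
        for (std::size_t i = 0; i != len; ++i)
        {
            char c = data[i];
            if (c == '|')                  // field terminator
            {
                current_.fields.push_back(field_);
                field_.clear();
            }
            else if (c == '\n')            // message terminator
            {
                current_.fields.push_back(field_);
                field_.clear();
                complete_.push_back(current_);
                current_ = message();
            }
            else
            {
                field_ += c;               // accumulate field bytes
            }
        }
        return complete_.size();
    }

    std::vector<message> complete_;        // structured messages, ready to use

private:
    message current_;
    std::string field_;
};

// Usage sketch: blocks are handed over exactly as read() returns them; the
// lexer keeps its own state across blocks, so partial messages are fine.
//     char block[4096];
//     ssize_t n = ::read(fd, block, sizeof block);
//     if (n > 0) lexer.consume(block, static_cast<std::size_t>(n));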