Re: [boost] asio networking proposal

19 Aug 2005

      From: Thore Karlsen <sid@6581.com>
...
On Fri, 19 Aug 2005 09:08:59 -0400, Rob Stewart <stewart@sig.com> wrote:
...
The performance problems of requiring vector<char> or char[N] exist on
several levels:
- For vector<char>, there is the initialisation of the chars to 0 on
construction or when you do a resize. Note that this is proportional to
the size of the vector, not necessarily to the amount of data
transferred. I have seen this have a noticeable cost in a CPU-bound
server handling thousands of connections.
...
Don't construct a vector of a given size or use resize(), then.
Rely on reserve() instead.
Then size() will return the wrong value, and relying on capacity() is
No, size() will correctly indicate that there are no objects in
the vector.
...
not a good idea. If you're thinking about pushing data onto the vector
Why is relying on capacity() a bad idea?
...
as it's read, that's also bad, because then you'd have to read into a
temporary buffer first and copy it to the vector after. (Or do multiple
resizes.)
At some point, the vector has to have sufficient elements in it
so that you can copy new values onto them.  Otherwise, the vector
won't know it has real elements.  Yes, that means you do pay for
the initialization at some point and, yes, that isn't desirable.
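
To make the reserve()/resize() distinction concrete, here is a minimal
sketch. fake_read() is a hypothetical stand-in for a socket read (it is
not an asio call), and the Sizes struct exists only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a socket read: copies up to `max` bytes into
// `dst` and returns the number of bytes "received".
std::size_t fake_read(char* dst, std::size_t max) {
    static const char payload[] = "hello";
    const std::size_t len = sizeof(payload) - 1;  // exclude the trailing '\0'
    const std::size_t n = len < max ? len : max;
    std::memcpy(dst, payload, n);
    return n;
}

struct Sizes {
    std::size_t size;
    std::size_t capacity;
};

// reserve() allocates storage but creates no elements, so size() stays 0.
Sizes after_reserve(std::size_t n) {
    std::vector<char> v;
    v.reserve(n);
    return {v.size(), v.capacity()};
}

// resize() creates the elements (zero-filling all `capacity` chars), which
// makes [data(), data() + size()) a valid target for the read. The second
// resize() trims the vector to the bytes that actually arrived.
std::vector<char> read_with_resize(std::size_t capacity) {
    std::vector<char> buf;
    buf.resize(capacity);
    const std::size_t n = fake_read(buf.data(), buf.size());
    buf.resize(n);
    return buf;
}
```

The zero-fill in the first resize() is proportional to the capacity
requested, not to the bytes received, which is the cost described above.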
...
I think it's a very bad idea to require vector<char> or a static array.
Christopher does a good job of explaining the drawbacks, and I agree
with him. I also do high performance asynchronous networking in my
server and client applications, and a library requiring vector<char> or
a static array would be completely useless to me. Most of the time I
don't have the data I want to send in a vector or in a static array, and
most of the time the amount of data is too big to send or receive a
whole buffer at a time.
I understand.
...
- Requiring a copy from a native data structure into vector<char> or
char[N]. If I have an array of doubles, say, I should be able to send
it as-is to a peer that has identical architecture and compiler.
Avoiding unnecessary data copying is a vital part of implementing high
performance protocols.
...
Agreed.  OTOH, *if* a user used a vector<double> instead of the
array you mention, then with swap() vector won't add overhead.
Why would a swap be necessary?
If asio used a vector, the user could swap its contents into his
own vector.  (That's not the current interface, but then neither
is vector in the current interface.  It was just informational
for the discussion.)
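
A small sketch of why swap() adds no copy, assuming the default
allocator. The names library_buf and user_buf are hypothetical; nothing
here reflects asio's actual interface.

```cpp
#include <cassert>
#include <vector>

// Pretend the library filled `library_buf`, then hand the data to the
// caller with swap(). Returns true if the caller's vector ends up owning
// the very same storage, i.e. no element was copied.
bool swap_transfers_storage() {
    std::vector<double> library_buf{1.0, 2.0, 3.0};
    std::vector<double> user_buf;

    const double* before = library_buf.data();
    user_buf.swap(library_buf);  // constant time: exchanges internal pointers

    return user_buf.data() == before
        && user_buf.size() == 3
        && library_buf.empty();
}
```

Pointers to the elements stay valid across swap() and now refer to the
elements in the other vector, which is why the address comparison holds.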
...
Instead, how about a std::vector-like class that takes a
user-defined, fixed-size block of memory?
No, that would still require a copy if the data isn't already in such a
buffer. void * (or unsigned char *, or char *, or whatever) HAS to be
there, otherwise the library is useless. Such a class could be an option
(and I would like to see it as an option), but not a requirement.
I don't think you understand what I'm suggesting.  Notice that I
used the word "takes."  Furthermore, I think you snipped the
details about how that class would use the buffer handed to it.

The data has to be in some memory somewhere.  The class I'm
suggesting can be told to use that memory and can be told how
much memory is available.  Then, whenever asio needs to read data
into the caller's buffer, or write data from the caller's buffer,
it is taken from that preexisting buffer.
...
In my applications I can't afford to copy data from my internal buffers
to whatever the networking library requires. I also can't put the data
in such buffers to begin with.
So wrap your internal buffers with the class I'm suggesting.  No
copy is needed.  The class simply provides a standard interface,
complete with push_back(), iterators, random access, etc., and
prevents buffer overruns.
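
A rough sketch of the kind of class I mean. The name fixed_buffer and
every member shown are hypothetical; this is not an existing Boost or
asio type, only an illustration of wrapping memory the caller already
owns.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical vector-like view over caller-owned memory. It never
// allocates or copies the underlying storage; it only tracks how many of
// the `capacity` slots are in use.
template <typename T>
class fixed_buffer {
public:
    fixed_buffer(T* storage, std::size_t capacity, std::size_t size = 0)
        : data_(storage), capacity_(capacity), size_(size) {
        if (size > capacity)
            throw std::length_error("fixed_buffer: size exceeds capacity");
    }

    // Appends in place; refuses to write past the end of the block.
    void push_back(const T& value) {
        if (size_ == capacity_)
            throw std::length_error("fixed_buffer: full");
        data_[size_++] = value;
    }

    // Checked random access.
    T& at(std::size_t i) {
        if (i >= size_)
            throw std::out_of_range("fixed_buffer: index out of range");
        return data_[i];
    }

    // Unchecked random access, as with std::vector.
    T& operator[](std::size_t i) { return data_[i]; }

    T* begin() { return data_; }
    T* end() { return data_ + size_; }
    T* data() { return data_; }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    T* data_;
    std::size_t capacity_;
    std::size_t size_;
};
```

Wrapping an existing double[4] is then just fixed_buffer<double>
buf(raw, 4); the elements live in raw itself, so no copy takes place.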

-- 
Rob Stewart                           stewart@sig.com
Software Engineer                     http://www.sig.com
Susquehanna International Group, LLP  using std::disclaimer;