
On Tue, 14 Jun 2005 08:00:26 +0400, Scott Woods <scottw@qbik.com> wrote: []
> 3. Lastly, some of the observations (while excellent) seem a bit "macro" when a wider view might lead to different thinking. What I am trying to suggest here is that the time spent in vector<>::resize is truly surprising
So it was for me.
> but it's also very low-level. Having been through 4 recent refactorings of a network framework, I have been surprised at the gains made in other areas by conceding, say, byte-level processing in one area.
I'm currently working on a network framework. The three major performance improvements over several iterations were: a) dropping textuality; b) dropping a C++ glue layer built over libevent, so I'm now using libevent directly - this _is_ the framework for me; c) dropping std::vector as a message buffer.
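For (c), the read path now looks roughly like the following. This is only a sketch under assumptions: the 4-byte length prefix in front of every message and the on_read/handle_message names are illustrative, not the framework's actual code.

#include <event.h>       // libevent 1.x
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

static void handle_message(const unsigned char* body, size_t len)
{
    // application-specific processing of one complete message body
}

static void on_read(struct bufferevent* bev, void* /*ctx*/)
{
    struct evbuffer* in = EVBUFFER_INPUT(bev);

    for (;;) {
        if (EVBUFFER_LENGTH(in) < 4)
            return;                                  // length prefix not here yet

        uint32_t net_len;
        memcpy(&net_len, EVBUFFER_DATA(in), 4);      // peek, don't consume
        size_t body_len = ntohl(net_len);

        if (EVBUFFER_LENGTH(in) < 4 + body_len)
            return;                                  // body not complete yet

        // libevent 1.x keeps the evbuffer contiguous, so the message can be
        // handed to the parser in place - no copy into a std::vector<char>...
        handle_message(EVBUFFER_DATA(in) + 4, body_len);

        evbuffer_drain(in, 4 + body_len);            // ...and then consumed
    }
}
// on_read would be installed via bufferevent_new() and enabled with
// bufferevent_enable(bev, EV_READ).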
> To make more of a case around the last point, consider the packetizing, parsing and copying that's recently been discussed. This has been related to the successful recognition of a message on the stream.
> Is it acknowledged that a message is an arbitrarily complex data object?
It is.
> By the time an application is making use of the "fields" within a message, that's probably a reasonable assumption. So at some point these fields must be "broken out" of the message.
A point to note here is that there may be checkpoints on a message path where a message must be read only in order to be forwarded. At such points one wants to avoid whole-message parsing.
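For illustration, such a checkpoint can look roughly like the sketch below. The 4-byte length prefix, the already-known out_fd next hop and the relay_one_frame name are assumptions made up here, and short reads of the prefix are not handled.

#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>

// Relay exactly one frame from in_fd to out_fd without parsing the message:
// only the length prefix is read to know how many bytes to pass through.
// Returns false on error or EOF.
bool relay_one_frame(int in_fd, int out_fd)
{
    uint32_t net_len;
    if (read(in_fd, &net_len, sizeof net_len) != (ssize_t)sizeof net_len)
        return false;
    if (write(out_fd, &net_len, sizeof net_len) != (ssize_t)sizeof net_len)
        return false;

    size_t remaining = ntohl(net_len);
    char chunk[4096];
    while (remaining > 0) {
        size_t want = remaining < sizeof chunk ? remaining : sizeof chunk;
        ssize_t got = read(in_fd, chunk, want);
        if (got <= 0)
            return false;
        if (write(out_fd, chunk, (size_t)got) != got)
            return false;
        remaining -= (size_t)got;
    }
    return true;
}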
> Or parsed. Or marshalled. Or serialized. Is the low-level packet (with the length header and body) being scanned again? What copying is being done? This seems like multi-pass to me.
> To get to the point: I am currently reading blocks off network connections and presenting them to byte-by-byte lexer/parser routines. These form the structured network messages directly, i.e. the fields are already plucked out.
> So which is better? Direct byte-by-byte conversion to a structured network message, or multi-pass?
I'm not sure I understand "byte-by-byte conversion" and "multi-pass". What I did was break a message into two parts: a header and a body. The header contains the message type and a stack of asynchronous completion tokens; the body contains application-protocol-specific data. A message is read into a chunk of memory (which used to be that vector<char>) and only the header part is parsed. When a message is forwarded, only the header part is rebuilt; the body is forwarded without any user-space copying. Only at the final destination does an application parse the message body.
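In code the layout is roughly the following. This is a simplified sketch, not the real thing: the on-the-wire header encoding (2-byte type, 2-byte token count, 4-byte tokens), the Header/Message/parse_header/forward names and the use of writev() are made up for illustration, and byte-order handling is omitted.

#include <stdint.h>
#include <string.h>
#include <sys/uio.h>
#include <string>
#include <vector>

struct Header {
    uint16_t type;                       // message type
    std::vector<uint32_t> act_stack;     // asynchronous completion tokens
};

struct Message {
    Header header;                       // parsed at every hop
    const char* body;                    // points into the received chunk; opaque here
    size_t body_len;
};

// Parse only the header out of a received chunk; the body is referenced
// in place, not copied.
bool parse_header(const char* chunk, size_t len, Message& out)
{
    if (len < 4) return false;
    uint16_t type, depth;
    memcpy(&type, chunk, 2);
    memcpy(&depth, chunk + 2, 2);
    size_t header_size = 4 + size_t(depth) * 4;
    if (len < header_size) return false;

    out.header.type = type;
    out.header.act_stack.resize(depth);
    if (depth)
        memcpy(&out.header.act_stack[0], chunk + 4, size_t(depth) * 4);
    out.body = chunk + header_size;
    out.body_len = len - header_size;
    return true;
}

// Rebuild only the header (here, after pushing a new token) and send header
// and body as two iovecs, so the body goes out from where it already sits.
bool forward(int fd, const Message& m, uint32_t new_act)
{
    std::vector<uint32_t> acts = m.header.act_stack;
    acts.push_back(new_act);
    uint16_t depth = (uint16_t)acts.size();

    std::string hdr;
    hdr.append(reinterpret_cast<const char*>(&m.header.type), 2);
    hdr.append(reinterpret_cast<const char*>(&depth), 2);
    hdr.append(reinterpret_cast<const char*>(&acts[0]), size_t(depth) * 4);

    iovec iov[2];
    iov[0].iov_base = const_cast<char*>(hdr.data());
    iov[0].iov_len  = hdr.size();
    iov[1].iov_base = const_cast<char*>(m.body);   // body handed over as-is
    iov[1].iov_len  = m.body_len;
    return writev(fd, iov, 2) == (ssize_t)(hdr.size() + m.body_len);
}

--
Maxim Yegorushkin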