On 26/06/2017 00:54, Vinnie Falco via Boost wrote:
> On Sun, Jun 25, 2017 at 4:42 PM, Niall Douglas via Boost <boost@lists.boost.org> wrote:
> Rarely do I find the need to quote myself; this is one such occasion:
>
>>> Using gsl::span<> effectively type-erases the buffer sequence, which is inefficient, since it must be copied to a vector equivalent in order to be used.
>>
>> Think gsl::span<gsl::span<char>>.
>
> gsl::span<gsl::span<char>> effectively requires callers to set aside storage for an array of std::pair<void const*, std::size_t>. The Asio equivalent is std::vector<boost::asio::const_buffer>, which, as you can see, is inefficient, as std::vector is not cheap to copy.
>
> Asio's ConstBufferSequence concept of course includes gsl::span<gsl::span<char>> and std::vector<boost::asio::const_buffer> as models, but it also includes the types I provided, which cannot be modeled using gsl::span without intermediate storage. If you feel this is an acceptable trade-off for your library, I am quite happy for you, but I'll stick with the use of templates and the ConstBufferSequence concept, since that is what is on track for standardization.
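To make the trade-off concrete, here is a rough sketch of the two interface styles. The `const_buffer` below is a simplified stand-in (Asio's real `boost::asio::const_buffer` exposes `data()` and `size()` member functions), and `total_size` is a toy operation standing in for i/o initiation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for a buffer descriptor; NOT Asio's actual class.
struct const_buffer {
    const void* data;
    std::size_t size;
};

// Type-erased style: every caller must first materialise a contiguous
// sequence of const_buffer (e.g. a std::vector), even if its native
// buffer sequence is some other type - that copy is the cost at issue.
std::size_t total_size_erased(const std::vector<const_buffer>& bufs) {
    std::size_t n = 0;
    for (const auto& b : bufs) n += b.size;
    return n;
}

// ConstBufferSequence-style template: any type that can be iterated
// over buffer-like elements works directly, with no intermediate
// storage - a stack array, a std::array, a std::vector, etc.
template <class ConstBufferSequence>
std::size_t total_size(const ConstBufferSequence& bufs) {
    std::size_t n = 0;
    for (const auto& b : bufs) n += b.size;
    return n;
}
```

A caller holding its buffers in a plain stack array can pass them to `total_size()` directly, whereas `total_size_erased()` forces it to build a vector first.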
AFIO v2 doesn't allocate memory except exactly once per async i/o initiated. So the scatter-gather buffer list is given to the OS immediately, and therefore no copies of that list are needed, as the OS takes them straight away. I should imagine most users would therefore build scatter-gather lists on the stack, as they'll be thrown away immediately; indeed, I personally usually feed it curly-braced initializer_lists, as that is the least typing. In Beast's case, I can see that scatter-gather buffers may need to be kept for longer. Except ...
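For illustration, a minimal sketch of the stack-built scatter-gather list. `gather()` here is a hypothetical stand-in for an i/o initiation call that consumes the list immediately, in the style described above; it is not AFIO's actual API:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>

// Simplified buffer descriptor, as before.
struct const_buffer {
    const void* data;
    std::size_t size;
};

// Hypothetical stand-in for an i/o initiation function that consumes
// its scatter-gather list immediately: because nothing retains the
// list after the call, a braced initializer_list whose storage lives
// entirely on the caller's stack suffices - no heap allocation.
std::string gather(std::initializer_list<const_buffer> bufs) {
    std::string out;
    for (const auto& b : bufs)
        out.append(static_cast<const char*>(b.data), b.size);
    return out;
}
```

At the call site the whole list is a temporary: `gather({{"GET ", 4}, {"/", 1}})` - the least typing, as noted above.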
>> I'd have thought that as the HTTP comes in - in chunks - your code would work with discontiguous storage.
>
> I thought that too, and built quite a bit on top of a parser that worked in chunks, until I used a profiler and did some comparisons to other implementations...
>
>> Are you memory copying instead?
>
> ...when I discovered that it is actually much, much faster to memcpy discontiguous buffers into a new, allocated buffer, so that the parser can assume the entire structured portion is in a single "span" (to use the terminology you prefer). And of course, if the caller already has the data in a single buffer, then there is no need to memcpy. The class beast::flat_buffer is provided for that purpose.
... you're using memcpy to append incoming chunks of TCP data anyway, obviating the need for scatter-gather.

Now, I personally find it quite surprising that you found memory copying to be faster than a high-quality discontiguous-storage iterator implementation with ring-buffered, page-aligned, DMA-friendly TCP i/o. Everyone who has ever claimed that to me before had, when I examined their code, been using a naive RandomAccessIterator implementation which was doing two pointer indirections per access, which is obviously not going to be quick. What they should have done was implement BidirectionalIterator only, and have it cache when it next needs to do a lookup into the next discontiguous section, thus becoming very fast indeed, because we're neither thrashing the CPU caches nor stalling out-of-order execution. This is the kind of thing you need if you want to get a single CPU core feeding a >= 10 Gbps NIC.

The question then becomes: how expensive is losing random access in your parser implementation? Cloning these BidirectionalIterators is extremely quick (three pointers each), plus parsing HTTP tends to work in offsets anyway. In any XML parser I've ever written, I keep a stack of where I am in the tag hierarchy as I run forwards - so I would reckon a ForwardIterator would actually be enough in practice. But I'll grant you that HTML is not XML; tag soup is much harder to parse. So you may be right.

Nevertheless, I think you are going to have to back up your claim that memory copying all incoming data is faster, rather than its being an artefact of bad implementation techniques with discontiguous storage, as surely I am not alone here in being surprised at such a claim.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
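As an illustration of the caching technique described above - a sketch under assumptions, not anyone's shipping implementation, and showing only the forward half (decrement is symmetric) - the iterator caches the current position and the end of the current segment, so stepping within a segment is one pointer increment plus one compare; the second indirection into the segment table, which makes naive implementations slow, happens only when a boundary is crossed:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One contiguous piece of an otherwise discontiguous buffer sequence.
// Segments are assumed non-empty for brevity.
struct segment {
    const char* data;
    std::size_t size;
};

class segmented_iterator {
    const std::vector<segment>* segs_ = nullptr;
    std::size_t idx_ = 0;        // index of the current segment
    const char* cur_ = nullptr;  // cached position within that segment
    const char* end_ = nullptr;  // cached end of that segment
public:
    segmented_iterator() = default;
    explicit segmented_iterator(const std::vector<segment>& segs)
        : segs_(&segs) {
        if (!segs.empty()) {
            cur_ = segs[0].data;
            end_ = cur_ + segs[0].size;
        }
    }
    bool at_end() const { return !segs_ || idx_ >= segs_->size(); }
    char operator*() const { return *cur_; }
    segmented_iterator& operator++() {
        if (++cur_ == end_) {                 // cold path: boundary
            if (++idx_ < segs_->size()) {     // look up the next segment
                cur_ = (*segs_)[idx_].data;
                end_ = cur_ + (*segs_)[idx_].size;
            }
        }
        return *this;
    }
};
```

Copying such an iterator is a small fixed-size copy (a table pointer, an index, and two cached pointers), which is what makes cloning them cheap.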