On Mon, Feb 6, 2023 at 11:40 PM Andrzej Krzemienski wrote:
Let me offer some further thoughts.
Yes, these are very good :)
1. It is more of a question. These buffers::sink and buffers::source types being mentioned in this thread: are they already present in one of the Boost libraries (Asio, Beast)?
These are new. These types were originally motivated by the need to type-erase certain HTTP bodies during parsing or serialization in the new Http.Proto library (https://github.com/CPPAlliance/http_proto). That library implements HTTP/1 and makes some different design choices which aim to fix inherent defects in Beast's design.
The only thing that multiple libraries (Boost and non-Boost) would benefit from is the buffer interface (buffers::sink, buffers::source). No buffer implementations, no buffer algorithms.
The situation in Beast is that it is actually several libraries in one:

1. "sans-IO" HTTP/1
2. HTTP/1 on Asio
3. "sans-IO" Websocket
4. Websocket on Asio
5. Buffer algorithms and containers
6. Asio utilities
7. A C++ port of ZLib (!)

The "core" directory contains the files for items 5 and 6 above: https://github.com/boostorg/beast/tree/341ac7591b2b023c81de13312a80d1e824742...

In theory there is nothing wrong with aggregating these things into one library. But for practical reasons, quite frankly, it sucks. CI takes forever to turn around, the docs take longer to build because there is so much more stuff, and the whole thing gives off a bulky vibe that isn't fun to work with. One of my goals for my new generation of libraries (which are intended to replace Beast) is to design things so that they do more with less API and implementation. Less "try-hard," so to speak.

In Http.Proto I tried very hard to stay away from needing these various buffer implementations and concepts, but in the end it proved unworkable. It turns out that these buffer algorithms and containers are just so damn useful that even in a "sans-IO" (https://sans-io.readthedocs.io/) library they end up being the correct choice. Okay, fine, so I brought selected buffer-related things back into Http.Proto, but only as implementation details and private interfaces. No problem, the library stays lean. But... that didn't work out so well either, because as it turns out, using buffer sequences to define the HTTP body is very useful, and that brought me right back to where I started: the HTTP protocol library API benefits from buffer concepts, and the user benefits from having implementations of buffers on hand.
Specifically, HTTP/1 message body serialization and parsing should support three body styles for `serializer`:

1. Specify a ConstBufferSequence
2. Specify a Source
3. Write into a serializer::stream

and three body styles for `parser`:

4. Specify a DynamicBuffer
5. Specify a Sink
6. Read from a parser::stream

Boost.Buffers fulfills the API requirements for achieving 1, 2, 4, and 5 above.
3. There are a number of interfaces in the STD and Boost, where everyone can plug in their type, that have an overlap. We have the IOStream interface, we have the Boost.Serialization interface, std::format is coming, and now Boost.Buffers is being proposed. Can you make a clear distinction why Boost.Buffers is different? Why do we need another one? Are the previous ones defective (and can be superseded), or do they play a different, incompatible role?
Yes.

std::istream, std::ostream: These are actually pretty good substitutes for source and sink. They are in the standard already, they perform type-erasure, and they come with implementations (e.g. stringstream, ofstream). They could in theory work, and many types already support operator<< to std::ostream&, so these could be used generically. Good thinking, Andrzej :) But it is not without problems. They have weird error handling and signaling for end of stream, implementing your own istream or ostream can be difficult, and they weren't designed from the ground up for a buffer-oriented interface. They are biased toward character-based output, and part of their interface has to do with formatting (which is out of scope for Boost.Buffers).

Boost.Serialization: This is an entirely different thing from buffer-oriented exchange of data. Rather, it is about defining algorithms for transporting types to and from an "archive" at full fidelity. This is out of scope. Boost.Buffers could have a say in how an archive represented as zero or more contiguous spans of bytes might be transported or streamed from one API to another, but it has nothing to say (nor cares) about how the user-defined types map to or from those bytes.

std::format: Kind of the same situation as Boost.Serialization. It defines a way to convert types into ASCII text and substitute that text into a larger corpus, which is out of scope. Boost.Buffers could have a say in how the buffers produced by std::format might be transported to another API.
I suppose that buffers have much to do with buffering, that is, working with chunks of messages. But how does this work with JSON? Can you even start thinking about parsing JSON if you only have a part, broken at an arbitrary position?
As a matter of fact... you can :) Boost.JSON is unique in that it is the only JSON library which comes with a streaming parser and a streaming serializer, to allow buffer-at-a-time processing. This is essential for network programs to provide fairness. Specifically the streaming interface allows the implementor to restrict the amount of work performed when serializing or parsing JSON, and spread the computational requirements of handling large JSON texts across multiple I/O cycles so that one connection does not monopolize a thread. You can see those interfaces here: https://www.boost.org/doc/libs/1_81_0/libs/json/doc/html/json/input_output.h... https://www.boost.org/doc/libs/1_81_0/libs/json/doc/html/json/ref/boost__jso...
Also, IOStreams have a layer of buffers. What is the relation there (between IOStream buffers and the proposed Boost.Buffers)?
I have to be honest, I find the IOStreams interfaces incredibly confusing, with the buffers and the "controlled sequence" and the g pointer and the p pointer and the... well, you get the point. Maybe you could tell me what the relationship is, if any, as I am not quite sure...
4. Would Boost.Buffers satisfy my every use case for buffers?
Well, I don't know. You'd have to list the use cases :) My recipe for this library was to start with the Asio concepts, add in some implementations which end up being needed often, and add my own buffer-oriented filter, source, and sink abstract interfaces. If these are insufficient for a particular use case, I would need to study it and figure out whether it is the right fit for Buffers and how it could be satisfied.
Are buffers only about how you allocate a new chunk of memory?
No. There are roughly three areas of interest:

1. Range-like sequences of contiguous storage:
   - ConstBufferSequence, MutableBufferSequence
   - buffers::const_buffer, buffers::mutable_buffer

2. Stream-like controlled buffers:
   - DynamicBuffer
   - buffers::circular_buffer, buffers::flat_buffer, buffers::string_buffer

3. Abstract buffering interfaces:
   - buffers::filter, buffers::source, buffers::sink

In 1 the ranges are static: the buffers can't change size, nor can the range change size. The concepts are akin to `span const`. In 2 the controlled buffers can grow (via prepare/commit) and shrink (via consume); depending on the implementation this could allocate memory (or not). In 3 these interfaces define how buffers are passed from one program interface to another when operating a buffer-at-a-time processing algorithm.
Or are they about identifying a place where you can cut your message into meaningful portions?
No, but if you have already cut your message into zero or more contiguous chunks of storage, then Boost.Buffers can help you do things with it. Thanks