Re: [boost] [async-mqtt5] Review questions

20 Oct 2024

      Hi Vinnie, and thanks for your questions :)
...
On 20.10.2024., at 19:06, Vinnie Falco via Boost <boost@lists.boost.org> wrote:
On Sat, Oct 19, 2024 at 4:43 PM Ivica Siladic via Boost <
boost@lists.boost.org> wrote:
...
...
Ivica, I have some questions :) My first step was to audit the
file mqtt_client.hpp by inspecting everything from top to bottom:
1. Where are the type requirements for TlsContext explained?
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
TlsContext is indeed not a concept. For mqtt_client, it’s a fully opaque object typically required for TLS streams. For example, with boost::asio::ssl::stream, the TlsContext would be boost::asio::ssl::context. For Botan's TLS stream, it would be the Botan TLS context, and so on. The context is generally a kind of singleton that holds administrative data for streams that will be created based on that data. For instance, a collection of Certificate Authorities (CAs) is typically loaded once and then used to create TLS streams, which rely on those CAs to verify the server's certificate.

There are no interface requirements for TlsContext other than that it must be move-constructible. The mqtt_client simply passes this object to the SSL stream specializations that require access to it.
...
2. Allocating in a constructor is OK, and the documentation should
communicate this and the possibility of an exception:
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
Correct, and thanks for pointing that out.
...
3. Why bother with this overload? Users can call ctx.get_executor(), why do
it for them?
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
Literally all Asio async objects have constructors that accept an executor or execution context. I believe this is a bit of a historical relic, but we included it to maintain consistency with Asio standards.
...
4. What if `other` is performing operations? The documentation does not
explain it.
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
`other` must not be running and you are correct, this should be documented.
...
5. The destructor ~mqtt_client calls impl_->cancel(), which is a
non-trivial function which can throw. Submitting work to an executor from a
destructor is surprising:
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
This is precisely how, for example, the Asio basic_deadline_timer works. Specifically, the basic_deadline_timer object contains a member named impl_, whose type is io_object_impl<detail::deadline_timer_service>. In its destructor, impl_ calls service_->destroy(implementation_). The destroy function invokes an internal cancel, which may throw and submits the completion handler for execution via scheduler_.post_deferred_completions. The logic in mqtt_client follows this same pattern. The key point, as we understand it, is to explicitly cancel all outstanding asynchronous operations and notify any pending handlers that the object has gone out of scope. This can only be accomplished from the destructor. 

It is unusual, I agree. Most their-party Asio objects I’ve seen do not call cancel() from destructor. If you have a pending async operation on that object and you leave object to go out of scope your program will most likely crash. We wanted to be aligned here with Asio again.
...
6. This "will" look like an MQTT protocol thing. The documentation could
explain it better:
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
I found this from searching:
https://www.hivemq.com/blog/mqtt-essentials-part-9-last-will-and-testament/
The documentation for the "Will" type states, "A Will Message is an Application Message that the Broker should publish after the Network Connection is closed in cases where the Network Connection is not closed normally." The context in which a Will can be used in the MQTT protocol is indeed specific to MQTT. Therefore, the conceptual usage of the MQTT Will is likely beyond the scope of the Async.MQTT5 documentation.
...
7. The word "will" appearing three times in this declaration looks
unhealthy. The function name doesn't make its operation clear:
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
Correct, and thanks for pointing that out.
...
8. The ad-hoc format of putting a list into a string is a little weird, why
isn't this some kind of range?
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
Ruben asked the same question so I’m pasting the answer here:

In the vast majority of cases, there will be only one broker, so creating a std::vector<std::string> with a single element would introduce unnecessary overhead. If multiple brokers are involved, they are either configured in a .conf file or hard-coded. In the case of a configuration file, it's simpler to pass the entire broker string directly to the mqtt_client rather than parsing the string, splitting it into substrings, and filling a vector. If the brokers are hard-coded, then this overhead is irrelevant anyway.
...
9. I found the "TlsContext" concept documentation. It completely omits any
requirements, which means that TlsContext is not a real concept. Using
std::monostate is weird, why not void? How do you feel about using this
declaration for mqtt_client instead?
template < typename StreamType, bool enableTLS = false >
   class mqtt_client;
We use std::monostate to avoid creating separate specializations of the mqtt_client and client_service constructors that omit the (defaulted) TlsContext parameter. These constructors would require additional enable_if disambiguation, which would complicate the code. By using a default-constructible special object, the code becomes more concise. While we could have chosen any default-constructible type, we decided to go with std::monostate because, as the specifications state, "std::monostate is a unit type intended for use as a well-behaved empty alternative in std::variant."
...
10. async_subscribe requires a reference to a vector. And the call to
`async_initiate` makes a copy? Why not pass the vector by value and then
move it? This would allow specifying an initializer list at call sites to
async_publish without making an extra copy:
https://github.com/mireo/async-mqtt5/blob/0b935bd1a2ba65427d1fb491aa1133b07c...
(of course, my C++ is rusty so I could be wrong here)
Yes, async_initiate makes a copy of vector supplied as argument to async_subscribe. Actually, during async_initiate we encode complete SUBSCRIBE MQTT message with topic names embedded into it. The encoding function copies elements from supplied vector of topics directly to buffer holding encoded message.
...
11. Everything in this document should be in the official library
documentation, and a permanent part of the library. Not just for reviewers.
This information is very helpful for people who are evaluating whether or
not to adopt the library for their needs instead of other offerings:
https://www.mireo.com/cdn/Async.MQTT5.pdf
OK, thanks for pointing that out. We’ll include it into official docs.
...
12. The documentation alludes to some kind of queuing or buffering of
outgoing messages, to handle the case for recurring disconnections (which
are frequent in IoT deployments):
"While offline, it automatically buffers all the packets to send when the
connection is re-established."
Does this mean that mqtt_client objects can consume unbounded memory? What
is the limit on the queue, or where are the APIs to control queue depth?
How does the client eventually block callers when the queue is full, to
prevent resource exhaustion?
Ruben also asked this question so here’s my answer:

Resource consumption will not grow indefinitely. The client will attempt to send up to 65,535 packets, after which async_publish will immediately return an error code (pid_overrun), as it will have run out of Packet Identifiers.

If you want to impose stricter limits, you can implement a more complex solution on the user's side. For example, you could track the number of bytes (or messages) sent by incrementing a counter each time you call async_publish and decrementing it in the completion handler of async_publish. Before attempting to publish, you should check whether the number of in-flight messages has reached your limit, and if so, simply skip publishing until some messages are acknowledged.
...
13. I have the same questions as above, but regarding buffering for
incoming messages. Are they buffered without bounds? What if someone calls
async_run() without collecting any messages, will the memory consumption
grow unchecked?
Incoming messages are placed into an asio::basic_channel, with its limit set to std::numeric_limits<size_t>::max(), effectively making it unlimited. The program may crash if you subscribe to a topic and fail to consume the incoming messages. It’s important to note that the receiving side operates very differently from the publishing side, whose buffer depends solely on network bandwidth and availability. The receiving side only gets messages that have successfully traversed the network.

We set the limit to std::numeric_limits<size_t>::max() because we couldn't come up with a reasonable lower value. The channel buffers received messages because there may be lengthy operations occurring between consecutive async_receive calls, during which several messages might arrive. However, it is the user's responsibility to pop these messages in a timely manner.

I don't believe that exposing the buffer limit as a configurable parameter would make sense. It would represent something like the "maximum number of messages in the intermediate buffer stored between consecutive async_receive calls." I think it would be challenging to explain to users what this means and what value should be used.
...
I will have more soon.
Thanks
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Thanks,

Ivica Siladic