Fwd: Aedis review (Mohammad Nejati)
---------- Forwarded message ---------
From: Mohammad Nejati [Ashtum]
Does this library bring real benefit to C++ developers for real-world use cases? Yes. Redis is used extensively in server-side services, so a high-quality asynchronous C++ library is a real need for this type of application.
Do you have an application for this library? Yes, I work on backend services that use Redis as a key-value store and as a message broker (Redis streams), and I am in need of an asynchronous Redis library.
Does the API match current best practices? Did you try to use it? What problems or surprises did you encounter? Yes, I am already a user of Aedis and have used it in an application too.
Is the documentation helpful and clear? I find it very clear and well written, but it would be nicer to have more examples of real-world Redis usage. The documentation currently links to the example directory; it would be nice to add a couple of those examples to the documentation itself and explain what is going on.
There are some issues that I have found in practical usage of this library; some of them relate to the API and some have performance implications.

The connection being reconnectable seems a design flaw. The idea behind a reconnectable connection seems to be to make the user's life easier (by keeping the user's requests and retrying them on reconnect), but there are some issues:
* A network disconnect leaves the user's requests in an undefined state (they might or might not have been applied on the server), so we cannot simply retry them after a reconnection; handling a reconnect requires the user's intervention. For example, we cannot simply retry an INCRBY command, as it might increment the value twice.
* Because of the possibility of reconnecting, the handshake process has to be done by users (by sending a HELLO command as the first request), and on a reconnect it has to be repeated. This affected the implementation: it has to check each command to see whether it is a HELLO command or not.
* If a user is using async_receive there is no way to detect a disconnect; users have to use some form of async condition variable (which Asio doesn't have at the moment) to signal the disconnect event.
* On a reconnect all Redis subscriptions are dropped, and again the user has a hard time signaling the async_receive part of the code to subscribe again.
* Because of the possibility of reconnecting, each request gets a lot of boolean config parameters to handle the reconnect scenario (cancel_on_connection_lost, cancel_if_not_connected, cancel_if_unresponded, hello_with_priority), which make the API difficult to use and understand.

So, since all this effort covers only a limited use case (implicitly retrying requests that are read-only or that do not matter if executed twice), I think it would be better to let the user handle this scenario with a custom wrapper type instead of making the whole connection reconnectable.
There is a lot of rescheduling for reading a single push event (performance issue): at the moment the demultiplexing of Redis events seems to happen with the strategy in https://github.com/mzimbres/aedis/blob/master/include/aedis/detail/guarded_o... The problem is that reading each event message involves around 4 reschedulings on the executor (it might be fewer or more than 4; it is hard to be sure from the code), and that has degraded the performance of reading push events (rescheduling on Asio executors is not a cheap operation).

There is no possibility of reading push events in batches (performance issue): Redis pub/sub delivers its messages as push events, and because it is a way of broadcasting messages it is sometimes used at rates like 50,000 messages per second. With the current API there is no way to read these messages in batches, which makes the read operation very costly.

Creating an Asio timer for each request (performance issue): currently, Aedis creates an Asio timer for each request in the queue to act as an async condition variable, but it seems this could be eliminated if exec_op were converted into a polymorphic type that can be resumed externally; after this change an Asio service would also be needed, to destroy paused operations on a service shutdown.

There is no example of how we should use this library to work with Redis streams. The echo server benchmark doesn't carry any information and is misleading in my opinion; if there is a benchmark, it should be a practical usage example.
Please indicate the time & effort spent on the evaluation and give the reasons for your decision. Because I had some previous experience with this library, I think in total it was around 4 days of debugging and evaluation.
Please explicitly state that you either *accept* or *reject* the inclusion of this library into Boost. I think this library needs to see more practical usage to find its shape, both in the quality of the implementation and in the quality of the provided user interface. There are various modes in which we can use Redis (key-value storage, message broker, pub/sub), and each one can reveal shortcomings in the performance or the API. At this stage, I think it is too early for inclusion in Boost.
On Tue, 24 Jan 2023 at 12:11, Klemens Morgenstern via Boost wrote:
> The connection being reconnectable seems a design flaw:
There are many problems with connections that can't be reconnected:
* They can't easily be shared among e.g. http/websocket sessions without an additional layer. To address this, users would have to either use one Redis connection per e.g. http session, which might exhaust ports on the Redis server, or implement a connection pool, which brings a whole new world of complexity to the problem. Putting this burden on every user does not seem a wise decision.
* The performance of RESP3 is built around command pipelines. A connection that can be shared makes better use of this feature, as it can put e.g. 10k commands in a single write instead of 1000 commands in 10 writes (among other things).
* Non-reconnectable connections make it harder to write failover-safe apps. For example, say you receive a chat message over a websocket session and want to store it in Redis. What should you do if at this moment a failover is happening? Implementing retry logic in client code is inefficient and complex, as it would involve canceling and retrying: another implementation burden.
> The idea behind a reconnectable connection seems to be to make the user's life easier (by keeping the user's requests and retrying them on reconnect), but there are some issues: a network disconnect leaves the user's requests in an undefined state (they might or might not have been applied on the server), so we cannot simply retry them after a reconnection; it requires the user's intervention to handle a reconnect. For example, we cannot simply retry an INCRBY command, as it might increment the value twice.
Double execution is guaranteed not to happen by default: the implementation will cancel all requests that have been written but remained unresponded when the connection was lost. To allow resending those unresponded requests you need to mark them with request::config::cancel_if_unresponded = false (the default is true), which is useful for read-only commands (e.g. GET) and other commands like SET etc. This means the API is safe by default: you won't get double execution if you don't want to risk it.
> Because of the possibility of reconnecting, the handshake process has to be done by users (by sending a HELLO command as the first request), and on a reconnect it has to be repeated; the implementation has to check each command to see whether it is a HELLO command or not.
You are talking about the performance overhead of the line if (cmd == "HELLO"), which is unmeasurable. I don't see how this can be a concern.
> If a user is using async_receive there is no way to detect a disconnect; users have to use some form of async condition variable (which Asio doesn't have at the moment) to signal the disconnect event.
This claim does not make sense to me. What do you mean by detect? You "detect" that a disconnection occurred when async_run completes:

   co_await conn->async_run();
   conn->cancel(operation::receive); // If you want to cancel.
> On a reconnect all Redis subscriptions are dropped, and again the user has a hard time signaling the async_receive part of the code to subscribe again.
This can be easily addressed by running async_run and async_receive in parallel, so the receive operation gets cancelled, as I show in the examples:

   co_await (conn->async_run() || receiver(conn));
> Because of the possibility of reconnecting, each request gets a lot of boolean config parameters to handle the reconnect scenario (cancel_on_connection_lost, cancel_if_not_connected, cancel_if_unresponded, hello_with_priority), which make the API difficult to use and understand.
This complexity is inherent to the problem and we cannot pretend it does not exist. If I ignore it, users will have to address it themselves (wasted effort).
> There is a lot of rescheduling for reading a single push event (performance issue): at the moment the demultiplexing of Redis events seems to happen with the strategy in https://github.com/mzimbres/aedis/blob/master/include/aedis/detail/guarded_o... The problem is that reading each event message involves around 4 reschedulings on the executor (it might be fewer or more than 4; it is hard to be sure from the code), and that has degraded the performance of reading push events (rescheduling on Asio executors is not a cheap operation).
Is this backed by benchmarks? Parsing RESP3 itself requires lots of rescheduling, so I am not sure this is going to impact performance heavily. In any case, there are alternatives; implementing more than one read in a single call to async_receive is also simple.
> There is no possibility of reading push events in batches (performance issue): Redis pub/sub delivers its messages as push events, and because it is a way of broadcasting messages it is sometimes used at rates like 50,000 messages per second. With the current API there is no way to read these messages in batches, which makes the read operation very costly.
Do you have any numbers to share, so I have an idea of how costly this is?
> Creating an Asio timer for each request (performance issue): currently, Aedis creates an Asio timer for each request in the queue to act as an async condition variable, but it seems this could be eliminated if exec_op were converted into a polymorphic type that can be resumed externally; after this change an Asio service would also be needed, to destroy paused operations on a service shutdown.
The dynamic allocation is performed with the completion handler's associated allocator, so you have full control over it. If suspending an operation without a timer is possible and produces better results, please open an issue. It is, however, hard for me to imagine the timer hurting performance.
> The echo server benchmark doesn't carry any information and is misleading in my opinion; if there is a benchmark, it should be a practical usage example.
The echo server is very good for Redis benchmarks because it measures a client's ability to perform automatic pipelining and how well it handles concurrent use of a connection.

Thanks,
Marcelo
participants (2)
- Klemens Morgenstern
- Marcelo Zimbres Silva