On Tue, 10 May 2022 at 22:33, Marcelo Zimbres Silva via Boost
Hi, here is my review.
Thanks for taking your time to write your review.
1) ================================
https://anarthal.github.io/mysql/mysql/ref/boost__mysql__resultset/async_rea... The handler signature for this operation is void(boost::mysql::error_code, std::vectorboost::mysql::row). etc.
Having a std::vector in the signature of the completion handler is unusual in ASIO. In general, large objects should be passed as parameters to the initiating function.
Additionally, this operation does not support custom allocators. That wouldn't be a big problem to me if I could reuse the same memory for each request but the interface above prevents me from doing so.
It is also a pessimized storage as each row will allocate its own chunk of memory. A single block of memory would be better.
Those functions don't strive for efficiency, that is true. I think they can be
useful for people using callbacks, for example. I see two options here,
and I'm happy with both of them:
a) Change the function signature to
async_read_all(vector
2) ================================
https://anarthal.github.io/mysql/mysql/ref/boost__mysql__connection/async_qu... The handler signature for this operation is void(boost::mysql::error_code, boost::mysql::resultset<Stream>)
Here the completion handler is returning an io object i.e. boost::mysql::resultset, which is also unusual as far as I can see. Wouldn't it make sense for users to specify a different executor for the resultset? This interface is preventing that.
Please note that `connection`, `resultset` and `prepared_statement` always use the same executor as the underlying `Stream` object, as returned by `connection::next_layer`. `resultset` and `prepared_statement` are proxy I/O objects to make the operations more semantic, but end up in calls to the underlying `Stream` read and write functions. I don't think it is possible at all to have separate executors for different objects in this context.
I prefer having the completion handler return config parameters with which I can instantiate a boost::mysql::resultset myself in my desired executor.
As pointed above, I don't think this makes sense, unless I'm missing something. However, returning resultsets by lvalue reference instead of as completion handler arguments may be positive for efficiency, as it would allow more memory reuse. Resultsets hold table metadata, which is required to parse rows and lives in dynamic storage. There would be no performance gain for prepared_statatetment's, as these perform no allocations. I'm happy for this to become an acceptance condition, too, but please note that it won't allow different executors in any case. I've raised https://github.com/anarthal/mysql/issues/59 to track the issue.
3) ================================
https://github.com/anarthal/mysql/blob/13d3615464f41222052d9ad8fa0c86bcdddee...
I am not convinced the design of the value class is a good idea. The reason is that users know at compile time what types they are expecting from a given query. That means it would be possible for them to specify that before doing the query. So instead of doing this
https://github.com/anarthal/mysql/blob/13d3615464f41222052d9ad8fa0c86bcdddee...
boost::mysql::row row; while (result.async_read_one(row, additional_info, yield[ec]))
users could do
boost::mysql::row
row; while (result.async_read_one(row, additional_info, yield[ec])) avoiding unnecessary further copies. In principle it would be even possible to parse directly in a custom data structure, for example, when storing json string in the database.
I see different use cases here:
a) Parsing a row into a data structure known at compile-time. Rather than
a row
4) ================================
IMO, the examples are too limited and show variation of the same functionality. For example
- Text query, async with callbacks - Text query, async with futures - Text query, async with Boost.Coroutine coroutines - Text query, async with C++20 coroutines - Default completion tokens
Here you are showing nice Asio features. I am not meaning you should remove them but if I am not mistaken there is no example that shows
- How to keep long lasting connections open (specially for ssl).
Unless I'm missing something obvious, I don't think there is anything to do on the user side to keep connections open. Once you connect the underlying `Stream` and call `connection::handshake` (or the composed function `connection::connect`), the connection will remain open until: a) It is closed by the user (e.g. by calling `connection::close`). b) The connection remains idle by a certain amount of time, as defined by the MySQL variable https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_.... By default, this timeout is 8h. It can be changed at the database level and on a per-connection basis, by issuing a `SET SESSION` command (e.g. `conn.query("SET SESSION wait_timeout = 3600");` would set the connection timeout to 1h). Note that this is set *after* the connection is established. Unless I'm missing something, there is nothing special in handling SSL long-lasting connections - keep alives happen at the TCP level. There is no keep-alive mechanism at the MySQL protocol level. I've raised https://github.com/anarthal/mysql/issues/61 to add an example demonstrating b).
- Timers for resolve, connect, read and write.
There is an example on how to use Asio cancellation mechanism together with C++20 coroutines and Asio timers to implement timeouts for name resolution, connection establishment and queries here: https://anarthal.github.io/mysql/mysql/examples/timeouts.html Is this what you mean?
- Synchronized writes.
Not following you here, could you please elaborate on this particular use case?
- Healthy checks, failover etc.
At the protocol level, there is a COM_PING command (https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_com_ping.htm...) that allows the user to check whether the server is still alive or not. Issuing that requires additional library support. The interface for this would be something like `void connection::ping(error_code&)`. It would return an empty error code on success and an appropriate error code on failure. I guess this can be currently be work-arounded by issuing a dummy SQL query to the server as a health-check. I've raised https://github.com/anarthal/mysql/issues/62 to track the PING addition. I need to further investigate failover in replica set scenarios. I hope to do it today evening. I've raised https://github.com/anarthal/mysql/issues/63 to track this.
I estimate that to address these points users will be required to write further 500 - 1000 LOC on top of your library. Otherwise they will be stuck on underperforming short lived connections.
It would be helpful for users to have a starting point to begin with.
There some minor things I have also noticed:
5) ================================
https://github.com/anarthal/mysql/blob/13d3615464f41222052d9ad8fa0c86bcdddee...
I expect this function to be O(1).
Raised https://github.com/anarthal/mysql/issues/64 to track it.
6) ================================
https://anarthal.github.io/mysql/mysql/ref/boost__mysql__value/get_std_optio... https://anarthal.github.io/mysql/mysql/ref/boost__mysql__value/get_optional....
Sounds also unusual to have two member functions for the different versions of optional. I suggest using the boost version until we transition to C++17.
Why do you think it can be a problem? I would say having both just targets a broader audience: those using `boost::optional` and those using `std::optional`, without compromising either.
7) ================================
https://anarthal.github.io/mysql/mysql/ref/boost__mysql__errc.html
errc should be conditions and not error AFAICS.
Not following you here - errc here is just acting the same as `boost::beast::http::error` enum.
============= Review End ================
I vote for conditional acceptance. Points 1, 2, 3 and 4 should be addressed before acceptance. My acceptance criteria sums up to
1) Remove functionality that provides low performance. With coroutines it will be easy for users to loop on async_query to create a std::vector of rows.
As I mentioned above, either removing or switching to return-by-lvalue seems OK as an acceptance condition for me.
2) No IO object should be returned as a parameter of the completion handler.
I can accept this as a performance improvement, but please read the note on executors above.
3) Review the value type. Can it be improved?
Does this mean that you consider parsing into compile-time user-defined data structures (e.g. `tuple`) an acceptance condition?
4) Improve examples.
Will inform as soon as I have an answer on failover, happy to provide the other ones.
Regards, Marcelo
Regards, Ruben.
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost