
I found the mistake. ASIO is *not* at fault. You guys can disregard my
previous email. There's no bug to report in ASIO's bug tracker this
time.
On Tue, Feb 11, 2025 at 15:17, Vinícius dos Santos Oliveira wrote:
On Mon, Dec 30, 2024 at 11:10, Vinnie Falco via Boost wrote:
On Mon, Dec 30, 2024 at 2:04 AM Richard Hodges via Boost <boost@lists.boost.org> wrote:
...execution order is not part of the contract
Yes, it is:
https://www.boost.org/doc/libs/1_87_0/doc/html/boost_asio/reference/io_conte...
I think I've found a violation of these rules (which I depend on). However, I've been failing to produce a minimal test case to send a proper bug report. I've exhausted my ideas for the time being, so I've come here to ask for help/new ideas that I could attempt.
Given I've failed to produce a minimal test case, I'll have to point you guys to code which is larger.
So here I call strand.post(a): https://gitlab.com/emilua/emilua/-/blob/v0.11.0/src/actor.ypp#L1038
And here I call strand.post(b): https://gitlab.com/emilua/emilua/-/blob/v0.11.0/include/emilua/core.hpp#L120...
strand.post(a) happens before strand.post(b) (I even inserted printf() statements locally just to make sure it really does). Therefore a() should happen before b(), but that's not what I've been observing. I observed b() happening before a() on Windows and Linux (both epoll and io_uring). On FreeBSD a() always happens before b(). I don't know what ASIO does differently on FreeBSD. Sometimes on Linux I get the desired behavior as well, but almost always I get the undesired behavior. I think when the cache is hot I always get the undesired behavior. Here's the minimal test case I wrote:
#include <boost/asio.hpp>
#include <iostream>
#include <thread>
#include <memory>

namespace asio = boost::asio;

struct actor
{
    actor(asio::io_context& ioc, int nsenders)
        : work_guard{ioc.get_executor()}
        , s{ioc}
        , nsenders{nsenders}
    {}

    const asio::io_context::strand& strand()
    {
        return s;
    }

    asio::executor_work_guard<asio::io_context::executor_type> work_guard;
    asio::io_context::strand s;
    int nsenders;
};

int main()
{
    std::thread t;
    std::shared_ptr<actor> a;
    {
        auto ioc = std::make_shared<asio::io_context>();
        a = std::make_shared<actor>(*ioc, 2);
        t = std::thread{[ioc]() mutable {
            ioc->run();
            ioc.reset();
        }};
    }

    std::cout << "1\n";
    a->strand().post([a]{
        std::cout << "2\n";
        if (--a->nsenders == 0) {
            a->work_guard.reset();
        }
    }, std::allocator<void>{});

    std::cout << "a\n";
    a->strand().post([a]{
        std::cout << "b\n";
        if (--a->nsenders == 0) {
            a->work_guard.reset();
        }
    }, std::allocator<void>{});

    a.reset();
    t.join();
}
That's the same algorithm I use in Emilua, but now I cannot observe the undesired result. I've tried to insert sleep_for() in a few spots in an attempt to mimic the delays/overhead from LuaJIT, but they were not enough to reproduce the behavior I observed in Emilua. So... ideas on how I can make this minimal test case stress more code branches from ASIO?
If you want to reproduce the problem locally, you can attempt the Lua code below:
if _CONTEXT ~= 'main' then
    local inbox = require 'inbox'
    print(inbox:receive())
    return
end

local actor2 = spawn_vm{
    module = '.',

    -- comment/remove inherit_context=false to make the code work
    inherit_context = false
}

actor2:send('hello')
Just run the program with:
emilua path/to/program.lua
The desired output would be the message "hello" printed to stdout (which happens very rarely on Linux, and happens every time on FreeBSD). The undesired output would be something like:
Main fiber from VM 0x796707f86380 panicked: 'Broadcast the address before attempting to receive on it'
stack traceback:
    [string "?"]: in function 'receive'
    /home/vinipsmaker/t5.lua:3: in main chunk
    [C]: in function ''
    [string "?"]: in function <[string "?"]:0>
-- Vinícius dos Santos Oliveira https://vinipsmaker.github.io/