[gil] Broken bit-aligned pixel locator on MSVC++ 64-bit optimized builds only
Hi,
Over the last few months, intermittently I've been chasing a peculiar
bug that has
been revealing itself in failing checksum tests in GIL's
test/legacy/image.cpp [1]
since we started testing GIL with MSVC++ (VS2017, VS2019) with
b2 toolset=msvc variant=release address-model=64
[1] https://github.com/boostorg/gil/blob/2ff2d31895dae665ef3e05d0496bc082470e229...
The peculiarity of this bug is that it has not been observed using any
of these compilers:
- GCC: 4.8, 4.9, 5, 6, 7, 8, 9
- Clang: 3.9, 4.0, 5.0, 6.0, 7, 8, 9, 10
- XCode: 8.3, 9.0, 9.1, 9.2, 9.3, 9.4, 10
There is complete report [2], but I also decided to reach the list
looking for any
insights that may help me to understand what may be causing the bug.
Complete repo [3] has GitHub Actions to build and run minimal_test.cpp test.
[2] https://github.com/boostorg/gil/issues/447
[3] https://github.com/mloskot/boost-gil-checksum-bug
This minimal_test.cpp (at the bottom) leaves out the checksum calculation and
narrows the scope of the bug down to the `xy_locator` and advances of
the iterators.
The test basically is this:
1. Create 3x3 image based on bit-aligned BGR or RGB pixel (e.g. bgr121_image_t)
2. Fill image with red
3. Draw blue diagonal
```
bgr121_image_t img(3, 3);
// ... fill img with `bgr121_red` pixels
// draw blue diagonal
auto v = view(img);
auto loc = v.xy_at(0, v.height() - 1);
for (std::ptrdiff_t y = 0; y < v.height(); ++y)
{
*loc = bgr121_blue; // set blue pixel value
++loc.x(); // INCREMENT X FIRST
--loc.y();
}
```
If the loop like this is compiled using either
- `b2 toolset=msvc variant=release address-model=64
optimization=space` (/O1 /Ob2)
- `b2 toolset=msvc variant=release address-model=64
optimization=speed` (/O2 /Ob2)
then pixels of the blue diagonal are messed up.
If the loop is compiled without optimization using either
- `b2 toolset=msvc variant=debug address-model=64` (/Od /Ob2)
- `b2 toolset=msvc variant=release address-model=64 optimization=off` (/Od /Ob2)
then pixels of the blue diagonal are correct.
Now, if the `for` loop above is modified, that is, lines incrementing
the `x_iterator` and `y_itertor` are reordered:
```
for (std::ptrdiff_t y = 0; y < v.height(); ++y)
{
*loc = bgr121_blue; // set blue pixel value
--loc.y();
++loc.x(); // PRE-INCREMENT X SECOND
}
```
or
```
for (std::ptrdiff_t y = 0; y < v.height(); ++y)
{
*loc = bgr121_blue; // set blue pixel value
--loc.y();
++loc.x(); // POST-INCREMENT X SECOND
}
```
then the blue diagonal is drawn correctly regardless of the MSVC++
optimizations.
If the loop is manually unrolled, see minimal_test.cpp below, the blue
diagonal is correct too.
I'm yet to try the CE to see compare the generated code, as far as my
limited assembly fu goes, but currently Boost for MSVC++ on CE is
broken (https://github.com/mattgodbolt/compiler-explorer/issues/1816)
If anyone would have any insights on what may be causing the problem,
or any suggestions what else to test, I would greatly appreciate that.
Obviously, I'm getting suspicious about the MSVC++ optimizer, but I do know
blaming a compiler is last+1 resort :)
Thanks
Mateusz
```
// minimal_test.cpp
#include
const; using bgr121_image_t = gil::image
; using bgr121_view_t = typename bgr121_image_t::view_t; using bgr121_pixel_t = typename bgr121_view_t::value_type;
bgr121_pixel_t bgr121_red(0), bgr121_blue(0); void fill_image_red(bgr121_image_t& img) { gil::rgb8_pixel_t red8(255, 0, 0); gil::rgb8_pixel_t blue8(0, 0, 255); gil::color_convert(red8, bgr121_red); gil::color_convert(blue8, bgr121_blue); auto v = view(img); std::fill(v.begin(), v.end(), bgr121_red); for (auto it = view(img).begin().x(), end = view(img).end().x(); it != end; ++it) BOOST_TEST(*it == bgr121_red); } void test_draw_with_xy_locator_loop_fail(std::ptrdiff_t w, std::ptrdiff_t h) { bgr121_image_t img(w, h); fill_image_red(img); { auto v = view(img); auto loc = v.xy_at(0, v.height() - 1); for (std::ptrdiff_t y = 0; y < v.height(); ++y) { *loc = bgr121_blue; // OPTION 1 FAILS ++loc.x(); // PRE-INCREMENT X FIRST --loc.y(); // OPTION 2 FAILS TOO //loc.x()++; // POST-INCREMENT X FIRST //loc.y()--; // OPTION 3 FAILS TOO //auto& x_it = loc.x(); // PRE-INCREMENT NAMED X FIRST //++x_it; //auto& y_it = loc.y(); //--y_it; } } { auto it = view(img).begin().x(); // row 0 BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_blue); ++it; // row 1 BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_blue); ++it; // FAIL BOOST_TEST(*it == bgr121_red); ++it; // row 2 BOOST_TEST(*it == bgr121_blue); ++it; BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_red); } } void test_draw_with_xy_locator_loop_good(std::ptrdiff_t w, std::ptrdiff_t h) { bgr121_image_t img(w, h); fill_image_red(img); { auto v = view(img); auto loc = v.xy_at(0, v.height() - 1); for (std::ptrdiff_t y = 0; y < v.height(); ++y) { *loc = bgr121_blue; // OPTION 1 WORKS --loc.y(); ++loc.x(); // PRE-INCREMENT X SECOND // OPTION 2 WORKS TOO //loc.y()--; //loc.x()++; // POST-INCREMENT X SECOND } } { auto it = view(img).begin().x(); // row 0 BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_blue); ++it; // row 1 BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_blue); ++it; BOOST_TEST(*it == bgr121_red); ++it; // row 2 BOOST_TEST(*it == bgr121_blue); ++it; BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_red); } } void test_draw_with_xy_locator_step_good(std::ptrdiff_t w, std::ptrdiff_t h) { bgr121_image_t img(w, h); fill_image_red(img); { auto v = view(img); auto loc = v.xy_at(0, v.height() - 1); *loc = bgr121_blue; // red red blue ++loc.x(); --loc.y(); *loc = bgr121_blue; // red blue red ++loc.x(); --loc.y(); *loc = bgr121_blue; // blue red red ++loc.x(); --loc.y(); } { auto it = view(img).begin().x(); // row 0 BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_blue); ++it; // row 1 BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_blue); ++it; BOOST_TEST(*it == bgr121_red); ++it; // row 2 BOOST_TEST(*it == bgr121_blue); ++it; BOOST_TEST(*it == bgr121_red); ++it; BOOST_TEST(*it == bgr121_red); } } int main() { test_draw_with_xy_locator_loop_fail(3, 3); test_draw_with_xy_locator_loop_good(3, 3); test_draw_with_xy_locator_step_good(3, 3); return boost::report_errors(); } ``` Best regards, -- Mateusz Loskot, http://mateusz.loskot.net
On Wed, 11 Mar 2020 at 01:02, Mateusz Loskot
Over the last few months, intermittently I've been chasing a peculiar bug that has been revealing itself in failing checksum tests in GIL's test/legacy/image.cpp [1] since we started testing GIL with MSVC++ (VS2017, VS2019) with
b2 toolset=msvc variant=release address-model=64 [...] auto v = view(img); auto loc = v.xy_at(0, v.height() - 1); for (std::ptrdiff_t y = 0; y < v.height(); ++y) { *loc = bgr121_blue; // set blue pixel value ++loc.x(); // INCREMENT X FIRST --loc.y(); }
Ha! I've just noticed failures between VS 2017 and 2019 are a bit different, still for bgr121_image_t though, still for 64-bit optimised builds only: VS2017 build fails at bgr121_views_90cw test case: ====== BEGIN OUTPUT ===== ... checking checksum for bgr121_basic_red_x (crc=7f6e24e7) Checking checksum for bgr121_basic_white_x (crc=e4aaa1d3) Checking checksum for bgr121_views_original (crc=7e5bb87d) Checking checksum for bgr121_views_cropped (crc=43305e15) Checking checksum for bgr121_views_gray8 (crc=bbabe219) Checking checksum for bgr121_views_my_gray8 (crc=3086d22f) Checking checksum for bgr121_views_transpose (crc=da2ff80) Checking checksum for bgr121_views_rot180 (crc=68f37202) Checksum error in bgr121_views_90cw (crc=8ffd852b != 8565efd8) Unknown exception ====== END OUTPUT = VS2019 build fails much sooner: ====== BEGIN OUTPUT ===== ... Checksum error in bgr121_basic_red_x (crc=85c86c53 != 7f6e24e7) Unknown exception ====== END OUTPUT = Clearly, there is something fishy going on near the MSVC++ optimisation. Looking at the C++ code, I'm getting very suspicious about what is going on in the decrement operator, or boost::iterator_facade implementor. I have not compared assembly for the minimal_test.cpp cases. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net
participants (1)
-
Mateusz Loskot