On Fri, Nov 29, 2019 at 1:15 AM Peter Dimov via Boost
Zach Laine wrote:
Ok, I understand your point a bit better now I think. Is it the unboundedly-large nature of a NTBS that has you concerned? That is, do you think that op+=(char) should assert and op+=(char const *) should throw?
No, I prefer all of them consistently throwing, even when the check is O(1). My primary concern is not the loss of efficiency due to double checking, it's to avoid buffer overflow exploits.
Ok, so the issue as you see it is that a program that is memory unsafe should die in release builds rather than allowing an attacker to take over, is that right? If so, I understand the motivation, but I don't share it -- I literally do not ever need to worry about such things in my work. Should everyone pay the checking cost, because this is a concern for some code (even if it is a common concern -- I'm not minimizing it more than to say that it's <100% of uses). Moreover, you mention using __builtin_trap below, which gives you a quick death instead of acting as an attack vector, but the proposed design does not. Again I ask, if we throw in this case, what do I write in catching code to remediate the situation that I need to stuff 10lbs of bits into a 5lb sack?
Yes, this will be less efficient if you += characters (but not by so much as one might think), and yes, you can write code that will be correct if op+= doesn't check. (Since operating on chars destroys optimizations due to aliasing, the inner loop improves from
.L5: add rax, 1 movzx ecx, BYTE PTR [rdx] cmp rax, 511 ja .L11 add rdx, 1 mov QWORD PTR [rdi], rax mov BYTE PTR [rdi+7+rax], cl cmp rsi, rdx jne .L5
to
.L3: movzx ecx, BYTE PTR [rdx+rax] add rax, 1 mov QWORD PTR [rdi], rax mov BYTE PTR [rdi+7+rax], cl cmp r8, rax jne .L3
which is better, but far from optimal.)
Ok, I tried this myself on Quickbench: http://quick-bench.com/WZPNHFBKI-hFxThBYRZVjndtBho I see a slowdown factor of 1.39 when throwing if a single newly appended character (the char op+= case) does not fit, vs. UB. UB is our friend here. It would be a shame if this code: std::ranges::copy(r, std::back_inserter(str)); (for some fixed_string str) were 1.39x slower, if I know that r is smaller than str's capacity, and str is empty.
I were implementing this class, I would always check in op+= (and append etc) even if the specification says "Expects" instead of "Throws", I'd just __builtin_trap instead of throwing. Otherwise, this is strcat all over again, and there's a reason there was a big concerted push against this style in/by Microsoft and elsewhere.
Zach