On 12/06/2017 22:05, Peter Dimov via Boost wrote:
Artyom Beilis wrote:
Denial of Service Attack Example:
- The user creates a file with an invalid UTF-16 name
- The system monitors the file system and adds it to the XML report in WTF-8 format
- The central server does not accept the XML since it fails UTF-8 validation
- The user does whatever he wants without monitoring
- The user removes the file
- No reports were generated during the period the user needed
- DoS attack
I can't help but note that the same attack would work under Unix. The user can easily create a file with an invalid UTF-8 name. And, since the library doesn't enforce valid UTF-8 on POSIX (right?), it would pass through.
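To make the point about invalid names concrete, here is a minimal sketch of the kind of strict UTF-8 check the central server in the example would be applying. The function name `is_valid_utf8` is hypothetical and is not part of nowide or any library discussed in this thread; it only illustrates that a POSIX file name containing a stray 0xFF byte, or a WTF-8 encoded lone surrogate, fails such a check.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a nowide API): returns true only if `s` is
// well-formed UTF-8. Rejects stray or missing continuation bytes, overlong
// encodings, encoded surrogates (U+D800..U+DFFF, which WTF-8 permits),
// and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        if (c < 0x80) {  // ASCII byte
            ++i;
            continue;
        }
        std::size_t len = 0;
        unsigned long cp = 0;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;           // stray continuation byte or 0xF8..0xFF

        if (n - i < len) return false;  // sequence truncated at end of string
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cc = p[i + k];
            if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp[len]) return false;               // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        if (cp > 0x10FFFF) return false;                  // beyond Unicode
        i += len;
    }
    return true;
}
```

With this sketch, a name such as `"bad\xFF.txt"` is rejected, as is `"\xED\xA0\x80"` (U+D800 as WTF-8 would encode it), while plain ASCII and `"\xC3\xA9"` (U+00E9) pass.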
Also, running sanitisation on UTF sequences is laudable and all, but what about the 80-90% of code which will never, ever feed strings from untrusted remote parties to system APIs? Do they pay for the UTF sequence validation too?

Like Peter and many of us, I have a personal "nowide" library too. It simply wraps the *W() Win32 APIs consuming 16-bit characters with a call to the NT kernel function for converting UTF-8 to UTF-16, which is RtlUTF8ToUnicodeN(). It does nothing else fancy; it leaves it entirely up to the Windows kernel what that conversion means.

I'd suggest that the nowide library ought to keep things similarly simple.

Niall

-- 
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/