11 Apr
2012
4:56 p.m.
I'm a happy user of boost.locale, but there is a use case that I can't see a solution for. I would like to concatenate two long, canonically normalized (NFC) UTF-8 strings. As far as I can tell, the only way to do this at present is to concatenate them and call boost::locale::normalize on the result. This is wasteful: it walks the entire string, when only a well-defined substring on each side of the boundary can possibly require modification. The ideal solution would be for boost.locale to expose something like ICU's unorm2_normalizeSecondAndAppend, which relies on the normalization guarantees in the Unicode standard to renormalize only around the boundary, where it is required. Does this capability already exist in boost.locale? Thanks.