Re: [Boost-users] Interest in a Unicode library for Boost?

2 Nov 2019


On Sat, Nov 2, 2019 at 4:12 AM Rainer Deyke via Boost-users <boost-users@lists.boost.org> wrote:
...
On 01.11.19 22:22, Zach Laine via Boost-users wrote:
...
Moreover, I don't know of a great use case for a boost::text::is_alpha().
Specifically, it seems that if you are looking for alphabetical characters,
you are usually doing something like word-breaking, for which there is
already an algorithm, doing a regex match, for which is_alpha() is
insufficient, etc.  I'm open to hearing about such use cases, of course.
Filtering text input.  Parsing programming languages or data description
languages.  Gathering statistics on a piece of text.
My own codebase has four instances of #include <cctype> and three
instances of #include <ctype.h>, but that's an artificially low number
because character classification is trivial to do by hand for ASCII and
because cctype doesn't support Unicode.  Exactly zero of these instances
can be replaced by any algorithm provided by the proposed library.  All
of them could technically be replaced by regular expressions, but only
in the sense that it is possible to (inefficiently) implement the cctype
interface in terms of regular expressions.
These use cases fall under the regex use case I mentioned.  I still think
they're more appropriately solved that way.  Have you heard of CTRE?  Hana
is working on adding Unicode support to that, including character classes
like is_alpha.
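
For readers following the thread, the trade-off under discussion can be sketched roughly as below. This is only an illustration, not code from either poster and not the API of the proposed library or of CTRE: the names ascii_is_alpha, regex_is_alpha, and count_if_char are hypothetical, and std::regex stands in for "a regex engine" in general. It is ASCII-only; Unicode classification is exactly the part neither version handles.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hand-written ASCII classification: two range checks, no locale, no tables.
// (Hypothetical helper, ASCII only.)
constexpr bool ascii_is_alpha(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// The same predicate routed through a regular expression.
// It gives the same answer for ASCII input, but builds a one-character
// string and runs a regex match for every call. (Hypothetical helper.)
bool regex_is_alpha(char c) {
    static const std::regex alpha("[A-Za-z]");
    return std::regex_match(std::string(1, c), alpha);
}

// One of the use cases named in the thread: gathering statistics on text.
// Counts the characters for which the given predicate returns true.
template <typename Pred>
std::size_t count_if_char(const std::string& text, Pred pred) {
    std::size_t n = 0;
    for (char c : text) {
        if (pred(c)) {
            ++n;
        }
    }
    return n;
}
```

Either predicate can be passed to count_if_char; the difference is the per-character cost, which is the inefficiency Rainer describes. A compile-time regex engine such as CTRE is meant to narrow that gap, though whether it closes it for single-character classification is not something this sketch shows.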

Zach