Re: [boost] [locale] Review part 1: headers

9 Apr 2011

      ...
...
I don't think neither ICU nor  Boost supports
non-32 bit platform.
I understand that  strictly speaking it is non-standard
but de-facto there are no problems  with this and this
assumption is quite safe.
Since this  assumption arbitrarily limits the library to particular 
platforms and it is  trivial to avoid relying it, I see no point in doing so.
As I told you I can fix it, but practically it does
not give any advantage.

Give me a name of modern compiler that supports C++03
and has sizeof(unsigned) < 4?
...
...
I can change this to uint32_t
...
line 131: Please be consistent  about using unsigned vs. uint32_t.
Good  point
...
line 140: shouldn't offset be  std::size_t?
   Why should the size of the string be  limited
   to 4GB?  std::size_t is guaranteed to  be
   large enough to handle anything that  can
   be stored in memory.
You are right, I'll fix this.
However, two  points:
1. I already know compilers were size_t != sizeof(void  *)
    so actually it must be uintptr_t... but it is not in  C++03
    but rather in C99. (it is in C++0x)
1) You  work with contiguous memory, so size_t is the correct one, not 
uintptr_t
size_t is used in boost as it is closest way to use something
in C++03 standard that is similar to uintptr_t.

That is way C99 and C++03 created uintptr_t.

In any case as I told I'll change this for better consistency/
...
2) uintptr_t should ideally be available from  boost/cstdint.hpp (it 
doesn't seem to be ATM, that's a bug)
3) I  don't know of compilers where those are different, mind saying 
which ones  exhibit this?
As I said before I prefer to relay on native types rather
then boost's types especially if it is feasible.

As I told above I'll change this were necessary but
it is just for "code-beauty".
...
...
2. There is no reason on planet earth you need  to
    create boundary mapping for text>=  4G!
Even entire War&  Peace book takes  about 3MB.
So if you do this you are probably  doing
    something very-very wrong.
But this is different story.
Another unnecessary arbitrary  limit.
Why would you want to prevent people from treating very large data  with 
your library?
On the Internet, there is text much bigger than  War & Peace, even if not 
necessarily in the form of  books.
As I told above I'll change it to size_t but...

You don't want to run boundary analysis on such chunks of text
in any case - and if you do it, you are doing something wrong.
...
...
...
Why do you have explicit specializations  of
boundary_indexing?  They're all the same.
The  compiler is perfectly capable of instantiating
them all from a  single definition.
Ok this is something that you'll  see all over the Boost.Locale
code when it comes to deal with facets  generations.
It is in order to support (brain damaged) DLL  platform.
Both MSVC and GCC support extern template, which is also a  C++0x feature.
Believe me, it was only feasible solution I've reached after
spending many man hours fighting DLL's, std::locale facets and
both compilers.

(AFAIR MSVC uses specialization as well for its own facets)

So after fighting both compilers and spending days on solving and tuning
this problem I can say this is what should be done.

It requires some good experience with std::locale facets, DLLs
exporting, and having specific tests.

So what I did is unfortunately required and most reliable solution.
...
...
Yes... However all supported platforms are at  least 32bits.
There is one small points. As far as I remember  I
always used native types over strict types in the interfaces
 (function calls) because if it is changed between
small releases (like  stdint.h found) it would not
break binary compatibility so native types  (as long
as they good enough) are better alternative.
(We are talking about C++ compilers that does not
handle C99 stdint  well)
Boost has a portable stdint equivalent.
I know and use it, see the description above.
...
In any case, you  shouldn't even use uint32_t, but uint_least32_t, or 
better yet, char32_t if  it is available.
No, see:

http://cppcms.sourceforge.net/boost_locale/html/appendix.html#why_no_special...

In any case I use uint32_t or uint16_t when I do relate to handling
of UTF-16 or UTF-32
...
boost/cuchar.cpp in my Unicode library shows the right  way to define the 
types Unicode needs to work  with.
Same link as above, typedef of uint16_t or uint32_t is unsuitable for 
Boost.Locale
character representation.

See your Unicode library is orthogonal to Boost.Locale, it
handles things very different and has different purpose.

Artyom