
AMDG Okay, I managed to get through the encoding, icu, and posix subdirectories today. encoding/codepage.cpp: line 28: In C++ you should use <cstring> (and std::strlen) encoding/conv.hpp: encoding/iconv_codepage.hpp I think the convention for headers like this which are really source files in disguise is to give them the extension .ipp. line 125: You don't actually care how the vector's contents are initialized, do you? You can just specify the size. line 150: I think sresult.append(out_start, out_ptr); would be slightly clearer. line 168: What are the possible errno values besides EILSEQ, EINVAL, and E2BIG? If you get an error you don't expect, is silently stopping the correct behavior? iconv_between: Inheritance from iconv_from_utf is a little odd. iconv_to_utf: The comments from iconv_from_utf apply. Also, the implementation is almost the same as iconv_from_utf. I think you could define a single function template to implement both. encoding/uconv_codepage.hpp: line 52: Do you really intend to translate any std::exception? Including std::bad_alloc? That isn't consistant with what iconv_codepage does. encoding/wconv_codepage.hpp: Missing #include <cstring>, <string> lines 86,101: The STL is your friend: size_t remove_substitutions(std::vector<wchar_t> &v) { v.erase(std::remove(v.begin(), v.end(), wchar_t(0xFFFD)), v.end()); return v.size(); } icu/all_generator.hpp: Why <vector>? You're using locale and string, but not vector. icu/boundary.cpp: icu/cdata.hpp: Missing #include <string> icu/uconv.hpp: Needs #include <string> and #include <memory>. line 8: This #include guard is excessively generic. icu/codecvt.cpp: lines 44, 48: If this throws, cvt_ needs to be closed. icu/codecvt.hpp: Missing #include <memory> The #include guard is wrong. icu/collator.cpp: line 115: Does do_basic_transform guarantee that the string contains no embedded null bytes? It seems that this is required for hash to work correctly. icu/conversion.cpp: icu/date_time.cpp: In calendar_impl: I can't figure out why you're locking in some functions and not others. For the most part, you seem to lock in non-mutating functions and not in mutating functions, so I was guessing that you are trying to guarantee that a calender can be read concurrently, but writes must be exclusive and that ICU does not guarantee that concurrent reads are okay. However, get_timezone and is_same don't lock. Does ICU handle these methods differently from the others W.R.T. thread-safety? line 236: Do you need a lock here? You're only using self which is a different object. icu/formatter.cpp: line 69: This seems like a really bad idea. It means that integers larger than 0x7FFFFFFFFFFFFFFF will be formatted incorrectly. If this is an unavoidable limitation of ICU, I think you shoud probably throw an exception and document the behavior. And don't think that people won't try to print unsigned numbers this big. I know I have. line 263: This comparison is subtly wrong for int64_t. Consider what happens when date is exactly 2^63. limits_type::max() will be 0x7FFFFFFFFFFFFFFF. According to [expr] this will be converted to double. For most rounding modes, the result will be 2^63, which is equal to date. Then, when we get to line 268 and cast back to ValueType, the result will be 0x8000000000000000, and the behavior is undefined according to [conv.fpint]. (Note: It's obviously not easy to trigger this, but I'm guessing that a specially crafted test case could) icu/formatter.hpp: line 44: What about float and long double. long double may have more precision than double. Also, even for float, the double rounding caused by working with double can produce slightly different results from when using float directly. Does ICU only allow you to use double and [u]int[32|64]_t? line 93: I find this comment a bit misleading. It wasn't clear until I looked at the implementation that what was being cached was the ICU formatter. The boost::locale::impl_icu::formatter object is not cached. icu/icu_backend.cpp: line 69: Ideally you should not mark the object as valid until you know that this function will not fail. line 91: I'd kind of like install to be thread-safe... icu/icu_backend.hpp: icu/icu_util.hpp: icu/numeric.cpp: In num_format: You don't handle bool? line 290: Alas, the standard doesn't guarantee that you can put_back an arbitrary number of characters. This isn't going to work if you're reading from a non-seekable stream and the line is sufficiently long. icu/predefined_formatters.hpp: icu/time_zone.hpp: icu/time_zone.cpp: icu/uconv.hpp: posix/all_generater.hpp: posix/codecvt.cpp: mb2_iconv_converter seems to be doing a lot of the same work as encoding/iconv_codepage. Is there any way to avoid the duplication? posix/codecvt.hpp: Missing #include <memory> and <string>. posix/collate.cpp: posix/converter.cpp: I noticed that you convert UTF8 to wchar_t. For the posix backend, are there other multibyte encodings that would need this? Or would converting everything to UTF32 cause other problems? posix/numeric.cpp: line 54: sizeof(buf) is 64, but you pass 256? line 57: is 1024 absolutely guaranteed to be enough? Otherwise you should use a loop with exponential reallocation. Line 97 too. lines 68, 75: std::copy? posix/posix_backend.cpp: install: The same comment W.R.T. thread-safety applies as for icu. line 85: If this fails, don't you still need to call freelocale? I think this would be simplified if you created a simple RAII wrapper that holds a locale_t by value. posix/posix_backend.hpp: (To be continued) In Christ, Steven Watanabe