
On Tue, Aug 16, 2011, Phil Endecott wrote:
Soares Chen Ruo Fei wrote:
I think it'll be easier to just remove the decrement function completely.
No, don't do that. (That would be like removing random access from std::vector because std::list can't implement it efficiently.)
I'm not familiar with the algorithms requiring bidirectional access that Artyom mentions, but a standard way to make them work with iterators for various different encodings would be to specialise the algorithms. You would have a main implementation that requires the bidirectional (or random access) iterator, and a forwarding implementation that looks like this:
    template <typename FORWARD_ITER>
    void algorithm(FORWARD_ITER begin, FORWARD_ITER end)
    {
      // Make a copy of the range into a bidirectional container:
      std::vector< typename FORWARD_ITER::value_type > v(begin,end);
      // Call the other specialisation:
      algorithm(v.begin(),v.end());
    }
That is the standard time-vs-space complexity trade-off.
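For reference, one common way to wire up the "main implementation plus forwarding implementation" idea is tag dispatch on the iterator category. The following is only a sketch under assumptions: the name algorithm_impl is hypothetical, and the body of the bidirectional version (counting elements while walking backwards) is a stand-in for a real Unicode algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <list>
#include <vector>

// Main implementation: needs at least bidirectional iterators.
// Stand-in body: counts the elements by walking backwards from end.
template <typename BidirIter>
std::size_t algorithm_impl(BidirIter begin, BidirIter end,
                           std::bidirectional_iterator_tag)
{
    std::size_t n = 0;
    while (end != begin) {
        --end;
        ++n;
    }
    return n;
}

// Forwarding implementation: chosen for forward-only iterators.
// It copies the range into a vector (extra space) and then calls
// the main implementation on the copy.
template <typename ForwardIter>
std::size_t algorithm_impl(ForwardIter begin, ForwardIter end,
                           std::forward_iterator_tag)
{
    typedef typename std::iterator_traits<ForwardIter>::value_type value_type;
    std::vector<value_type> v(begin, end);
    return algorithm_impl(v.begin(), v.end(),
                          std::bidirectional_iterator_tag());
}

// Entry point: picks an overload from the iterator's category.
// Random access iterators select the bidirectional overload because
// random_access_iterator_tag derives from bidirectional_iterator_tag.
template <typename Iter>
std::size_t algorithm(Iter begin, Iter end)
{
    typedef typename std::iterator_traits<Iter>::iterator_category category;
    return algorithm_impl(begin, end, category());
}
```

With this arrangement, calling algorithm() on a std::forward_list range goes through the copying overload, while a std::list or std::vector range goes straight to the main implementation. The algorithm author writes the forwarding overload once per algorithm, which is the burden being debated here.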
Well, I don't think forcing every generic Unicode algorithm to provide a specialized version for forward-only iterators is any better than providing a less efficient bidirectional iterator. That burden is too high for the algorithm developers. Perhaps a better option is to simply let the compiler yield a (friendly?) error when a generic algorithm uses the decrement or random access operators, and find a way to tell the user to convert the string to a standard UTF string before passing it to the Unicode algorithms. Alternatively, I could find a way to make template instances of unicode_string_adapter with an MBCS encoding convert the string to a UTF string during construction and store the UTF-encoded string instead. The only problem is that, to hand back the raw string, the string adapter would have to reconvert the internally stored UTF-encoded string to the MBCS encoding. This can be expensive if the user regularly wants to access the raw string, unless we store two smart pointers within the string adapter: one for the MBCS string and one for the converted UTF string. But doing so would waste storage space as well.
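To illustrate the "friendly compiler error" option, here is a minimal sketch that assumes C++11 static_assert is available (on older compilers BOOST_STATIC_ASSERT would play the same role, with a less readable message). The names is_bidirectional and last_element are hypothetical, and last_element is only a stand-in for an algorithm that needs operator--.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <type_traits>
#include <vector>

// True when Iter's category is bidirectional or stronger.
template <typename Iter>
struct is_bidirectional
    : std::is_base_of<
          std::bidirectional_iterator_tag,
          typename std::iterator_traits<Iter>::iterator_category> {};

// Stand-in algorithm that needs operator--.
// Precondition: the range [begin, end) is not empty.
template <typename Iter>
Iter last_element(Iter begin, Iter end)
{
    static_assert(is_bidirectional<Iter>::value,
                  "this algorithm needs a bidirectional code point iterator; "
                  "convert the string to a UTF encoding first");
    (void)begin; // unused in this stand-in
    --end;
    return end;
}
```

Passing a forward-only iterator (for example from std::forward_list) to last_element then fails at compile time with the message above rather than with an error about a missing operator--.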
(Actually it's also because I don't know if there is any way to conditionally let the code point iterator inherit from either std::forward_iterator or std::bidirectional_iterator)
You don't mean "inherit from". You mean "be a model of". See Artyom's "VERY BAD DESIGN" post. There should not be any virtual methods anywhere in this library. If you don't understand how that can be done, we should discuss that urgently.
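As a sketch of what "be a model of" can look like without inheritance or virtual functions: the iterator advertises its category through a nested typedef, and that typedef can be selected at compile time from a property of the encoding. Everything named here is an assumption for illustration (utf8_tag, mbcs_tag, the reversible flag, code_point_iterator_category), not the library's actual design, and std::conditional assumes C++11 (boost::mpl::if_c would be the pre-C++11 equivalent).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>

// Hypothetical encoding tags. 'reversible' says whether a code point
// can be decoded efficiently when stepping backwards.
struct utf8_tag { static const bool reversible = true; };
struct mbcs_tag { static const bool reversible = false; };

// Selects the iterator category from the encoding at compile time.
template <typename Encoding>
struct code_point_iterator_category
    : std::conditional<Encoding::reversible,
                       std::bidirectional_iterator_tag,
                       std::forward_iterator_tag> {};

// Skeleton only: the five nested typedefs are what std::iterator_traits
// reads, so no base class is needed to model an iterator concept.
template <typename Encoding>
class code_point_iterator {
public:
    typedef typename code_point_iterator_category<Encoding>::type
        iterator_category;
    typedef char32_t value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const char32_t* pointer;
    typedef char32_t reference;
    // operator*, operator++ and, for reversible encodings, operator--
    // would be declared here.
};
```

Generic algorithms then query std::iterator_traits<code_point_iterator<E> >::iterator_category and get bidirectional_iterator_tag or forward_iterator_tag depending on E, all resolved at compile time.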
The virtual functions are used only in my prototype file dynamic_unicode_string.hpp. That design hasn't had much thought put into it; I wrote it just to demonstrate to Artyom that dynamically encoded strings can be implemented at a higher layer by using virtual functions. There might be more efficient ways to do so, but I'll leave that for another discussion thread.

Soares