Re: [boost] UTF8 library - second call for informal review

Anthony,

Thanks for your comments.

1) The typo. Thanks, I'll fix it.
2) Bidirectional iterator. Actually, random access iterators are required, which I think makes sense for string containers, as I explained to Hervé.
3) For the iterators and codecvt, please see my reply to Rogier.

Best,
Nemanja Trifunovic

----- Original Message ----
From: Anthony Williams <anthony_w.geo@yahoo.com>
To: boost@lists.boost.org
Sent: Wednesday, December 6, 2006 6:22:41 AM
Subject: Re: [boost] UTF8 library - second call for informal review

Nemanja Trifunovic <nemanja_trifunovic@yahoo.com> writes:
This is the second call for the informal review of the UTF8 library. It is based on version 1.02 of UTF8-CPP: http://utfcpp.sourceforge.net/ and you can find it at http://boost-consulting.com/vault/index.php?PHPSESSID=8j6irqkpv3reg5s1gv0lge6um5&direction=0&order=&directory=Strings%20-%20Text%20Processing
First, a typo: there is a missing 'n' in octet_differece_type defined in validate_next, on line 122.

Secondly, a real issue: validate_next currently requires bidirectional iterators, in order to backtrack on error. Forward iterators are sufficient for this: just store a copy of the original, and copy it back if there's an error. This would increase the number of use cases of the library.

I agree that a converting iterator would be beneficial, as would a codecvt facet, but this is a good start.

Anthony
--
Anthony Williams
Software Developer
Just Software Solutions Ltd
http://www.justsoftwaresolutions.co.uk
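
The save-and-restore idea suggested above might look roughly like the following. This is a minimal sketch under stated assumptions, not the library's validate_next: the name validate_next_sketch and its signature are hypothetical, and it checks only the lead octet and the continuation octets (no overlong, surrogate, or range checks).

```cpp
#include <forward_list>

// Hypothetical helper, NOT the library's validate_next.
// Advances `it` past one UTF-8 sequence and returns true, or leaves `it`
// at the start of the sequence and returns false.
// Only the lead octet and continuation octets are checked.
template <typename ForwardIt>
bool validate_next_sketch(ForwardIt& it, ForwardIt end) {
    if (it == end) return false;

    const ForwardIt start = it;  // saved copy; forward iterators are copyable
    const unsigned char lead = static_cast<unsigned char>(*it);

    int trailing;  // number of continuation octets expected after the lead
    if (lead < 0x80)              trailing = 0;  // 0xxxxxxx
    else if ((lead >> 5) == 0x06) trailing = 1;  // 110xxxxx
    else if ((lead >> 4) == 0x0E) trailing = 2;  // 1110xxxx
    else if ((lead >> 3) == 0x1E) trailing = 3;  // 11110xxx
    else return false;                           // invalid lead; `it` has not moved

    ++it;
    for (int i = 0; i < trailing; ++i, ++it) {
        // Each continuation octet must have the form 10xxxxxx.
        if (it == end || (static_cast<unsigned char>(*it) >> 6) != 0x02) {
            it = start;  // restore the saved copy instead of stepping back
            return false;
        }
    }
    return true;
}
```

Because the function never decrements an iterator, it can be called on a std::forward_list<char> as well as on a std::string or std::vector<char>.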

Nemanja Trifunovic <nemanja_trifunovic@yahoo.com> writes:
2) Bidirectional iterator. Actually, random access iterators are required, which I think makes sense for string containers, as I explained to Hervé.
I hadn't noticed the usage of "end - it". In any case, that can be replaced with std::distance(it,end) to work with forward iterators.

Just because the vast majority of sequences will be random access doesn't mean we should limit ourselves to that, when it is so easy to allow the use of Forward Iterators.

Anthony
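
As an illustration of the std::distance suggestion, here is a small sketch. The helper name octets_remaining is hypothetical and is not taken from the library; it only shows that the same expression compiles for both random access and forward iterators.

```cpp
#include <forward_list>
#include <iterator>
#include <vector>

// Hypothetical helper: number of octets left in [it, end).
// `end - it` compiles only for random access iterators, whereas
// std::distance accepts any input iterator.
template <typename InputIt>
typename std::iterator_traits<InputIt>::difference_type
octets_remaining(InputIt it, InputIt end) {
    // Constant time for random access iterators, linear time otherwise.
    return std::distance(it, end);
}
```

One design point to keep in mind: for forward iterators std::distance walks the whole range, so calling it once per code point over a long sequence could become expensive. Checking for `end` while advancing through the continuation octets would avoid that extra traversal.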

Just because the vast majority of sequences will be random access doesn't mean we should limit ourselves to that, when it is so easy to allow the use of Forward Iterators.
Bidirectional maybe, but using a forward iterator would break some important functionality: for instance, if an invalid sequence is detected, we step back to the beginning of the sequence before reporting the error.

Nemanja Trifunovic <nemanja_trifunovic@yahoo.com> writes:
Just because the vast majority of sequences will be random access doesn't mean we should limit ourselves to that, when it is so easy to allow the use of Forward Iterators.
Bidirectional maybe, but using a forward iterator would break some important functionality: for instance, if an invalid sequence is detected, we step back to the beginning of the sequence before reporting the error.
Forward Iterators are copyable, so you can do this by keeping a copy of the beginning of the sequence, and returning that on error.

Anthony

Anthony Williams <anthony_w.geo@yahoo.com> writes:
Nemanja Trifunovic <nemanja_trifunovic@yahoo.com> writes:
Just because the vast majority of sequences will be random access doesn't mean we should limit ourselves to that, when it is so easy to allow the use of Forward Iterators.
Bidirectional maybe, but using a forward iterator would break some important functionality: for instance, if an invalid sequence is detected, we step back to the beginning of the sequence before reporting the error.
Forward Iterators are copyable, so you can do this by keeping a copy of the beginning of the sequence, and returning that on error.
Yeah, I *hope* you'd do the same with bidirectional iterators. -- Dave Abrahams Boost Consulting www.boost-consulting.com

David Abrahams <dave <at> boost-consulting.com> writes:
Anthony Williams <anthony_w.geo <at> yahoo.com> writes:
Nemanja Trifunovic <nemanja_trifunovic <at> yahoo.com> writes:
Just because the vast majority of sequences will be random access doesn't mean we should limit ourselves to that, when it is so easy to allow the use of Forward Iterators.
Bidirectional maybe, but using a forward iterator would break some important functionality: for instance, if an invalid sequence is detected, we step back to the beginning of the sequence before reporting the error.
Forward Iterators are copyable, so you can do this by keeping a copy of the beginning of the sequence, and returning that on error.
Yeah, I *hope* you'd do the same with bidirectional iterators.
Sorry for the late answer. Not in this case. "Stepping back" occurs only in case of an invalid UTF-8 sequence, which is an exceptional case. If we decide to keep a copy of the iterator, we would need to do it for each function call.
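
For comparison, the step-back strategy described in this reply could be sketched as follows. This is an assumption about its general shape, not the library's actual code: the function name check_trailing_or_rewind and the `expected` parameter are hypothetical, and only continuation octets are checked.

```cpp
#include <iterator>
#include <list>

// Hypothetical sketch of the step-back strategy; NOT the library's code.
// Verifies that `expected` continuation octets (10xxxxxx) follow `it`.
// On success `it` ends up past them; on failure `it` is walked back to
// where it started. Requires at least bidirectional iterators.
template <typename BidirIt>
bool check_trailing_or_rewind(BidirIt& it, BidirIt end, int expected) {
    int consumed = 0;  // the only extra state kept on the success path
    for (; consumed < expected; ++consumed, ++it) {
        if (it == end || (static_cast<unsigned char>(*it) >> 6) != 0x02) {
            std::advance(it, -consumed);  // error path only: step back
            return false;
        }
    }
    return true;
}
```

Here the success path carries an integer counter rather than a saved iterator, and the backward walk happens only when an invalid sequence is found. Whether that saves anything measurable over copying the iterator presumably depends on how expensive the iterator type is to copy.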