
I am so sorry, I forgot to mention I was referring to Natural Language Processing. Thanks for your advice!

On 27 March 2011 17:00, Mathias Gaunard <mathias.gaunard@ens-lyon.org> wrote:
On 26/03/2011 18:54, Sarma Tangirala wrote:
Also, I was looking at some C++ code using Boost/tokenizer.hpp that
tokenized some text and it looked a bit scary.
Any suggestions or advice?
Look into iterators, ranges, and the various string manipulation and parsing libraries within Boost (Iterator, Range, StringAlgo, Regex, Spirit, Xpressive). You might also want to look into my Unicode library. I could add word boundaries for non-Thai languages if your project needs that.
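
To make the regex-and-iterator suggestion concrete, here is a minimal sketch of a word tokenizer. It uses the standard library's std::regex, whose interface was modeled on Boost.Regex, rather than Boost/tokenizer.hpp itself. The function name tokenize_words and the definition of a "word" as a run of ASCII letters, digits, and apostrophes are assumptions made purely for illustration; real natural-language text would need Unicode-aware word boundaries instead.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost or of any code discussed in this
// thread: returns the word tokens found in `text`, in order of appearance.
std::vector<std::string> tokenize_words(const std::string& text) {
    // Assumption: a "word" is one or more ASCII letters, digits, or
    // apostrophes. Everything else (spaces, punctuation) is a separator.
    static const std::regex word_re("[A-Za-z0-9']+");

    std::vector<std::string> tokens;
    // std::sregex_iterator walks over every non-overlapping match;
    // a default-constructed iterator marks the end of the sequence.
    for (std::sregex_iterator it(text.begin(), text.end(), word_re), end;
         it != end; ++it) {
        tokens.push_back(it->str());
    }
    return tokens;
}
```

Calling tokenize_words on a sentence yields a std::vector<std::string> that can be fed to whatever comes next (counting, stemming, and so on). Swapping in Boost.Regex or Xpressive should mostly be a matter of changing the header and namespace, though that has not been checked against a specific Boost release.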
-- Regards, Sarma Tangirala, Junior - Class of 2012, Department of Information Science and Technology, College of Engineering Guindy - Anna University