[boost.locale] Question about boundary rules
I have just been exploring boost.locale, which I hadn't used before. However, I'm not quite understanding some of the behaviour of the boundary rules when segmenting text. Fortunately I can see this behaviour in the example code, so it's probably my misunderstanding and easily corrected.

If I compile http://www.boost.org/doc/libs/1_49_0/libs/locale/doc/html/boundary_8cpp-exam... and run it, I get:

[...skipped to avoid long quote ]
Part [Linux2.6] has number(s)
Part [ ] has no word characters
Part [and] has letter(s)
Part [ ] has no word characters
Part [Windows7] has number(s) letter(s)
Part [ ] has no word characters
[...]

However, I don't understand why "Linux2.6" is detected as having number(s) but no letters, whilst "Windows7" is detected as having both. It doesn't appear to be the decimal point: "Linux26" has the same behaviour (whilst "Linux2" is detected as having both).

I haven't debugged this, just glanced at the code (which seems to set these flags based on all the values returned by the ICU RuleBasedBreakIterator getRuleStatusVec()). I thought I would ask whether I'm misunderstanding something fundamental before trying to work out what is going on and where the problem (if there is one) lies.

TIA
Alex Perry

PS Just in case this is a known platform/version issue, I was running this on:

Windows7
MSVC 10
boost 1.49
icu 4.9.1

--
View this message in context: http://boost.2283326.n4.nabble.com/boost-locale-Question-about-boundary-rule...
Sent from the Boost - Users mailing list archive at Nabble.com.
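
To make the question concrete, here is a minimal sketch of the kind of mapping described in the message above, where each value in a rule status vector contributes a category flag. It is not Boost.Locale's actual code: the flag names, the flags_from_statuses and describe helpers, and the sample status vectors are all hypothetical. The only part taken from ICU is the documented word-break status ranges (0-99 none, 100-199 number, 200-299 letter, 300-399 kana, 400-499 ideographic). What ICU really returns for "Linux2.6" or "Windows7" has not been checked here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flags, one bit per word category (not Boost.Locale's values).
enum : std::uint32_t {
    flag_none   = 1u << 0,
    flag_number = 1u << 1,
    flag_letter = 1u << 2,
    flag_kana   = 1u << 3,
    flag_ideo   = 1u << 4
};

// OR together one flag per status value, using ICU's documented
// word-break status ranges. Values outside [0, 500) are ignored.
std::uint32_t flags_from_statuses(const std::vector<int>& statuses) {
    std::uint32_t flags = 0;
    for (int s : statuses) {
        if (s < 0)        continue;
        else if (s < 100) flags |= flag_none;
        else if (s < 200) flags |= flag_number;
        else if (s < 300) flags |= flag_letter;
        else if (s < 400) flags |= flag_kana;
        else if (s < 500) flags |= flag_ideo;
    }
    return flags;
}

// Render the flags in the same style as the example program's output.
std::string describe(std::uint32_t flags) {
    std::string out;
    if (flags & flag_number) out += "number(s) ";
    if (flags & flag_letter) out += "letter(s) ";
    if (flags & flag_kana)   out += "kana character(s) ";
    if (flags & flag_ideo)   out += "ideographic character(s) ";
    if (out.empty())         out = "no word characters ";
    return out;
}
```

Under this sketch, a segment whose status vector held only a value in the number range would be reported as "number(s)", while one holding values in both the number and letter ranges would be reported as "number(s) letter(s)". If the real code works this way, the difference between "Linux2.6" and "Windows7" would come down to which status values the break iterator reports for each segment.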