
6 Jul
2006
4:54 p.m.
Reece Dunn wrote :
> What I am interested in is *efficient* codepage -> codepage conversion. I may want to read a file in that is stored as 8-bit ASCII

8-bit ASCII doesn't exist. ASCII is defined on 7 bits.

> as UTF8.

ASCII is valid UTF-8.

> What you need is an encoding -> UTF32 converter and a UTF32 -> encoding converter.
Honestly, I don't see what's so good about UTF-32. Yes, it has a fixed size, but it wastes memory; usually a bidirectional iterator is all you need to manipulate a string, so UTF-8 seems like a more interesting base.
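
To illustrate why bidirectional traversal is cheap on UTF-8, here is a minimal sketch, assuming the input is already well-formed UTF-8. The function names (is_continuation, next_code_point, prev_code_point, count_code_points) are made up for this example and do not come from any existing library. It relies only on the fact that every continuation byte has the bit pattern 10xxxxxx, so the start of a code point can be found by scanning in either direction.

```cpp
#include <cstddef>
#include <string>

// A UTF-8 continuation byte has the bit pattern 10xxxxxx.
inline bool is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

// Given the index of the first byte of a code point, return the index
// of the first byte of the next one (or s.size() at the end).
// Assumption: s holds well-formed UTF-8; no validation is done here.
inline std::size_t next_code_point(const std::string& s, std::size_t i)
{
    if (i < s.size())
        ++i;
    while (i < s.size() && is_continuation(static_cast<unsigned char>(s[i])))
        ++i;
    return i;
}

// Given the index of the first byte of a code point (or s.size()),
// return the index of the first byte of the previous one (or 0).
inline std::size_t prev_code_point(const std::string& s, std::size_t i)
{
    if (i > 0)
        --i;
    while (i > 0 && is_continuation(static_cast<unsigned char>(s[i])))
        --i;
    return i;
}

// Count code points by walking forward one code point at a time.
inline std::size_t count_code_points(const std::string& s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); i = next_code_point(s, i))
        ++n;
    return n;
}
```

Moving one code point forward or backward touches at most four bytes, and a pure ASCII string is walked one byte per step, since it contains no continuation bytes. What this gives up compared with UTF-32 is constant-time access to the n-th code point.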