The serialization library uses a codecvt facet to generate UTF-8 from wchar_t. I don't know about the BOM bytes. It sounds like this would require an enhancement to the xml_warchive and/or text_warchive implementation. Feel free to submit a suggested patch to the Trac system.

Robert Ramey

Tijmen van Voorthuijsen wrote:
Hi,
I am using boost::archive::xml_woarchive to create XML files under Windows with Visual Studio 2008 in wide character mode. boost::archive::xml_woarchive does not write the three UTF-8 BOM bytes to the file, and from http://en.wikipedia.org/wiki/Byte_order_mark I understand that this is all right, since the BOM is optional and even not recommended.
Problems start when I edit the file in, for example, XML Notepad, which adds the three BOM bytes when saving. Under Windows this seems to be normal behavior. Parsing the XML file with boost::archive::xml_wiarchive then throws an exception.
My question/recommendation:
- Why can't the boost::archive::xml_serialization library cope with the UTF-8 BOM bytes?
- I would recommend that the library handle UTF-8 XML files both with and without the three BOM bytes. Both are in fact valid UTF-8 XML files.
I now check for the BOM bytes myself before I parse the ifstream in boost::xml_serialization and that works fine.
Many thanks for your answer.
Tijmen van Voorthuijsen
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users