
Hi,

Nice! I recently worked on an XML parser and generator library. I had to support several different XML formats, and writing SAX code for each of them looked like a stupid, repetitive process. Using a DOM parser did not help either: there was still a lot of code that just forwarded the parsed contents of strings to some method or data structure.

So I started working on a way to describe an XML format in C++ code and to generate the SAX binding code for that format. For that I also had to figure out how arbitrary objects of a certain type can be filled with the data in a string of the XML element. I defined a 'value property' type for values and a container property for sequence containers (others might follow). These properties carry the type information and the access path for reading and writing that property. Usually the properties are grouped together in a so-called property map, which maps certain key types onto the property type (and object).

With this meta information tool it was possible to build a format description that is independent of the class interface and that generates a SAX parser, using the expat library, to forward XML items directly to the structures. I have some example code, which is a reduced version of a real format:

    // two key types used for the property_map:
    struct Data{};
    struct Name{};

    // Node is
    struct Node
    {
    private:
      std::vector<Node*> nodes;
      std::string name, data;
    public:
      typedef property_map<
        mpl::vector<
          con<Node>,               // adds the type con<Node,Node> so Node is
                                   // key and data, so Node can be used to access a
                                   // container property, that reflects a sequence
                                   // container of Nodes
          elem<Name, std::string>, // a std::string value property
          elem<Data, std::string>  // like above but identified using Data
        >,
        Node
      > type_i;

      static type_i const& get_info();
    };

I stripped the code in get_info, which initializes the property_map structure, basically because the init code is pretty dense and needs a lot of improvement.
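To make the idea of a property map indexed by key types more concrete, here is a minimal sketch. It is not the library code: member_property, simple_property_map and property_for are hypothetical names, the members of Node are public for brevity, the container property is left out, and C++11 variadic templates stand in for mpl::vector.

```cpp
#include <cassert>
#include <string>

// Key types, as in the example above.
struct Name {};
struct Data {};

// Hypothetical value property: a key tag plus the access path
// (here simply a pointer to a data member of Compound).
template <typename Key, typename T, typename Compound>
struct member_property {
    T Compound::*member;

    T const& get(Compound const& obj) const { return obj.*member; }
    void set(Compound& obj, T const& value) const { obj.*member = value; }
};

// Hypothetical property map: it inherits from every property it holds,
// so a property can be selected by its key type at compile time.
template <typename... Props>
struct simple_property_map : Props... {
    explicit simple_property_map(Props... props) : Props(props)... {}
};

// Key is given explicitly; T and Compound are deduced from the one base
// class of the map that matches member_property<Key, ...>.
template <typename Key, typename T, typename Compound>
member_property<Key, T, Compound> const&
property_for(member_property<Key, T, Compound> const& prop) {
    return prop;
}

// Simplified stand-in for the Node structure (assumption: public members).
struct Node {
    std::string name, data;

    typedef simple_property_map<
        member_property<Name, std::string, Node>,
        member_property<Data, std::string, Node> > type_i;

    static type_i const& get_info() {
        static type_i const info(
            member_property<Name, std::string, Node>{&Node::name},
            member_property<Data, std::string, Node>{&Node::data});
        return info;
    }
};
```

With this, property_for<Name>(Node::get_info()).set(node, "hs1") writes node.name without the caller knowing which member is behind the Name key.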
There is another structure called RootNode, which describes a similar structure but without Data. An example file for that format could look like this:

    <?xml version="1.0"?>
    <root_node name="example_tree">
      <node name="empty" data="0" />
      <node name="base_item1" data="124">
        <node name="triple_obj" data="22">
          <node name="hs1" data="9"/>
          <node name="hs2" data="13"/>
          <node name="hs3" data="10"/>
        </node>
        <node name="single" data="-120"/>
      </node>
    </root_node>

With my library the format can be described like this:

    boost::shared_ptr<Receiver> basic_node;
    // Receiver is a base class for all classes which get called by the
    // expat sax code

    // We now define the 'node' tag:
    basic_node = xml::gen_object_node(
        // we have to set the property map, and the tag name
        xml::sub_tag<Node>( Node::get_info(), "node")
          // now we add all attributes
          .attributes(
              xml::attribute.assign<Name>("name")
            | xml::attribute.assign<Data>("data")
          )
          // and a sub tag which points on basic_node
          .sub_tags(
            xml::link_tag<Node>( basic_node, "node" )
          ),
        Node::get_info() // the property map a second time.. :(
      );

    // now the root tag:
    boost::shared_ptr<Receiver> root_node = xml::gen_root_node(
        xml::root_tag( RootNode::get_info(), "root_node")
          .attributes( xml::attribute.assign<Name>("name") )
          .sub_tags(
            xml::link_tag<Node>( // here we link to basic_node
              basic_node, "node" )
          )
      );

    Parser p;
    RootNode obj;
    try {
      // parsing:
      p.parse( root_node, filename, &obj );
      // printing:
      root_node->print( &obj, file_stream );
    } catch ( std::exception &e ) {
      // ...
    }

The XML library was written to handle lots of different formats and to easily absorb changes to a format during the development of the system. It was not intended to become the ultimate XML library and lots of features are missing, but I think it could be a good part of a bigger, more versatile XML library, or be put on top of the raw SAX interface of such a library.
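The step of forwarding attribute strings to a structure can be sketched roughly as follows. This is only an illustration under assumptions, not the library's implementation: attribute_bindings, parse_into and Item are hypothetical names, and no expat function is called. The apply function merely takes the attributes in the same layout that expat passes to a start-element callback (alternating name and value pointers, terminated by a null pointer).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical conversions from attribute text to the member type.
inline void parse_into(std::string const& text, std::string& out) {
    out = text;
}

inline void parse_into(std::string const& text, int& out) {
    std::size_t used = 0;
    int value = std::stoi(text, &used);  // throws if text is not a number
    if (used != text.size())
        throw std::invalid_argument("trailing characters in: " + text);
    out = value;
}

// Hypothetical binding table: attribute name -> "convert and assign".
template <typename Compound>
class attribute_bindings {
public:
    template <typename T>
    attribute_bindings& assign(std::string const& attr, T Compound::*member) {
        handlers_[attr] = [member](Compound& obj, std::string const& text) {
            parse_into(text, obj.*member);
        };
        return *this;
    }

    // atts: name/value pairs followed by a null pointer.
    // Attributes without a binding are ignored in this sketch.
    void apply(Compound& obj, char const* const* atts) const {
        for (; atts && atts[0]; atts += 2) {
            auto it = handlers_.find(atts[0]);
            if (it != handlers_.end())
                it->second(obj, atts[1]);
        }
    }

private:
    std::map<std::string,
             std::function<void(Compound&, std::string const&)>> handlers_;
};

// Simplified target structure (assumption: data is stored as an int here).
struct Item {
    std::string name;
    int data = 0;
};
```

A table built with assign("name", &Item::name).assign("data", &Item::data) then fills an Item from the attributes of one <node .../> element. The point of the design is that the conversion code is written once per type instead of once per format.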
I have to admit that my personal interests have moved: I am much more interested in the property part, the defining of meta information. I plan to write my master's (Diplom) thesis about that topic, that is, about defining type information for C++ structures and types, and then showing how to use this information to simplify or automate library interfaces. I planned to use the XML library described above and a simple database library as a proof of concept, maybe also a small GUI library based on something like antigrain.

The properties still have to be improved; their usage is still too complicated, and some features are missing. The code is available at

    http://svn.berlios.de/viewcvs/kant/trunk/source/src/util/

and

    http://svn.berlios.de/viewcvs/kant/trunk/source/src/serialize/

I think about changing the code daily, but I have to finish a different piece of work at the university before I can focus on that code again.

Currently the value property consists of a get and a set part, which allow const and non-const access to a value:

    template <typename T, typename Compound = mpl::void_>
    struct value_property
    {
      boost::shared_ptr< setter<T,Compound> > set;
      boost::shared_ptr< getter<T,Compound> > get;
    };

getter and setter are base classes for lots of different kinds of access. The getter, for example, has 6 different implementations that handle direct memory access, a method pointer that returns a const reference, a method pointer that returns a value, a method that expects a reference parameter which gets the value assigned ...

I am now thinking about adding a feature to hook functionality into the get or set part of the property, e.g. to lock a mutex, or to check the data passed to the property (for example to ensure a certain string format) and throw on error, or to send a signal every time the value changes ... Apart from that, the property design needs a bigger change, because the current design of the value_property completely fails when used by multiple threads.
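A rough sketch of how such getter and setter hierarchies, and a checking hook, might look is given below. It is written under assumptions and is not the code from the repository: only two of the access kinds are shown, std::shared_ptr stands in for boost::shared_ptr, the default for Compound is dropped, and member_getter, method_getter, member_setter, checked_setter and Entry are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

template <typename T, typename Compound>
struct getter {
    virtual ~getter() {}
    virtual T get(Compound const& obj) const = 0;
};

// Access kind 1: direct access through a pointer to data member.
template <typename T, typename Compound>
struct member_getter : getter<T, Compound> {
    explicit member_getter(T Compound::*m) : member(m) {}
    T get(Compound const& obj) const override { return obj.*member; }
    T Compound::*member;
};

// Access kind 2: a const method that returns the value.
template <typename T, typename Compound>
struct method_getter : getter<T, Compound> {
    explicit method_getter(T (Compound::*f)() const) : method(f) {}
    T get(Compound const& obj) const override { return (obj.*method)(); }
    T (Compound::*method)() const;
};

template <typename T, typename Compound>
struct setter {
    virtual ~setter() {}
    virtual void set(Compound& obj, T const& value) const = 0;
};

template <typename T, typename Compound>
struct member_setter : setter<T, Compound> {
    explicit member_setter(T Compound::*m) : member(m) {}
    void set(Compound& obj, T const& value) const override { obj.*member = value; }
    T Compound::*member;
};

// Hypothetical hook: wraps another setter and validates the value first.
template <typename T, typename Compound>
struct checked_setter : setter<T, Compound> {
    checked_setter(std::shared_ptr<setter<T, Compound>> next_setter,
                   std::function<bool(T const&)> check_fn)
        : next(std::move(next_setter)), check(std::move(check_fn)) {}

    void set(Compound& obj, T const& value) const override {
        if (!check(value))
            throw std::invalid_argument("value rejected by property hook");
        next->set(obj, value);  // only reached for accepted values
    }

    std::shared_ptr<setter<T, Compound>> next;
    std::function<bool(T const&)> check;
};

// Same shape as the value_property above, with std::shared_ptr.
template <typename T, typename Compound>
struct value_property {
    std::shared_ptr<setter<T, Compound>> set;
    std::shared_ptr<getter<T, Compound>> get;
};

// Hypothetical compound type used for illustration.
struct Entry {
    std::string name;
    std::size_t name_length() const { return name.size(); }
};
```

Because the hook is just another setter, it can be stacked on any access kind without touching the property type. This sketch does nothing about the threading problem mentioned above.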
-- I wish I had more time these days --

So I would like to work on a 'property' library or meta type library, but this functionality could overlap with a possible GUI library, the boost::db ideas which were discussed here, and maybe also the boost::python/langbinding libraries. After that I would like to focus on using that library in a database and/or GUI library environment.

Regards
Andreas Pokorny

On Sat, Nov 06, 2004 at 11:46:35AM +0100, Thorsten Ottosen <nesotto@cs.auc.dk> wrote:
Dear all,
Following our discussion of the Unicode library, would it not be a good idea to pursue such efforts more aggressively?
I could imagine it would help bring forward libraries much faster. I think it would be reasonable if the Boost community provided
1. project descriptions
2. help and guidelines throughout the 6-12 months of the project
If we had small papers explaining potential projects, these could be sent to universities, which can then in turn suggest them to their students.
Off the top of my head, I can think of these projects
1. C++ database library
2. C++ statistics library
3. exact reals class
4. An XML parser and generator library
I could probably be a co-author and contact person of (2).
Any thoughts?
-Thorsten