[BGL] support for graphml i/o

Hello.

I've written a graphml reader/writer for BGL, based on the code Douglas Gregor posted on the boost users list at http://article.gmane.org/gmane.comp.lib.boost.user/17877/, and I'm sending it attached.

It has the following additional features:

* support for attributes of different types (int, float, etc.)
* understands the following parse info attributes: parse.nodeids and parse.edgeids. If nodeids or edgeids are canonical, they are automatically converted into indexes, and there's no need to keep a name map, which saves memory for large graphs.

If there's interest in having this in the library, I would gladly write documentation for it, and make the necessary modifications/adjustments.

Notes:

* The code uses some exceptions from the graphviz code. It seemed unnecessary for the two readers to throw different exceptions, but this is trivially changeable.
* The code depends on expat. Also, when reading some invalid files, it sometimes just hangs. This is probably due to not setting up expat properly to return errors. I will look into this as well.

BTW, this was written for the graph-tool project, if anyone is interested: http://graph-tool.forked.de

Thanks.

--
Tiago de Paula Peixoto <tiago@forked.de>
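To illustrate the canonical-id point, here is a minimal sketch of how an id could be turned directly into an index without a name map. It is not taken from the attached reader. The helper name `canonical_index` is hypothetical, and the sketch assumes canonical ids are a one-letter prefix followed by a decimal number (as in ids like `n0` or `e18`). Overflow on very long digit strings is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of BGL): converts a canonical id such as
// "n14" into the index 14. Returns false if the id does not consist of the
// expected prefix character followed only by decimal digits.
bool canonical_index(const std::string& id, char prefix, std::size_t& index)
{
    if (id.size() < 2 || id[0] != prefix)
        return false;

    std::size_t value = 0;
    for (std::size_t i = 1; i < id.size(); ++i) {
        const char c = id[i];
        if (c < '0' || c > '9')
            return false;  // non-canonical id: caller falls back to a name map
        value = value * 10 + static_cast<std::size_t>(c - '0');
    }
    index = value;
    return true;
}
```

When the function returns false, a reader would presumably have to fall back to a string-to-vertex map, which is the memory cost the canonical case avoids.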

On Aug 9, 2006, at 10:04 AM, Tiago de Paula Peixoto wrote:
Hello.
I've written a graphml reader/writer for BGL, based on the code Douglas Gregor posted on the boost users list at http://article.gmane.org/gmane.comp.lib.boost.user/17877/, and I'm sending it attached.
It has the following additional features:
* support for attributes of different types (int, float, etc.)
* understand the following parse info attributes: parse.nodeids and parse.edgeids. If nodeids or edgeids are canonical, they are automatically converted into indexes, and there's no need to keep a name map, which saves memory for large graphs.
If there's interest in having this in the library, I would gladly write documentation for it, and make the necessary modifications/adjustments.
This is great! If you would write up some documentation and a test case (for our regression tests), I'll review the code in more detail and add it to the BGL. Thanks! Doug

On 08/09/2006 11:28 AM, Doug Gregor wrote:
This is great! If you would write up some documentation and a test case (for our regression tests), I'll review the code in more detail and add it to the BGL.
Sure thing! Just a question: Should I use specific exceptions for the graphml code, or should I assume they will be shared with graphviz? Thanks. -- Tiago de Paula Peixoto <tiago@forked.de>

On Aug 9, 2006, at 11:15 AM, Tiago de Paula Peixoto wrote:
On 08/09/2006 11:28 AM, Doug Gregor wrote:
This is great! If you would write up some documentation and a test case (for our regression tests), I'll review the code in more detail and add it to the BGL.
Sure thing! Just a question: Should I use specific exceptions for the graphml code, or should I assume they will be shared with graphviz?
I think sharing them with the graphviz code would be the best option. Doug

On Aug 9, 2006, at 10:04 AM, Tiago de Paula Peixoto wrote:
I've written a graphml reader/writer for BGL, based on the code Douglas Gregor posted on the boost users list at http://article.gmane.org/gmane.comp.lib.boost.user/17877/, and I'm sending it attached.
I've been busy integrating this GraphML reader into the BGL Python bindings, and I have a few comments along the way:

- I believe the boolean type in GraphML is named "boolean", not "bool"
- It would be really great if we could get most of the GraphML reader code into a .cpp file, perhaps using the same tricks that we use in the GraphViz reader. Ideally, the BGL GraphML header would not include Expat at all.
- We should have an overload that doesn't need the vertex_index map to be passed explicitly; it can just default to get(vertex_index, g).

Doug
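For reference on the first comment, a small sketch of mapping GraphML `attr.type` names to a type tag. The enum `AttrType` and the function `parse_attr_type` are hypothetical names for illustration and are not from the attached code. The six type names are the ones GraphML defines for `attr.type`; note that "bool" is rejected.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical tag type (not part of BGL) for the GraphML attr.type values.
enum class AttrType { Boolean, Int, Long, Float, Double, String };

// Maps the text of an attr.type attribute to a tag. Throws
// std::invalid_argument for anything else, including the misspelling "bool".
AttrType parse_attr_type(const std::string& name)
{
    if (name == "boolean") return AttrType::Boolean;
    if (name == "int")     return AttrType::Int;
    if (name == "long")    return AttrType::Long;
    if (name == "float")   return AttrType::Float;
    if (name == "double")  return AttrType::Double;
    if (name == "string")  return AttrType::String;
    throw std::invalid_argument("unknown GraphML attr.type: " + name);
}
```

A reader could then pick the C++ value type for each property from the tag; how the attached reader actually does this is not shown in the thread.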

On 08/10/2006 12:21 PM, Doug Gregor wrote:
On Aug 9, 2006, at 10:04 AM, Tiago de Paula Peixoto wrote:
I've written a graphml reader/writer for BGL, based on the code Douglas Gregor posted on the boost users list at http://article.gmane.org/gmane.comp.lib.boost.user/17877/, and I'm sending it attached.
I've been busy integrating this GraphML reader into the BGL Python bindings, and I have a few comments along the way:
Cool!
- I believe the boolean type in GraphML is named "boolean", not "bool"
Yep. Thanks. ;-)
- It would be really great if we could get most of the GraphML reader code into a .cpp file, perhaps using the same tricks that we use in the GraphViz reader. Ideally, the BGL GraphML header would not include Expat at all.
I guess we can just write a mutate_graph virtual base class, and work only with that in the .cpp file, just like in the graphviz code, without any problems. I'll work on that this weekend, together with the documentation and test case.
- We should have an overload that doesn't need the vertex_index map to be passed explicitly; it can just default to get(vertex_index, g).
Yeah, ok. Thanks a lot for the comments! -- Tiago de Paula Peixoto <tiago@forked.de>
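A rough sketch of the header/.cpp split discussed above, under loud assumptions: `graph_mutator`, `graph_mutator_impl`, `toy_graph`, `replay_path` and `read_path` are all hypothetical names, and the real mutate_graph class in the graphviz reader has a different set of members. The point is only that the non-template function sees nothing but the abstract interface, so it could live in a .cpp file together with the parser includes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Abstract interface: the only thing the .cpp side would need to know about.
class graph_mutator {
public:
    virtual ~graph_mutator() {}
    virtual std::size_t add_vertex() = 0;
    virtual void add_edge(std::size_t source, std::size_t target) = 0;
};

// Toy graph type for this sketch; it stands in for a real BGL graph.
struct toy_graph {
    std::size_t vertex_count = 0;
    std::vector<std::pair<std::size_t, std::size_t> > edges;
    std::size_t add_vertex() { return vertex_count++; }
    void add_edge(std::size_t u, std::size_t v) { edges.push_back(std::make_pair(u, v)); }
};

// Header side: a thin template that forwards to the concrete graph type.
template <typename Graph>
class graph_mutator_impl : public graph_mutator {
public:
    explicit graph_mutator_impl(Graph& g) : g_(g) {}
    std::size_t add_vertex() override { return g_.add_vertex(); }
    void add_edge(std::size_t u, std::size_t v) override { g_.add_edge(u, v); }
private:
    Graph& g_;
};

// .cpp side: a stand-in for the parser callbacks. It builds a path with n
// vertices and uses only the abstract interface, never the graph type.
void replay_path(graph_mutator& m, std::size_t n)
{
    if (n == 0) return;
    std::size_t prev = m.add_vertex();
    for (std::size_t i = 1; i < n; ++i) {
        const std::size_t cur = m.add_vertex();
        m.add_edge(prev, cur);
        prev = cur;
    }
}

// Header side: the user-facing template wraps the graph and calls into
// the non-template code.
template <typename Graph>
void read_path(Graph& g, std::size_t n)
{
    graph_mutator_impl<Graph> m(g);
    replay_path(m, n);
}
```

The cost of this design is one virtual call per vertex, edge, or property event, which is likely small next to the XML parsing itself.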

On Aug 10, 2006, at 7:47 PM, Tiago de Paula Peixoto wrote:
On 08/10/2006 12:21 PM, Doug Gregor wrote:
On Aug 9, 2006, at 10:04 AM, Tiago de Paula Peixoto wrote:
I've written a graphml reader/writer for BGL, based on the code Douglas Gregor posted on the boost users list at http://article.gmane.org/gmane.comp.lib.boost.user/17877/, and I'm sending it attached.
I've been busy integrating this GraphML reader into the BGL Python bindings, and I have a few comments along the way:
Cool!
And FWIW, everything has worked out very well. Great work!

The only other issue I ran into is that I had to build expat carefully to get the C++ exceptions (e.g., undirected_graph_error) to propagate through expat properly. With GCC, this means compiling with -fexceptions; I'm not sure about other compilers. I think the "right" fix (which I've hacked up in the BGL-Python tree) is to build expat with a C++ compiler. We'll have to think about how to handle this in Boost.

Doug
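One option that does not depend on how the C library was built is to never let an exception cross the C frames: catch it inside the callback, store it, ask the parser to stop, and rethrow once the C call has returned. The sketch below is only an illustration of that pattern. `c_style_parse` is a hypothetical stand-in for the C parser, and `parse_state`, `on_token` and `parse_tokens` are made-up names; with real expat the handler would presumably call `XML_StopParser` instead of returning 0.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Callback type of the stand-in parser: return 0 to stop parsing.
typedef int (*token_handler)(void* user_data, const char* token);

// Hypothetical stand-in for a C parsing library. It calls the handler once
// per token and stops as soon as the handler returns 0.
static int c_style_parse(const char* const* tokens, int count,
                         token_handler handler, void* user_data)
{
    for (int i = 0; i < count; ++i)
        if (!handler(user_data, tokens[i]))
            return 0;
    return 1;
}

// State shared between the top-level function and the callback.
struct parse_state {
    int tokens_seen;
    std::exception_ptr error;  // set if the callback caught an exception
};

// The callback never lets an exception escape into the C code.
int on_token(void* user_data, const char* token)
{
    parse_state* state = static_cast<parse_state*>(user_data);
    try {
        if (std::string(token) == "bad")
            throw std::runtime_error("invalid token");
        ++state->tokens_seen;
        return 1;
    } catch (...) {
        state->error = std::current_exception();
        return 0;  // tell the parser to stop
    }
}

// Top-level function: rethrows the stored exception outside the C frames.
int parse_tokens(const char* const* tokens, int count)
{
    parse_state state = {0, std::exception_ptr()};
    c_style_parse(tokens, count, &on_token, &state);
    if (state.error)
        std::rethrow_exception(state.error);
    return state.tokens_seen;
}
```

The caller still sees the original exception type, at the price of a try/catch in every callback.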

Hi.

I'm sending attached a new version of the reader with the following modifications:

- separation into graphml.hpp and graphml.cpp, where all the expat stuff is confined to the latter
- support for default attribute values (had forgotten about this one)
- the minor corrections you pointed out in the other email

I'm sending also a test program with a test file, and some documentation (which I blatantly ripped off from read_graphviz and write_graphviz). I didn't check if the generated html is OK, so it will probably need some review.

On 08/14/2006 02:06 PM, Douglas Gregor wrote:
I've been busy integrating this GraphML reader into the BGL Python bindings, and I have a few comments along the way: Cool!
And FWIW, everything has worked out very well. Great work!
That's excellent! I can't wait to use it myself...
The only other issue I ran into is that I had to build expat carefully to get the C++ exceptions (e.g., undirected_graph_error) to propagate through expat properly. With GCC, this means compiling with -fexceptions; I'm not sure about other compilers. I think the "right" fix (which I've hacked up in the BGL-Python tree) is to build expat with a C++ compiler. We'll have to think about how to handle this in Boost.
Well, since this would be a problem with every C++ program that tried to use expat, perhaps it could be considered just a problem with the system's expat build, and not Boost's fault (but a note should nevertheless be added to the documentation). On my system (Gentoo GNU/Linux), the exceptions work OK, which probably means that the library was built as you described (and thus whoever packaged it did the right thing).

If it is really necessary to accept the lack of exceptions from expat, then perhaps the exceptions could be delayed and thrown only at the top level function, outside expat. But that would make the code a lot uglier...

Take care.

--
Tiago de Paula Peixoto <tiago@forked.de>

============================
|(logo)|__ ``read_graphml``
============================

.. |(logo)| image:: ../../../boost.png
   :align: middle
   :alt: Boost

__ ../../../index.htm

::

  void read_graphml(std::istream& in, MutableGraph& graph,
                    dynamic_properties& dp);

The ``read_graphml`` function interprets a graph described using the graphml_ format and builds a BGL graph that captures that description. Using this function, you can initialize a graph using data stored as text.

The graphml format can specify both directed and undirected graphs, and ``read_graphml`` differentiates between the two. One must pass ``read_graphml`` an undirected graph when reading an undirected graph; the same is true for directed graphs. Furthermore, ``read_graphml`` will throw an exception if it encounters parallel edges and cannot add them to the graph.

To handle attributes expressed in the graphml format, ``read_graphml`` takes a dynamic_properties_ object and operates on its collection of property maps. The reader passes all the properties encountered to this object, using the graphml attribute names as the property keys, and with the appropriate C++ type based on the graphml attribute type definition.

Requirements:

- The type of the graph must model the `Mutable Graph`_ concept.
- The type of the iterator must model the `Multi-Pass Iterator`_ concept.
- The property map value types must be default-constructible.

.. contents::

Where Defined
-------------

``<boost/graph/graphml.hpp>``

Exceptions
----------

::

  struct graph_exception : public std::exception
  {
    virtual ~graph_exception() throw();
    virtual const char* what() const throw() = 0;
  };

  struct bad_parallel_edge : public graph_exception
  {
    std::string from;
    std::string to;

    bad_parallel_edge(const std::string&, const std::string&);
    virtual ~bad_parallel_edge() throw();
    const char* what() const throw();
  };

  struct directed_graph_error : public graph_exception
  {
    virtual ~directed_graph_error() throw();
    virtual const char* what() const throw();
  };

  struct undirected_graph_error : public graph_exception
  {
    virtual ~undirected_graph_error() throw();
    virtual const char* what() const throw();
  };

  struct parse_error : public graph_exception
  {
    parse_error(const std::string&);
    virtual ~parse_error() throw() {}
    virtual const char* what() const throw();
    std::string statement;
  };

Under certain circumstances, ``read_graphml`` will throw one of the above exceptions. The concrete exceptions can all be caught using the general ``graph_exception`` moniker when greater precision is not needed. In addition, all of the above exceptions derive from the standard ``std::exception`` for even more generalized error handling.

The ``bad_parallel_edge`` exception is thrown when an attempt to add a parallel edge to the supplied MutableGraph fails. The graphml format supports parallel edges, but some BGL-compatible graph types do not. One example of such a graph is ``boost::adjacency_list<setS,vecS>``, which allows at most one edge between any two vertices.

The ``directed_graph_error`` exception occurs when an undirected graph type is passed to ``read_graphml`` but the textual representation of the graph is directed, as indicated by the ``edgedefault="directed"`` graph attribute in the graphml format.

The ``undirected_graph_error`` exception occurs when a directed graph type is passed to ``read_graphml`` but the textual representation of the graph is undirected, as indicated by the ``edgedefault="undirected"`` graph attribute in the graphml format.

Building the graphml reader
-----------------------------

To use the graphml reader, you will need to build and link against the "bgl-graphml" library. The library can be built by following the `Boost Jam Build Instructions`_ for the subdirectory ``libs/graph/build``.

Notes
-----

- On successful reading of a graph, every vertex and edge will have an associated value for every respective edge and vertex property encountered while interpreting the graph. These values will be set using the ``dynamic_properties`` object. Some properties may be ``put`` multiple times during the course of reading in order to ensure the graphml semantics. Those edges and vertices that are not explicitly given a value for a property (and that property has no default) will be given the default constructed value of the value type. **Be sure that property map value types are default constructible.**
- Nested graphs are supported as long as they are exactly of the same type as the root graph, i.e., are also directed or undirected. Note that since nested graphs are not directly supported by BGL, they are in fact completely ignored when building the graph, and the internal vertices or edges are interpreted as belonging to the root graph.
- Hyperedges and Ports are not supported.

See Also
--------

write_graphml_

Future Work
-----------

- Better expat error detection.

.. _Graphml: http://graphml.graphdrawing.org/
.. _`Mutable Graph`: MutableGraph.html
.. _`Multi-Pass Iterator`: ../../iterator/index.html
.. _dynamic_properties: ../../property_map/doc/dynamic_property_map.html
.. _write_graphml: write_graphml.html
.. _Boost Jam Build Instructions: ../../../more/getting_started.html#Build_Install

============================
|(logo)|__ ``write_graphml``
============================

.. |(logo)| image:: ../../../boost.png
   :align: middle
   :alt: Boost

__ ../../../index.htm

::

  template<typename Graph>
  void
  write_graphml(std::ostream& out, const Graph& g, const dynamic_properties& dp,
                bool ordered_vertices=false);

  template<typename Graph, typename VertexIndexMap>
  void
  write_graphml(std::ostream& out, const Graph& g, VertexIndexMap vertex_index,
                const dynamic_properties& dp, bool ordered_vertices=false);

This function writes a BGL graph object to an output stream in the graphml_ format. Both overloads of ``write_graphml`` will emit all of the properties stored in the dynamic_properties_ object, thereby retaining the properties that have been read in through the dual function read_graphml_. The second overload must be used when the graph doesn't have an internal vertex index map, which must then be supplied with the appropriate parameter.

.. contents::

Where Defined
-------------

``<boost/graph/graphml.hpp>``

Parameters
----------

OUT: ``std::ostream& out``
  A standard ``std::ostream`` object.

IN: ``VertexListGraph& g``
  A directed or undirected graph. The graph's type must be a model of VertexListGraph_. If the graph doesn't have an internal ``vertex_index`` property map, one must be supplied with the vertex_index parameter.

IN: ``VertexIndexMap vertex_index``
  A vertex property map containing the indexes in the range [0,num_vertices(g)).

IN: ``dynamic_properties& dp``
  Contains all of the vertex and edge properties that should be emitted by the graphml writer.

IN: ``bool ordered_vertices``
  This tells whether or not the order of the vertices from vertices(g) matches the order of the indexes. If ``true``, the ``parse.nodeids`` graph attribute will be set to ``canonical``. Otherwise it will be set to ``free``.

Example
-------

This example demonstrates using the BGL-graphml interface to write a BGL graph into a graphml format file.

::

  #include <boost/graph/adjacency_list.hpp>
  #include <boost/graph/graphml.hpp>
  #include <iostream>
  #include <string>
  #include <utility>

  using namespace boost;

  enum files_e { dax_h, yow_h, boz_h, zow_h, foo_cpp,
                 foo_o, bar_cpp, bar_o, libfoobar_a,
                 zig_cpp, zig_o, zag_cpp, zag_o,
                 libzigzag_a, killerapp, N };
  const char* name[] = { "dax.h", "yow.h", "boz.h", "zow.h", "foo.cpp",
                         "foo.o", "bar.cpp", "bar.o", "libfoobar.a",
                         "zig.cpp", "zig.o", "zag.cpp", "zag.o",
                         "libzigzag.a", "killerapp" };

  int main(int,char*[])
  {
      typedef std::pair<int,int> Edge;
      Edge used_by[] = {
          Edge(dax_h, foo_cpp), Edge(dax_h, bar_cpp), Edge(dax_h, yow_h),
          Edge(yow_h, bar_cpp), Edge(yow_h, zag_cpp),
          Edge(boz_h, bar_cpp), Edge(boz_h, zig_cpp), Edge(boz_h, zag_cpp),
          Edge(zow_h, foo_cpp),
          Edge(foo_cpp, foo_o),
          Edge(foo_o, libfoobar_a),
          Edge(bar_cpp, bar_o),
          Edge(bar_o, libfoobar_a),
          Edge(libfoobar_a, libzigzag_a),
          Edge(zig_cpp, zig_o),
          Edge(zig_o, libzigzag_a),
          Edge(zag_cpp, zag_o),
          Edge(zag_o, libzigzag_a),
          Edge(libzigzag_a, killerapp)
      };

      const int nedges = sizeof(used_by)/sizeof(Edge);

      // The vertex_color_t slot is used here to hold the file name.
      typedef adjacency_list< vecS, vecS, directedS,
          property< vertex_color_t, std::string >,
          property< edge_weight_t, int >
          > Graph;
      Graph g(used_by, used_by + nedges, N);

      // Label every vertex with its file name.
      graph_traits<Graph>::vertex_iterator v, v_end;
      for (boost::tie(v,v_end) = vertices(g); v != v_end; ++v)
          put(vertex_color_t(), g, *v, name[*v]);

      // Give every edge the same weight.
      graph_traits<Graph>::edge_iterator e, e_end;
      for (boost::tie(e,e_end) = edges(g); e != e_end; ++e)
          put(edge_weight_t(), g, *e, 3);

      // Expose both property maps under the names used in the output.
      dynamic_properties dp;
      dp.property("name", get(vertex_color_t(), g));
      dp.property("weight", get(edge_weight_t(), g));

      write_graphml(std::cout, g, dp, true);
      return 0;
  }

The output will be:

::

  <?xml version="1.0" encoding="UTF-8"?>
  <graphml xmlns="http://graphml.graphdrawing.org/xmlns/graphml"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns/graphml http://graphml.graphdrawing.org/xmlns/graphml/graphml-attributes-1.0rc.xsd">
    <key id="key0" for="node" attr.name="name" attr.type="string" />
    <key id="key1" for="edge" attr.name="weight" attr.type="int" />
    <graph id="G" edgedefault="directed" parse.nodeids="canonical" parse.edgeids="canonical" parse.order="nodesfirst">
      <node id="n0">
        <data key="key0">dax.h</data>
      </node>
      <node id="n1">
        <data key="key0">yow.h</data>
      </node>
      <node id="n2">
        <data key="key0">boz.h</data>
      </node>
      <node id="n3">
        <data key="key0">zow.h</data>
      </node>
      <node id="n4">
        <data key="key0">foo.cpp</data>
      </node>
      <node id="n5">
        <data key="key0">foo.o</data>
      </node>
      <node id="n6">
        <data key="key0">bar.cpp</data>
      </node>
      <node id="n7">
        <data key="key0">bar.o</data>
      </node>
      <node id="n8">
        <data key="key0">libfoobar.a</data>
      </node>
      <node id="n9">
        <data key="key0">zig.cpp</data>
      </node>
      <node id="n10">
        <data key="key0">zig.o</data>
      </node>
      <node id="n11">
        <data key="key0">zag.cpp</data>
      </node>
      <node id="n12">
        <data key="key0">zag.o</data>
      </node>
      <node id="n13">
        <data key="key0">libzigzag.a</data>
      </node>
      <node id="n14">
        <data key="key0">killerapp</data>
      </node>
      <edge id="e0" source="n0" target="n4">
        <data key="key1">3</data>
      </edge>
      <edge id="e1" source="n0" target="n6">
        <data key="key1">3</data>
      </edge>
      <edge id="e2" source="n0" target="n1">
        <data key="key1">3</data>
      </edge>
      <edge id="e3" source="n1" target="n6">
        <data key="key1">3</data>
      </edge>
      <edge id="e4" source="n1" target="n11">
        <data key="key1">3</data>
      </edge>
      <edge id="e5" source="n2" target="n6">
        <data key="key1">3</data>
      </edge>
      <edge id="e6" source="n2" target="n9">
        <data key="key1">3</data>
      </edge>
      <edge id="e7" source="n2" target="n11">
        <data key="key1">3</data>
      </edge>
      <edge id="e8" source="n3" target="n4">
        <data key="key1">3</data>
      </edge>
      <edge id="e9" source="n4" target="n5">
        <data key="key1">3</data>
      </edge>
      <edge id="e10" source="n5" target="n8">
        <data key="key1">3</data>
      </edge>
      <edge id="e11" source="n6" target="n7">
        <data key="key1">3</data>
      </edge>
      <edge id="e12" source="n7" target="n8">
        <data key="key1">3</data>
      </edge>
      <edge id="e13" source="n8" target="n13">
        <data key="key1">3</data>
      </edge>
      <edge id="e14" source="n9" target="n10">
        <data key="key1">3</data>
      </edge>
      <edge id="e15" source="n10" target="n13">
        <data key="key1">3</data>
      </edge>
      <edge id="e16" source="n11" target="n12">
        <data key="key1">3</data>
      </edge>
      <edge id="e17" source="n12" target="n13">
        <data key="key1">3</data>
      </edge>
      <edge id="e18" source="n13" target="n14">
        <data key="key1">3</data>
      </edge>
    </graph>
  </graphml>

See Also
--------

read_graphml_

Notes
-----

- Note that you can use graphml file write facilities without the library ``libbglgraphml.a``.

.. _graphml: http://graphml.graphdrawing.org/
.. _dynamic_properties: ../../property_map/doc/dynamic_property_map.html
.. _read_graphml: read_graphml.html
.. _VertexListGraph: VertexListGraph.html

On 08/14/2006 11:14 PM, Tiago de Paula Peixoto wrote:
I'm sending also a test program with a test file, and some documentation (which I blatantly ripped off from read_graphviz and write_graphviz). I didn't check if the generated html is OK, so it will probably need some review.
I attached the wrong test file... The one I sent should throw an undirected_graph_error exception. The one I'm sending now should work. -- Tiago de Paula Peixoto <tiago@forked.de>