1.33.1 read_graphviz possible parser issue

Brian Stadler

14 Sep 2007

8:06 p.m.

Hi all,

I'm reading thousands of files in the graphviz language into a program using read_graphviz(), library version 1.33.1. I noticed a problem with some of my results and discovered that read_graphviz doesn't appear to be interpreting the files correctly. The structure of the graphs is preserved (no edges gone astray). However, the labeling of the nodes is changed. I confirmed this by reading in a file and immediately writing it back out using write_graphviz().

Below is the code used to read in the files. It's very simple as I have nothing special being represented. Could someone else check this and confirm this is a problem? I've checked the bugs list and the 1.34.x release notes and found nothing addressing this particular issue.

thanks all.

My read_graphviz code:

    bool file;
    ifstream in(openfile.c_str(), ios::in);
    dynamic_properties dp;
    // ug is a simple adjacency list with bundled properties
    dp.property("node_id", get(&ed_node::vertex_name, ug));
    file = read_graphviz(in, ug, dp, "node_id");

Source graphviz file:

    strict graph {
    0 -- 3;
    0 -- 4;
    1 -- 3;
    1 -- 4;
    2 -- 3;
    2 -- 4;
    3 -- 4;
    }

Output graphviz file:

    graph G {
    0;
    1;
    2;
    3;
    4;
    0--1 ;
    0--2 ;
    3--1 ;
    3--2 ;
    4--1 ;
    4--2 ;
    1--2 ;
    }


Krishna Roskin

17 Sep 2007

7:38 a.m.

On 9/14/07, Brian Stadler <bdotstadler@gmail.com> wrote:

...


It looks like read_graphviz just adds the vertices in the order they appear in the file. That's why they get the above vertex indices. My code does this all the time, so I just learned to live with it. If I really care about the vertex names, I add them to the graphviz file with [label="name"] and do not count on the vertex index.

HTH, -krish
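The renumbering described here can be reproduced without Boost. What follows is a minimal stand-alone sketch, assuming a reader that hands out vertex indices in order of first appearance; it is an illustration of that assumption, not the library's actual implementation. `NamedEdge`, `IndexedEdge` and `index_in_order_of_appearance` are names invented for the example.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// An edge as written in the DOT file: a pair of node ids (plain strings).
using NamedEdge = std::pair<std::string, std::string>;
// The same edge after loading: a pair of vertex indices.
using IndexedEdge = std::pair<std::size_t, std::size_t>;

// Hypothetical helper (not Boost API): give each node id the next free
// index the first time it is seen, then rewrite the edges with those indices.
// `index_of` receives the resulting id -> index mapping.
std::vector<IndexedEdge> index_in_order_of_appearance(
    const std::vector<NamedEdge>& edges,
    std::map<std::string, std::size_t>& index_of)
{
    // emplace leaves an existing entry untouched, so a known id keeps
    // the index it was given on first sight.
    auto lookup = [&index_of](const std::string& id) {
        return index_of.emplace(id, index_of.size()).first->second;
    };

    std::vector<IndexedEdge> result;
    for (const NamedEdge& e : edges) {
        const std::size_t u = lookup(e.first);   // source is seen first
        const std::size_t v = lookup(e.second);  // then the target
        result.emplace_back(u, v);
    }
    return result;
}
```

Fed the seven edges of the source file from the original post, this maps the ids 0, 3, 4, 1, 2 to the indices 0, 1, 2, 3, 4 and yields the edges 0--1, 0--2, 3--1, 3--2, 4--1, 4--2, 1--2, which is the edge list of the written-out file.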

Brian Stadler

5:45 p.m.

Good observation. I see exactly what it's doing now.

However, this is still a bug. If one is using dot language files across different programs, then every parser needs to interpret them the same way. Though the graph may retain its overall structure, the structure around an individual vertex has now changed. This will adversely affect the way algorithms make entry into the graph. For example, if vertex 0 once had a degree of two and now has a degree of three, then the choices for moving from 0 have changed from two to three.

thanks again for the response!

On 9/17/07, Krishna Roskin <krish@soe.ucsc.edu> wrote:

...


HTH, -krish _______________________________________________ Boost-users mailing list Boost-users@lists.boost.org http://lists.boost.org/mailman/listinfo.cgi/boost-users
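The per-vertex concern can be checked against the two edge lists posted in this thread. This is an illustrative stand-alone sketch, not Boost code; `LabelledEdge`, `degree_by_label`, `source_edges` and `written_edges` are names made up for the example. It counts, for each label as printed in a file, how many edges touch it.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// An edge between two labels exactly as they are printed in a DOT file.
using LabelledEdge = std::pair<std::string, std::string>;

// Hypothetical helper: count how many edges touch each printed label.
std::map<std::string, std::size_t> degree_by_label(
    const std::vector<LabelledEdge>& edges)
{
    std::map<std::string, std::size_t> degree;
    for (const LabelledEdge& e : edges) {
        ++degree[e.first];
        ++degree[e.second];
    }
    return degree;
}

// Edges of the source file, copied from the original post.
const std::vector<LabelledEdge> source_edges = {
    {"0", "3"}, {"0", "4"}, {"1", "3"}, {"1", "4"},
    {"2", "3"}, {"2", "4"}, {"3", "4"}};

// Edges of the file written back out, copied from the original post.
const std::vector<LabelledEdge> written_edges = {
    {"0", "1"}, {"0", "2"}, {"3", "1"}, {"3", "2"},
    {"4", "1"}, {"4", "2"}, {"1", "2"}};
```

With these lists, label "0" has degree 2 in both files, while label "1" goes from degree 2 in the source to 4 in the written file and label "3" goes from 4 to 2. A program that takes the written labels at face value therefore sees different neighbourhoods, even though the two graphs are the same up to renaming.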

Ronald Garcia

18 Sep 2007

3:49 p.m.

Hi Brian,

I can see from your code snippet how you are reading the graph in. Could you show the code for how you are writing the graph out? I'm not sure if the interface has changed since 1.33.1, but with the version in CVS, you can pass along the dynamic_properties object and the name of the property to use for vertex names in order to get the same graph out:

    // Graph structure with dynamic property output
    template<typename Graph>
    void write_graphviz(std::ostream& out, const Graph& g,
                        const dynamic_properties& dp,
                        const std::string& node_id = "node_id");

HTH, ron

On Sep 17, 2007, at 1:45 PM, Brian Stadler wrote:

...


Brian Stadler

19 Sep 2007

2:41 p.m.

Ron,

Code for writing is very simple. I don't use this code unless I am doing testing.

    ofstream fo(filename.c_str());
    write_graphviz(fo, [graph here]);
    fo.close();

The function you speak of is listed on http://boost.org/libs/graph/doc/write-graphviz.html. It is included with 1.33.1.

To be honest, I wish read_graphviz() did not require a dynamic properties object. If one wants to read a very simple dot file without any extra stuff, then one basically has to create a fake dp object just to do this.

Quick note, I changed the dot files I was reading in from the original post to the following, which is the way the boost library outputs them.

    graph G {
    "0";
    "1";
    "2";
    "0" -- "1";
    "0" -- "2";
    "1" -- "2";
    }

Vertices are quoted but boost is able to handle it correctly.

--Brian

On 9/18/07, Ronald Garcia <garcia@cs.indiana.edu> wrote:

...


Ronald Garcia

3:01 p.m.

Hi Brian, On Sep 19, 2007, at 10:41 AM, Brian Stadler wrote:

...

Code for writing is very simple. I don't use this code unless I am doing testing.

    ofstream fo(filename.c_str());
    write_graphviz(fo, [graph here]);
    fo.close();

...

The function you speak of is listed on http://boost.org/libs/graph/doc/write-graphviz.html. It is included with 1.33.1.

Based on my reading of the write_graphviz docs, it seems that the function that I suggested will get you the results that you want. Also, I should mention that you don't need to keep the dynamic_properties object around to get the vertex names. They are also available directly from the internal vertex_name property map of your graph. You could use one of the versions of write_graphviz that takes a VertexID argument instead.

...

To be honest, I wish read_graphviz() did not require a dynamic properties object. If one wants to read a very simple dot file without any extra stuff then they basically have to create a fake dp object just to do this.

Bear in mind that the graphs you are parsing with read_graphviz are a special case: the names of the vertices happen to be integral numbers. In general, graphviz graphs can have almost arbitrary string identifiers as node id's. Even though node id's do not have to be in quotes, they're still just character strings as far as the DOT language is concerned. I think that in the most common cases:

1) the names of graph nodes are arbitrary strings (not integers)
2) the consumer of the DOT-specified graph cares about those names

The dynamic_properties object at the least guarantees that you can associate the names of the vertices with the vertices that are created in your graph object. The problem that you originally reported is that you were indeed losing the node name information (the string representations of integers) when you were writing out your graphs using write_graphviz. That information is preserved in the vertex_name property map that you passed along to dynamic_properties, and also in the dynamic_properties

...

Quick note, I changed the dot files I was reading in from the original post to the following, which is the way the boost library outputs them.

graph G { "0"; "1"; "2"; "0" -- "1"; "0" -- "2"; "1" -- "2"; }

Vertices are quoted but boost is able to handle it correctly.

This is actually a good demonstration of how graphviz node id's are really strings, not numbers. The version of the graph output by write_graphviz may be textually different (because of the addition of quotation marks), but as far as any DOT tool is concerned, it's the same graph. HTH, ron
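The point that the names survive in a property map can be sketched without Boost. Assume, hypothetically, that loading left one stored name per vertex index (standing in for the vertex_name property) plus an index-based edge list. `IndexPair` and `to_dot_with_names` are invented for this illustration and are not Boost functions. Writing the stored names instead of the indices gives back the original labelling.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// An edge between two vertex indices, as held in memory after loading.
using IndexPair = std::pair<std::size_t, std::size_t>;

// Hypothetical helper (not Boost API): emit DOT text that labels each vertex
// with its stored name rather than its index. names[i] stands in for the
// vertex_name property of vertex i. Assumes names contain no quote characters
// and every index in `edges` is smaller than names.size().
std::string to_dot_with_names(const std::vector<std::string>& names,
                              const std::vector<IndexPair>& edges)
{
    std::ostringstream out;
    out << "graph G {\n";
    for (const std::string& n : names)
        out << "\"" << n << "\";\n";
    for (const IndexPair& e : edges)
        out << "\"" << names[e.first] << "\" -- \""
            << names[e.second] << "\";\n";
    out << "}\n";
    return out.str();
}
```

For the graph in this thread, the stored names in index order would be 0, 3, 4, 1, 2. Passing them together with the index edges 0--1, 0--2, 3--1, 3--2, 4--1, 4--2, 1--2 writes the edges as "0" -- "3", "0" -- "4", "1" -- "3" and so on, matching the source file.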

Brian Stadler

20 Sep 2007

2:29 a.m.

Hello again,

I'm still not convinced boost is interpreting these files correctly. I understand and agree that node id's are strings. I do not understand why you think these graph files are a special case.

Answer me the following. Why is it that when I process the following using the dot binary I get back a picture with the correct interpretation, but when I read the same file using the boost library its interpretation is skewed?

    strict graph {
    0 -- 3;
    0 -- 4;
    1 -- 3;
    1 -- 4;
    2 -- 3;
    2 -- 4;
    3 -- 4;
    }

On 9/19/07, Ronald Garcia <garcia@cs.indiana.edu> wrote:

...


Ronald Garcia

5:04 a.m.

On Sep 19, 2007, at 10:29 PM, Brian Stadler wrote:

...

Hello again,

I'm still not convinced boost is interpreting these files correctly. I understand and agree that node id's are strings. I do not understand why you think these graph files are a special case.

It is a special case that your node id's happen to be integers. The node name information is stored in the dynamic_properties (and in the internal vertex_name property of your graph). The write_graphviz operation requires you to explicitly pass along a mapping from nodes to ids if you want to control the ids of the vertices.

...

Answer me the following. Why is it that when I process the following using the dot binary I get back a picture with the correct interpretation, but when I read the same file using the boost library its interpretation is skewed?

What precisely do you mean when you say that its interpretation is "skewed"? Cheers, ron

Brian Stadler

28 Sep 2007

3:15 p.m.

Hello again,

I apologize for the late reply. I've been busy.

To restate the question, I originally claimed that a file being interpreted with read_graphviz() was being interpreted incorrectly. The file in question is below.

Source graphviz file:

    strict graph {
    0 -- 3;
    0 -- 4;
    1 -- 3;
    1 -- 4;
    2 -- 3;
    2 -- 4;
    3 -- 4;
    }

I then claimed there was a problem, and after looking at the output from write_graphviz() I noticed that what was read and what was written were different. The output from write_graphviz() after reading it with read_graphviz() is below. Notice that the labeling is no longer the same but the actual structure of the graph is retained, e.g. it still looks like a triangle or something.

    graph G {
    0;
    1;
    2;
    3;
    4;
    0--1 ;
    0--2 ;
    3--1 ;
    3--2 ;
    4--1 ;
    4--2 ;
    1--2 ;
    }

So, after this, I believed it was a problem with boost and I still believe that. I took this same file and ran it through graphviz's dot with the following command and received the output below. (command was dot -o <source file> output went to stdout)

    strict graph {
    node [label="\N"];
    graph [bb="0,0,201,180"];
    0 [pos="27,162", width="0.75", height="0.50"];
    3 [pos="82,90", width="0.75", height="0.50"];
    4 [pos="109,18", width="0.75", height="0.50"];
    1 [pos="100,162", width="0.75", height="0.50"];
    2 [pos="174,162", width="0.75", height="0.50"];
    0 -- 3 [pos="39,146 48,134 61,118 70,106"];
    0 -- 4 [pos="28,144 29,125 33,94 46,72 56,54 74,40 89,30"];
    1 -- 3 [pos="96,144 93,133 89,119 86,108"];
    1 -- 4 [pos="107,144 111,134 115,120 118,108 122,83 117,54 113,36"];
    2 -- 3 [pos="156,148 140,135 116,116 100,104"];
    2 -- 4 [pos="166,145 154,117 129,63 117,35"];
    3 -- 4 [pos="89,72 93,61 98,47 103,36"];
    }

Notice how graphviz preprocesses the file and creates the nodes with the correct labels first. Then it goes back and creates the edges between the nodes. Boost doesn't do this. With the original file (seen below), read_graphviz() reads and creates nodes as it goes, in a linear fashion.

For example, with the first line it reads 0 -- 3 and creates node 0, as it should, and goes on to read 3. But the problem starts here. Instead of preprocessing the entire file, read_graphviz() simply creates nodes as it goes. So it created a node internally, starting at 0. Then, when it came time to create another node, it just counted up to 1, created it and gave it the label 3. This is where I believe read_graphviz() is broken. This is not the way graphviz's programs interpret the file. I believe read_graphviz() should interpret the language the same way its creators' tools do. I find it silly to have to include code with appropriate labeling when reading such simple files. Nothing extra should be needed to interpret something like this in read_graphviz().

    strict graph {
    0 -- 3;
    0 -- 4;
    1 -- 3;
    1 -- 4;
    2 -- 3;
    2 -- 4;
    3 -- 4;
    }

I hope this clears up any previous misunderstandings.

Sincerely, --Brian

On 9/20/07, Ronald Garcia <garcia@cs.indiana.edu> wrote:

...

