My objective is to extract the paths that carry a certain label in a dot file. This is the first time I have worked with a dot file, and I have no idea how to extract its labels using Python. For instance, in the dot file below, I want to extract the path that belongs to the label "V1". Here is my dot file -
digraph "MVICFG" {
label="MVICFG";
/* Generating Nodes */
subgraph cluster_1 {
label="main";
"6" [label="4294967294::Entry::main"];
"2" [label="0:: %1 = alloca i32, align 4"];
"3" [label="0:: store i32 0, i32* %1, align 4"];
"4" [label="0:: %2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([12 x i8], [12 x i8]* @.str, i32 0, i32 0)), !dbg !13"];
"5" [label="4:: ret i32 0, !dbg !14"];
"7" [label="4294967293::Exit::main"];
"11" [label="3:: %1 = alloca i32, align 4"];
"12" [label="3:: store i32 0, i32* %1, align 4"];
"13" [label="3:: %2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i32 0, i32 0)), !dbg !13"];
}
subgraph cluster_9 {
label="External_Node_Func";
"10" [label="4294967294::External_Node"];
}
/* Generating Edges */
"2" -> "3" [arrowhead = normal, penwidth = 1.0, color = black, label="V1"];
"3" -> "4" [arrowhead = normal, penwidth = 1.0, color = black, label="V1"];
"6" -> "2" [arrowhead = normal, penwidth = 1.0, color = pink, label="V1::Virtual"];
"5" -> "7" [arrowhead = normal, penwidth = 1.0, color = pink, label="V1,V2::Virtual"];
"4" -> "5" [arrowhead = normal, penwidth = 1.0, color = black, label="V1"];
"6" -> "11" [arrowhead = normal, penwidth = 1.0, color = pink, label="V2::Virtual"];
"13" -> "5" [arrowhead = normal, penwidth = 1.0, color = black, label="V2"];
"11" -> "12" [arrowhead = normal, penwidth = 1.0, color = black, label="V2"];
"12" -> "13" [arrowhead = normal, penwidth = 1.0, color = black, label="V2"];
}
Here is what I've done - I looked into pydot, a popular Python library for working with dot files. I wrote the following code, but couldn't get to the stage of extracting labels.
import pydot
dot_string = """digraph "MVICFG" {
label="MVICFG";
/* Generating Nodes */
subgraph cluster_1 {
label="main";
"6" [label="4294967294::Entry::main"];
"2" [label="0:: %1 = alloca i32, align 4"];
"3" [label="0:: store i32 0, i32* %1, align 4"];
"4" [label="0:: %2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([12 x i8], [12 x i8]* @.str, i32 0, i32 0)), !dbg !13"];
"5" [label="4:: ret i32 0, !dbg !14"];
"7" [label="4294967293::Exit::main"];
"11" [label="3:: %1 = alloca i32, align 4"];
"12" [label="3:: store i32 0, i32* %1, align 4"];
"13" [label="3:: %2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i32 0, i32 0)), !dbg !13"];
}
subgraph cluster_9 {
label="External_Node_Func";
"10" [label="4294967294::External_Node"];
}
/* Generating Edges */
"2" -> "3" [arrowhead = normal, penwidth = 1.0, color = black, label="V1"];
"3" -> "4" [arrowhead = normal, penwidth = 1.0, color = black, label="V1"];
"6" -> "2" [arrowhead = normal, penwidth = 1.0, color = pink, label="V1::Virtual"];
"5" -> "7" [arrowhead = normal, penwidth = 1.0, color = pink, label="V1,V2::Virtual"];
"4" -> "5" [arrowhead = normal, penwidth = 1.0, color = black, label="V1"];
"6" -> "11" [arrowhead = normal, penwidth = 1.0, color = pink, label="V2::Virtual"];
"13" -> "5" [arrowhead = normal, penwidth = 1.0, color = black, label="V2"];
"11" -> "12" [arrowhead = normal, penwidth = 1.0, color = black, label="V2"];
"12" -> "13" [arrowhead = normal, penwidth = 1.0, color = black, label="V2"];
}
"""
graphs = pydot.graph_from_dot_data(dot_string)
graph = graphs[0]
Update 1:
If I am looking for the edges corresponding to the label "V1", I'd like this type of output -
"2" -> "3"
"3" -> "4"
"4" -> "5"
I can get that from the code that SultanOrazbayev posted by adding the following line -
G_sub.edges
Assuming that the dot file is named test.dot, the following procedure will use networkx to load the dot file (this requires pydot to be installed also), and then filter the edges, returning a subgraph with the desired edges.
from networkx import subgraph_view
from networkx.drawing.nx_pydot import read_dot
# load the dot file
G = read_dot('test.dot')
# define the function to filter edges
def filter_edge(source, target, edge_id):
    """Note this function hardcodes the desired edge label,
    also note the nested quoting of the label to match the raw data."""
    # return an explicit bool instead of falling through to None
    return G[source][target][edge_id].get('label') == '"V1"'
G_sub = subgraph_view(G, filter_edge=filter_edge)
print(G_sub)
# MultiDiGraph named 'MVICFG' with 10 nodes and 3 edges
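Note that the strict comparison with '"V1"' above skips the edges labelled "V1::Virtual" and "V1,V2::Virtual". If those should count as V1 edges too, a more lenient filter could look roughly like the sketch below. It assumes that a label is a comma-separated list of versions optionally followed by "::Virtual", which is only inferred from the sample file. The names label_versions and make_edge_filter are made up for illustration, and a plain nested dict stands in for the graph so the snippet runs without networkx.

```python
def label_versions(raw_label):
    """Return the set of version tags found in a raw DOT edge label.

    Assumed layout (inferred from the sample file, not from any spec):
    comma-separated versions, optionally followed by '::Virtual'.
    """
    if raw_label is None:
        return set()
    text = raw_label.strip('"')  # the raw attribute keeps its DOT quotes
    versions_part = text.split("::", 1)[0]  # drop a trailing '::Virtual'
    return {v.strip() for v in versions_part.split(",") if v.strip()}


def make_edge_filter(graph, version):
    """Build a callback taking (source, target, edge_id) for a multigraph-like mapping."""
    def filter_edge(source, target, edge_id):
        label = graph[source][target][edge_id].get("label")
        return version in label_versions(label)
    return filter_edge


# Stand-in for G[source][target][edge_id] -> attribute dict
sample = {
    "2": {"3": {0: {"label": '"V1"'}}},
    "5": {"7": {0: {"label": '"V1,V2::Virtual"'}}},
    "6": {"11": {0: {"label": '"V2::Virtual"'}}},
}
keep_v1 = make_edge_filter(sample, "V1")
print(keep_v1("2", "3", 0), keep_v1("5", "7", 0), keep_v1("6", "11", 0))
# True True False
```

With the real graph, the same callback should be usable as subgraph_view(G, filter_edge=make_edge_filter(G, "V1")), since G[source][target][edge_id] is indexed the same way as in the original filter.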
If you also want to remove the isolates, then use the relevant networkx function:
from networkx import isolates, MultiDiGraph
# make a modifiable copy of the graph
G_sub = MultiDiGraph(G_sub)
# identify which nodes to remove
remove_nodes = list(isolates(G_sub))
G_sub.remove_nodes_from(remove_nodes)
print(G_sub)
# MultiDiGraph named 'MVICFG' with 4 nodes and 3 edges
Note that the result of isolates is stored in a list; this is to avoid iterating over a graph that is being modified (see this PR and associated GH issue).
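To print the remaining edges in the "2" -> "3" form asked for in the question, a small formatting helper may be enough. The sketch below is only an illustration: format_edges is a made-up name, and the hardcoded tuples stand in for what iterating over G_sub.edges would yield.

```python
def format_edges(edges):
    """Render (source, target, ...) tuples as DOT-style '"a" -> "b"' lines."""
    lines = []
    for edge in edges:
        # only the first two items matter; an optional edge key is ignored
        source, target = str(edge[0]), str(edge[1])
        # strip any existing quotes first so each name is quoted exactly once
        lines.append('"{}" -> "{}"'.format(source.strip('"'), target.strip('"')))
    return lines


# Stand-in for the edges of the filtered graph
sample_edges = [("2", "3", 0), ("3", "4", 0), ("4", "5", 0)]
for line in format_edges(sample_edges):
    print(line)
# "2" -> "3"
# "3" -> "4"
# "4" -> "5"
```

Because the helper reads only the first two items of each tuple, it should work whether or not the edge tuples carry a key, for example format_edges(G_sub.edges).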