Tag Archives: graph

Analyzing Graphs with Julia

By: DSB

Re-posted from: https://medium.com/coffee-in-a-klein-bottle/analyzing-graphs-with-julia-38e26d1d2f62?source=rss-8bd6ec95ab58------2

A brief tutorial on how to use Julia to analyze graphs using the JuliaGraphs packages

Example of Graph created using LigthsGraphs.jl and VegaLite.jl

First of all, let’s be clear. The goal of this article is to briefly introduce how to use Julia for analyzing graphs. Besides the many types of graphs (undirected, directed, bipartite, weighted…), there are also many methods for analyzing them (degree distribution, centrality measures, clustering measures, visual layouts …). Hence, a comprehensive introduction to Graph Analysis with Julia would be too large of a task.

Therefore, this tutorial focuses on undirected weighted graphs, since they encompass weightless graphs, and are usually more common than directed graphs¹.

JuliaGraphs Project

Almost every package you will need can be found in the JuliaGraphs Project. The project contains specific packages for plotting, network layouts, weighted graphs, and more. In our example, we’ll be using GraphPlot.jl and SimpleWeightedGraphs.jl. The good thing about the project is that these packages work together and are very similar in design. Hence, the functions you use for creating a weighted graph are very similar to the ones you use for creating a simple graph with LightGraphs.jl.

Creating your first Graph

Let’s create a DataFrame using the DataFrames.jl package. Each column will represent a person, and each row will represent an attribute. Therefore, our graph will be composed of nodes (people) and edges (people share the same attribute).

DataFrame used for creating the Graph

After creating the DataFrame, we create the graph. Note here that they are actually separate objects. To create the graph, you only need to specify the number of nodes, which is the number of columns. Below I present the code for the creation of the DataFrame and the Graph.

The nodes were inserted, now we have to create the appropriate edges. This is done by using the command add_edge!() which takes the graph, the nodes that must be connected, and the edge weight.

In our example, the the weight is equal to the number of shared attributes between two columns. For example, the first and second column both have 1’s in rows 1 and 7. Therefore, they share an edge with weight 2:

add_edge!(g,1,2,2) # add_edge!(graph, node_1, node_2, weight)

The following code loops through the data, and adds the edges to the graph.

Visualizing the Graph

With this, our graph is ready to be visualized. This can be easily done with the following command:

gplot(g,nodelabel=names(df),edgelinewidth=ew)
Output from gplot()

There are several different layouts to chose from, just take a look at the GraphPlots.jl page. Here is another example:

gplot(g,nodelabel=names(df),edgelinewidth=ew,layout=circular_layout)
Output of gplot() using another layout

Centrality and Minimum Spanning Tree

Let’s do some analysis in this graph. As I said in the beginning, there are many way to analyze a graph. Two very common methods are studying the centrality of each node, and creating a Minimum Spanning Tree (MST). The goal of this article is not to explain theoretical aspects of graphs, so I’ll assume that the reader knows what I’m talking about.

Here is an example of how to calculate the betweeness centrality and visualizing the results. Note that I added 1 in the first line of code just to enable a better visualization.

Output of code above. The nodes colors represent the centrality.

Finally, one can also easily create a MST. To do so, we must create a new graph, since we’ll be removing edges and adjusting the nodes’ locations. The code below shows how to do this. The only new function here is the kruskal_mst() which will give us the MST. You can choose if the MST is minimizing of maximizing the path. In our case, we want to create a tree the contains only the “strongest” connections, hence, we are maximizing.

MST from the code above.

Conclusion

This is the end of this brief tutorial. There are much more functionalities in the JuliaGraphs project, enabling a much more thorough analysis than the one presented here. Also, one can create more interactive visualizations using VegaLite.jl, but I’ll leave that for another article.

¹ Authors opinion.

² A Jupyter Notebook can also be found on Github.


Analyzing Graphs with Julia was originally published in Coffee in a Klein Bottle on Medium, where people are continuing the conversation by highlighting and responding to this story.

#MonthOfJulia Day 24: Graphs

Julia-Logo-Graphs

If you’re not too familiar Graph Theory, then it might be an idea to take a moment to get the basics. Graphs are an extremely versatile data structure for storing data consisting of linked entities. I’m going to look at two packages for managing graphs in Julia: LightGraphs and Graphs.

LightGraphs

As usual, the first step is to load the package.

julia> using LightGraphs

LightGraphs has methods which generate a selection of standard graphs like StarGraph(), WheelGraph() and FruchtGraph(). There are also functions for random graphs, for example, erdos_renyi() and watts_strogatz(). We’ll start off by creating two small graphs. One will have 10 nodes connected by 20 random edges. The other will be a directed star graph consisting of four nodes, the central node being connected to every other node.

julia> g1 = Graph(10, 20)
{10, 20} undirected graph
julia> g2 = StarDiGraph(4)
{4, 3} directed graph
julia> edges(g2)
Set{Pair{Int64,Int64}}({edge 1 - 2,edge 1 - 4,edge 1 - 3})

It’s simple to find the degree and neighbours of a given node.

julia> degree(g1, 4)                       # How many neighbours for vertex 4?
6
julia> neighbors(g1, 4)                    # Find neighbours of vertex 4
6-element Array{Int64,1}:
 1
 3
 6
 2
 9
 7

There’s a straightforward means to add and remove edges from the graph.

julia> add_edge!(g1, 4, 8)                 # Add edge between vertices 4 and 8
edge 4 - 8
julia> rem_edge!(g1, 4, 6)                 # Remove edge between vertices 4 and 6
edge 6 - 4

The package has functionality for performing high level tests on the graph (checking, for instance, whether it is cyclic or connected). There’s also support for path based algorithms, but we’ll dig into those when we look at the Graphs package.

Graphs

Before we get started with the Graphs package you might want to restart your Julia session to purge all of that LightGraphs goodness. Take a moment to browse the Graphs.jl documentation, which is very comprehensive.

julia> using Graphs

As with LightGraphs, there are numerous options for generating standard graphs.

julia> g1a = simple_frucht_graph()
Undirected Graph (20 vertices, 18 edges)
julia> g1b = simple_star_graph(8)
Directed Graph (8 vertices, 7 edges)
julia> g1c = simple_wheel_graph(8)
Directed Graph (8 vertices, 14 edges)

Graphs uses the GraphViz library to generate plots.

julia> plot(g1a)

sample-graph

Of course, a graph can also be constructed manually.

julia> g2 = simple_graph(4)
Directed Graph (4 vertices, 0 edges)
julia> add_edge!(g2, 1, 2)
edge [1]: 1 -- 2
julia> add_edge!(g2, 1, 3)
edge [2]: 1 -- 3
julia> add_edge!(g2, 2, 3)
edge [3]: 2 -- 3

Individual vertices (a vertex is the same as a node) can be interrogated. Since we are considering a directed graph we look separately at the edges exiting and entering a node.

julia> num_vertices(g2)
4
julia> vertices(g2)
1:4
julia> out_degree(1, g2)
2
julia> out_edges(1, g2)
2-element Array{Edge{Int64},1}:
 edge [1]: 1 -- 2
 edge [2]: 1 -- 3
julia> in_degree(2, g2)
1
julia> in_edges(2, g2)
1-element Array{Edge{Int64},1}:
 edge [1]: 1 -- 2

Vertices can be created with labels and attributes.

julia> V1 = ExVertex(1, "V1");
julia> V1.attributes["size"] = 5.0
5.0
julia> V2 = ExVertex(2, "V2");
julia> V2.attributes["size"] = 3.0
3.0
julia> V3 = ExVertex(3, "V3")
vertex [3] "V3"

Those vertices can then be used to define edges, which in turn can have labels and attributes.

julia> E1 = ExEdge(1, V1, V2)
edge [1]: vertex [1] "V1" -- vertex [2] "V2"
julia> E1.attributes["distance"] = 50
50
julia> E1.attributes["color"] = "green"
"green"

Finally the collection of vertices and edges can be gathered into a graph.

julia> g3 = edgelist([V1, V2], [E1], is_directed = true)
Directed Graph (2 vertices, 1 edges)

It’s possible to systematically visit all connected vertices in a graph, applying an operation at every vertex. traverse_graph() performs the graph traversal using either a depth first or breadth first algorithm. In the sample code below the operation applied at each vertex is LogGraphVisitor(), which is a simple logger.

julia> traverse_graph(g1c, DepthFirst(), 1, LogGraphVisitor(STDOUT))
discover vertex: 1
examine neighbor: 1 -> 2 (vertexcolor = 0, edgecolor= 0)
discover vertex: 2
open vertex: 2
examine neighbor: 2 -> 3 (vertexcolor = 0, edgecolor= 0)
discover vertex: 3
open vertex: 3
examine neighbor: 3 -> 4 (vertexcolor = 0, edgecolor= 0)
discover vertex: 4
open vertex: 4
examine neighbor: 4 -> 5 (vertexcolor = 0, edgecolor= 0)
discover vertex: 5
open vertex: 5
examine neighbor: 5 -> 6 (vertexcolor = 0, edgecolor= 0)
discover vertex: 6
open vertex: 6
examine neighbor: 6 -> 7 (vertexcolor = 0, edgecolor= 0)
discover vertex: 7
open vertex: 7
examine neighbor: 7 -> 8 (vertexcolor = 0, edgecolor= 0)
discover vertex: 8
open vertex: 8
examine neighbor: 8 -> 2 (vertexcolor = 1, edgecolor= 0)
close vertex: 8
close vertex: 7
close vertex: 6
close vertex: 5
close vertex: 4
close vertex: 3
close vertex: 2
examine neighbor: 1 -> 3 (vertexcolor = 2, edgecolor= 0)
examine neighbor: 1 -> 4 (vertexcolor = 2, edgecolor= 0)
examine neighbor: 1 -> 5 (vertexcolor = 2, edgecolor= 0)
examine neighbor: 1 -> 6 (vertexcolor = 2, edgecolor= 0)
examine neighbor: 1 -> 7 (vertexcolor = 2, edgecolor= 0)
examine neighbor: 1 -> 8 (vertexcolor = 2, edgecolor= 0)
close vertex: 1

We can use Dijkstra’s Algorithm to calculate the distance from a given vertex to all other vertices in the graph. We see, for instance, that the distance from vertex 1 to vertex 4 is three steps. Since vertex 1 and vertex 20 are not connected, the distance between them is infinite. There are a couple of other algorithms available for calculating shortest paths.

julia> distances = ones(num_edges(g1a));   # Assign distance of 1 to each edge.
julia> d = dijkstra_shortest_paths(g1a, distances, 1);
julia> d.dists                             # Vector of distances to all other vertices.
20-element Array{Float64,1}:
   0.0
   1.0
   2.0
   3.0
   3.0
   2.0
   1.0
   1.0
   3.0
   4.0
   2.0
   2.0
 Inf  
 Inf  
 Inf  
 Inf  
 Inf  
 Inf  
 Inf  
 Inf  

As with the most of the packages that I have looked at already, the functionality summarised above is just a small subset of what’s available. Have a look at the home pages for these packages and check out the full code for today (which looks at a number of other features) on github. Some time in the future I plan on looking at the EvolvingGraphs which caters for graphs where the structure changes with time.

493-drawing-stars-0

The post #MonthOfJulia Day 24: Graphs appeared first on Exegetic Analytics.