# Basics of the igraph Package

There are multiple packages for the analysis of networks in R. This page concentrates on the `igraph` package, which supports a broad range of applications. Before going into more detail, it is useful to know that there are two common ways to represent the *edges*, i.e. the connections, of a network:

- **Adjacency matrix**: This is a square matrix, where each row and each column corresponds to an entity. If two entities are connected, the respective field in the matrix takes the value one, and zero otherwise.
- **A list of connections**: In its simplest form this is a list, where each edge of a network is represented by a row with one entity in the first column and the other in the second. For the purpose of this post this is our preferred representation.
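To make the two representations concrete, here is a minimal sketch in base R. The three-node network and the names `nodes`, `edge_list` and `adj` are made up for illustration and are not part of the example used in the rest of this post.

```r
# Hypothetical toy network with three entities and two undirected edges
nodes <- c("a", "b", "c")

# List of connections: one row per edge
edge_list <- data.frame(from = c("a", "b"),
                        to   = c("b", "c"),
                        stringsAsFactors = FALSE)

# Adjacency matrix: start with zeros, then mark every edge with a one
adj <- matrix(0, nrow = 3, ncol = 3, dimnames = list(nodes, nodes))
for (k in seq_len(nrow(edge_list))) {
  i <- edge_list$from[k]
  j <- edge_list$to[k]
  adj[i, j] <- 1
  adj[j, i] <- 1  # undirected network: mirror the entry
}

adj
```

Both objects describe the same network. The matrix has one field per pair of entities, whereas the list only stores the pairs that are actually connected.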

## Create an artificial graph

We can illustrate these two representations by looking at an artificial network. Such a network could be generated with certain functions of the `igraph` package. However, the following code only uses base R. It results in a data frame with the names of the connected entities in the first and second column. The third column contains a random indicator for the strength of the connection. It is based on the square of a value drawn from a standard normal distribution. To reduce the number of resulting edges, only values above a certain threshold are kept. The code also excludes the connections a node has with itself.

```
# Set seed for reproducibility
set.seed(12345)
# Generate data
raw <- data.frame(byer = rep(letters, each = 26),
                  sllr = rep(letters, 26),
                  con = rnorm(26 * 26)^2) # Calculate arbitrary weights
# Drop connections with oneself
raw <- raw[raw$byer != raw$sllr,]
# Only keep strong connections
raw <- raw[abs(raw$con) > 1,]
# Reformat row numbers
rownames(raw) <- NULL
# Look at result
head(raw)
```

```
##   byer sllr      con
## 1    a    f 3.304964
## 2    a    l 3.302623
## 3    a    s 1.255997
## 4    a    v 2.119310
## 5    a    x 2.412236
## 6    a    y 2.552676
```
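As a rough plausibility check on the threshold, the share of rows that survive it can be worked out directly. The squared value of a standard normal draw exceeds 1 exactly when the draw itself lies outside the interval from -1 to 1. The variable names below are hypothetical and only serve this sketch.

```r
# Share of squared standard normal draws above 1: P(Z^2 > 1) = P(|Z| > 1)
p_keep <- 2 * pnorm(-1)
p_keep  # roughly 0.317

# Ordered pairs of distinct letters left after dropping self-connections
n_pairs <- 26 * 26 - 26

# Expected number of rows that pass the threshold
n_pairs * p_keep
```

So on average a little under a third of the 650 possible ordered pairs remain. The exact number for a given seed will differ from this expectation.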

Basically, the data frame `raw` is a list representation of the artificial network. In the next step we convert it to an `igraph` object.

## `igraph` objects

```
# Load package
library(igraph)
```

The transformation is straightforward and can be done with the `graph_from_data_frame` function, where for simplicity we set the argument `directed = FALSE` to indicate that the network is not directed. *Note that it is important that the first and second column of the data frame contain the names of the nodes that have a connection.*

`graph_df <- graph_from_data_frame(raw, directed = FALSE)`
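The following sketch shows what the conversion does with a small data frame. The data frame `toy` and its values are invented for illustration. The sketch assumes that the `igraph` package is installed.

```r
library(igraph)

# Hypothetical toy data: the first two columns name the nodes,
# the third column is an additional attribute of each connection
toy <- data.frame(from = c("a", "a", "b"),
                  to   = c("b", "c", "c"),
                  con  = c(1.5, 2.0, 3.2),
                  stringsAsFactors = FALSE)

toy_graph <- graph_from_data_frame(toy, directed = FALSE)

vcount(toy_graph)   # number of nodes
ecount(toy_graph)   # number of edges
V(toy_graph)$name   # node names taken from the first two columns
E(toy_graph)$con    # the remaining column is stored as an edge attribute
```

Any columns after the first two are treated as edge attributes, which is why the column order of the data frame matters.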

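The data frame can contain the same pair of nodes twice, once in each order, which results in multiple edges in an undirected graph. The function `simplify` can be used to get rid of loops and multiple edges: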

`graph_df <- simplify(graph_df)`
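The effect of `simplify` is easier to see on a small case. The sketch below uses an invented data frame with one duplicated pair and one loop. It also uses the `edge.attr.comb` argument of `simplify` to sum the attribute `con` over merged edges. That is one possible choice for illustration and not what the code in this post does.

```r
library(igraph)

# Hypothetical toy data with a duplicated pair (a-b and b-a) and a loop (c-c)
toy <- data.frame(from = c("a", "b", "c", "b"),
                  to   = c("b", "a", "c", "c"),
                  con  = c(1.5, 2.5, 4.0, 3.0),
                  stringsAsFactors = FALSE)
toy_graph <- graph_from_data_frame(toy, directed = FALSE)
ecount(toy_graph)   # counts the duplicate and the loop as well

# Default call: duplicates are merged and the loop is removed
toy_simple <- simplify(toy_graph)
ecount(toy_simple)
edge_attr_names(toy_simple)

# Alternative: keep the attribute by summing it over merged edges
toy_summed <- simplify(toy_graph, edge.attr.comb = list(con = "sum"))
E(toy_summed)$con
```

With the default settings the edge attribute `con` is not carried over to the simplified graph. If the connection strength is needed later, an explicit rule such as the one above has to be supplied.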

Now take a look at the object:

`graph_df`

```
## IGRAPH 2e4c8fd UN-- 26 173 --
## + attr: name (v/c)
## + edges from 2e4c8fd (vertex names):
## [1] a--f a--g a--l a--m a--s a--v a--x a--y a--z b--d b--f b--g b--h b--k b--l
## [16] b--m b--o b--p b--q b--r b--t b--u b--v b--w b--x b--z c--g c--h c--j c--l
## [31] c--m c--n c--p c--q c--r c--s c--u c--v c--w c--x c--z d--e d--f d--h d--o
## [46] d--q d--t d--x d--z e--f e--g e--h e--i e--j e--k e--o e--q e--r e--s e--t
## [61] e--v e--w e--x e--y e--z f--g f--i f--k f--l f--m f--n f--q f--r f--u f--z
## [76] g--l g--m g--p g--s g--t g--u g--w g--x g--y g--z h--m h--o h--r h--s h--t
## [91] h--v h--x h--y h--z i--j i--k i--n i--q i--u i--v i--y i--z j--m j--p j--r
## [106] j--t j--u j--v j--w j--y j--z k--l k--m k--n k--p k--q k--t k--u k--v k--x
## + ... omitted several edges
```

This does not look like the list in the data frame. However, each element of the list of edges still refers to a connection between two entities.

Let's compare this list to the adjacency matrix representation. It can be obtained directly from the `igraph` object by using the `as_adj` function.

`graph_adj <- as_adj(graph_df)`

Now look at the first six rows and columns of the resulting object.

`graph_adj[1:6, 1:6]`

```
## 6 x 6 sparse Matrix of class "dgCMatrix"
## a b c d e f
## a . . . . . 1
## b . . . 1 . 1
## c . . . . . .
## d . 1 . . 1 1
## e . . . 1 . 1
## f 1 1 . 1 1 .
```
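For an undirected network the adjacency matrix is symmetric, and every edge shows up twice in it. The sketch below checks this on an invented three-node graph. It uses `as_adjacency_matrix`, the longer name under which `igraph` provides this conversion, and asks for an ordinary dense matrix via `sparse = FALSE` to keep the checks simple.

```r
library(igraph)

# Hypothetical toy graph with three nodes and three undirected edges
toy <- data.frame(from = c("a", "a", "b"),
                  to   = c("b", "c", "c"),
                  stringsAsFactors = FALSE)
toy_graph <- graph_from_data_frame(toy, directed = FALSE)

# Dense base R matrix instead of a sparse one
toy_dense <- as_adjacency_matrix(toy_graph, sparse = FALSE)
toy_dense

# Each undirected edge appears once above and once below the diagonal
isSymmetric(toy_dense)
sum(toy_dense) / 2 == ecount(toy_graph)
```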

An existing connection between two nodes is indicated by the value one. If there is no connection between them, the field contains a dot. For example, the (undirected) connection between *a* and *f* is described by the 1 in the sixth row of the first column and the 1 in the sixth column of the first row. This corresponds to the first entry in the edge list of the `igraph` object shown above. If the network were directed, only one of the two fields would contain a 1, unless the connection was mutual.
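The directed case can be sketched with a small invented graph in which one connection is one-way and another is mutual. In the matrix returned by `igraph`, rows stand for the node an edge starts from and columns for the node it points to. The object names are hypothetical.

```r
library(igraph)

# Hypothetical directed toy graph: a -> b is one-way, b <-> c is mutual
toy <- data.frame(from = c("a", "b", "c"),
                  to   = c("b", "c", "b"),
                  stringsAsFactors = FALSE)
toy_directed <- graph_from_data_frame(toy, directed = TRUE)

adj_directed <- as_adjacency_matrix(toy_directed, sparse = FALSE)
adj_directed

adj_directed["a", "b"]  # edge from a to b
adj_directed["b", "a"]  # no edge back from b to a
adj_directed["b", "c"]  # mutual connection, first direction
adj_directed["c", "b"]  # mutual connection, second direction
```

Unlike in the undirected case, this matrix is not symmetric, because the edge from *a* to *b* has no counterpart in the other direction.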

Note that the resulting object is not an object of class `igraph`. `as_adj` produces a `dgCMatrix` object, which comes with the `Matrix` package and was specifically designed to implement so-called *sparse matrices*. Sparse matrices are simply matrices which contain a lot of zeros. By storing only the non-zero values, sparse matrices need far less memory and allow for faster computations, which is beneficial for the analysis of very large networks.
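To get a feeling for the savings, the following sketch compares the memory footprint of a sparse and a dense version of the same matrix. The ring-shaped pattern of 1000 nodes is made up for illustration. The sketch assumes the `Matrix` package is available, which is normally the case in a standard R installation.

```r
library(Matrix)

n <- 1000

# Hypothetical sparse pattern: node k is linked to node k + 1,
# and the last node is linked back to the first
i <- 1:n
j <- c(2:n, 1)
sparse_adj <- sparseMatrix(i = i, j = j, x = rep(1, n), dims = c(n, n))

# The same matrix stored with all of its zeros
dense_adj <- as.matrix(sparse_adj)

class(sparse_adj)
object.size(sparse_adj)
object.size(dense_adj)
```

The dense version has to store one million values, although only one thousand of them are different from zero.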

Now that we are familiar with the basic structure of `igraph` objects, we can proceed with the calculation of basic network summary statistics.

## References

Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. *InterJournal, Complex Systems*, 1695.