Objectives
This is an introduction to social networks using built-in functions in R and the packages sna and network. We will learn
- the basic features of the adjacency matrix represented as a matrix object,
- how to calculate the degrees of the nodes, and
- how to calculate some fundamental descriptives of the network.
We will then translate the network from a matrix object to a network object in order to plot the sociogram.
For full details of the packages, see https://cran.r-project.org/web/packages/sna/sna.pdf and https://cran.r-project.org/web/packages/network/network.pdf. For accessible general R help see https://www.statmethods.net/ and for any kind of error message, search https://google.com.
This introduction is deliberately written in inelegant R, using functions that are as basic as possible. Many packages, such as ‘igraph’, offer sleeker and more user-friendly network routines. In particular, I would like to recommend the packages of David Schoch http://mr.schochastics.net/ for accessible and elegant network analysis in R. In general, basic plots in R (described in https://www.statmethods.net/graphs/index.html) are functional, but more advanced and better-looking plots can be achieved through ‘ggplot’.
For basic concepts in network analysis see Robins (2015) and Borgatti, Everett, and Johnson (2018). There is also a handy online book http://faculty.ucr.edu/~hanneman/nettext/ (Hanneman and Riddle 2005).
Build your own network
To use sna (Butts 2016) and network (Butts 2015) for the first time, install the packages
install.packages("sna")
install.packages("network")
Once packages are installed, load them
library("sna")
library("network")
The Matrix
Create an empty adjacency matrix for n = 5 nodes
n <- 5
ADJ <- matrix(0,n,n) # create a matrix with n rows and n columns and all values 0
Add ties \(1 \rightarrow 2\), \(1 \rightarrow 3\), \(2 \rightarrow 3\), \(3 \rightarrow 4\), and \(4 \rightarrow 5\)
ADJ[1,2] <- 1
ADJ[1,3] <- 1
ADJ[2,3] <- 1
ADJ[3,4] <- 1
ADJ[4,5] <- 1
ADJ
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0 1 1 0 0
## [2,] 0 0 1 0 0
## [3,] 0 0 0 1 0
## [4,] 0 0 0 0 1
## [5,] 0 0 0 0 0
To make the network undirected, add the ties \(2 \rightarrow 1\), \(3 \rightarrow 1\), \(3 \rightarrow 2\), \(4 \rightarrow 3\), and \(5 \rightarrow 4\)
ADJ[2,1] <- 1
ADJ[3,1] <- 1
ADJ[3,2] <- 1
ADJ[4,3] <- 1
ADJ[5,4] <- 1
ADJ
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0 1 1 0 0
## [2,] 1 0 1 0 0
## [3,] 1 1 0 1 0
## [4,] 0 0 1 0 1
## [5,] 0 0 0 1 0
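The same symmetric matrix can also be filled in from an edge list rather than tie by tie. The following is a minimal base-R sketch, not the only way to do it; the names `edges` and `ADJ_sym` are made up for illustration.

```r
n <- 5
# Edge list for the five ties used above (one row per tie: sender, receiver)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))

ADJ_sym <- matrix(0, n, n)
ADJ_sym[edges] <- 1            # matrix indexing: fill the ties i -> j
ADJ_sym[edges[, 2:1]] <- 1     # swap the columns to fill the reverse ties j -> i

isSymmetric(ADJ_sym)           # an undirected network has a symmetric matrix
```

Checking `isSymmetric()` is a quick way to confirm that no reverse tie was forgotten.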
Cells in the adjacency matrix and tie-variables
In general the cell ADJ[i,j] corresponds to the tie-variable \(X_{i,j}\). Here \(x_{1,2}=1\)
ADJ[1,2]
## [1] 1
but, for example, \(x_{1,4}=0\)
ADJ[1,4]
## [1] 0
The ties of node \(i=1\) are found in the \(i\)’th row
ADJ[1,]
## [1] 0 1 1 0 0
Density
The adjacency matrix has
dim(ADJ)
## [1] 5 5
rows and columns. This means that there are \(n \times n\) cells in the adjacency matrix.
dim(ADJ)[1]*dim(ADJ)[2]
## [1] 25
length(ADJ)
## [1] 25
The \(n\) diagonal elements \(x_{1,1},x_{2,2},\ldots,x_{n,n}\) are zero by definition, which means that there are \(n \times n - n = n(n-1)\) variables that can be non-zero, here
dim(ADJ)[1]*dim(ADJ)[2] - n
## [1] 20
Density: How many variables are equal to 1 out of the total possible?
The total number of ones \[L = \sum_{i,j,i\neq j}x_{i,j}=x_{1,2}+\cdots+x_{1,n}+x_{2,1}+\cdots+x_{n,n-1}\] is simply a count of the number of non-zero entries
sum(ADJ)
## [1] 10
The density thus is
sum(ADJ)/(n*(n-1))
## [1] 0.5
and 50% of possible ties are present in the network.
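The calculation above can be wrapped in a small function so that it can be reused for other matrices. This is a sketch only: `density_of` is a hypothetical helper written for this tutorial, not a function from sna or network (sna has its own routine, see ?gden).

```r
# Hypothetical helper: density of a binary adjacency matrix,
# counting only the n * (n - 1) off-diagonal cells.
density_of <- function(X) {
  n <- nrow(X)
  diag(X) <- 0                 # self-ties are not counted
  sum(X) / (n * (n - 1))
}

# Rebuild the undirected example network from above
ADJ <- matrix(0, 5, 5)
ADJ[rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))] <- 1
ADJ <- ADJ + t(ADJ)            # add the reverse ties

density_of(ADJ)                # 10 / 20 = 0.5
```

For an undirected network each tie is counted twice in both the numerator and the denominator, so the ratio is the same as the number of undirected ties divided by \(n(n-1)/2\).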
Degree
How many ties does a node have?
The degree \(d_i\) of a node \(i\) is defined as the sum \(d_i=\sum_{j}x_{i,j}=x_{i,1}+x_{i,2}+\cdots + x_{i,n}\). The degree of node \(i=1\) is thus
sum(ADJ[1,])
## [1] 2
and the degree of node \(i=2\) is
sum(ADJ[2,])
## [1] 2
Degree distribution
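Calculate the column sums of the adjacency matrix to get the vector of degrees (note the capital S in colSums). Because the matrix is symmetric, the column sums equal the row sums used in the definition above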
colSums(ADJ)
## [1] 2 2 3 2 1
The degree distribution is the table of frequencies of degrees
table( colSums(ADJ) )
##
## 1 2 3
## 1 3 1
You can chart the degree distribution with a bar chart
plot( table( colSums(ADJ) ))
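For a directed network the row sums and column sums no longer agree: row sums count the ties a node sends (outdegree) and column sums count the ties it receives (indegree). A small sketch using the five directed ties entered at the start of this section; the object names `DIR`, `outdeg` and `indeg` are only for illustration.

```r
n <- 5
DIR <- matrix(0, n, n)
DIR[rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))] <- 1  # directed ties i -> j

outdeg <- rowSums(DIR)   # ties sent by each node
indeg  <- colSums(DIR)   # ties received by each node
outdeg
indeg
table(outdeg)            # outdegree distribution
```

Node 1 sends two ties and receives none, while node 3 receives two ties and sends one, so the two degree distributions differ.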
You can use standard R-routines to explore the adjacency matrix
For example, finding which node(s) have, say, degree 3
which(colSums(ADJ)==3)
## [1] 3
Or subsetting the adjacency matrix to look only at nodes with degree 2 or greater
use <- which(colSums(ADJ)>=2) # indices of the nodes with degree 2 or greater
ADJ[use,use]
## [,1] [,2] [,3] [,4]
## [1,] 0 1 1 0
## [2,] 1 0 1 0
## [3,] 1 1 0 1
## [4,] 0 0 1 0
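Subsetting renumbers the rows and columns, so the original node ids are lost as soon as a node in the middle is dropped. One way around this, sketched here under the assumption that nodes are simply labelled by their index, is to give the matrix row and column names before subsetting. The name `keep` and the choice of cut-off are illustrative.

```r
# Rebuild the undirected example network
ADJ <- matrix(0, 5, 5)
ADJ[rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))] <- 1
ADJ <- ADJ + t(ADJ)

# Label rows and columns by node id so the labels survive subsetting
dimnames(ADJ) <- list(as.character(1:5), as.character(1:5))

keep <- colSums(ADJ) <= 2    # logical vector: TRUE for nodes with degree 2 or lower
SUB <- ADJ[keep, keep]       # node 3 is dropped
rownames(SUB)                # the remaining node ids
```

A logical vector such as `keep` can be used for indexing directly, without wrapping it in `which()`.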
Fun Fact: Linear algebra
Most network metrics can be calculated using linear algebra. For example, if \(X_{i,j}\) in \(X\) tells you whether \(i\) and \(j\) are directly connected, element \((XX)_{i,j}\) of the matrix product \(XX\) tells you how many walks of length two, \(i \rightarrow k \rightarrow j\), there are (nodes may repeat, so \(i \rightarrow k \rightarrow i\) is counted on the diagonal)
ADJ %*% ADJ
## [,1] [,2] [,3] [,4] [,5]
## [1,] 2 1 1 1 0
## [2,] 1 2 1 1 0
## [3,] 1 1 3 0 1
## [4,] 1 1 0 2 0
## [5,] 0 0 1 0 1
Element \((XXX)_{i,j}\) of the matrix product \(XXX\) tells you how many walks of length three, \(i \rightarrow k \rightarrow h \rightarrow j\), there are
ADJ %*% ADJ %*% ADJ
## [,1] [,2] [,3] [,4] [,5]
## [1,] 2 3 4 1 1
## [2,] 3 2 4 1 1
## [3,] 4 4 2 4 0
## [4,] 1 1 4 0 2
## [5,] 1 1 0 2 0
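Two familiar quantities can be read off the diagonals of these products. The diagonal of \(XX\) counts walks \(i \rightarrow k \rightarrow i\), one for each neighbour \(k\), so it reproduces the degrees. The diagonal of \(XXX\) counts closed walks of length three, and in an undirected network each triangle contributes six of them (three starting nodes, two directions). A sketch in base R, with `A2`, `A3` and `triangles` as illustrative names:

```r
# Rebuild the undirected example network
ADJ <- matrix(0, 5, 5)
ADJ[rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))] <- 1
ADJ <- ADJ + t(ADJ)

A2 <- ADJ %*% ADJ
A3 <- A2 %*% ADJ

all(diag(A2) == rowSums(ADJ))     # diagonal of XX equals the degrees
triangles <- sum(diag(A3)) / 6    # each triangle is counted 6 times on the diagonal
triangles                         # 1: the triangle formed by nodes 1, 2 and 3
```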
Network object
Plotting the matrix object ADJ is not meaningful because R does not know that this is an adjacency matrix. To interpret ADJ as a network, translate the adjacency matrix to a network object
net <- as.network(ADJ, directed = FALSE)
NB: in the network package you use directed=FALSE in lieu of setting mode equal to graph.
The new object net is an object of class
class(net)
## [1] "network"
While printing ADJ to screen just gives you the matrix, printing net gives you a summary of the network
net
## Network attributes:
## vertices = 5
## directed = FALSE
## hyper = FALSE
## loops = FALSE
## multiple = FALSE
## bipartite = FALSE
## total edges= 5
## missing edges= 0
## non-missing edges= 5
##
## Vertex attribute names:
## vertex.names
##
## No edge attributes
Plot sociogram
When plotting a network object, R knows that you want to plot the sociogram
plot( net )
For various plotting options see ?plot.network. For example, set node size to degree, include labels, and set different colours
plot( net , # the network object
vertex.cex = degree(net) , # how should nodes (vertices) be scaled
displaylabels = TRUE, # display the labels of vertices
vertex.col = c('red','blue','grey','green','yellow'))
Note that degree(net) is a function from the sna package for calculating the degrees of the nodes (see ?degree for its arguments, including gmode). The next step will explore more of these functions.
References
Borgatti, Stephen P, Martin G Everett, and Jeffrey C Johnson. 2018. Analyzing Social Networks. Sage.
Butts, Carter T. 2015. network: Classes for Relational Data. https://CRAN.R-project.org/package=network.
Butts, Carter T. 2016. sna: Tools for Social Network Analysis. https://CRAN.R-project.org/package=sna.
Hanneman, Robert A., and Mark Riddle. 2005. Introduction to Social Network Methods. Riverside, CA: University of California, Riverside. http://faculty.ucr.edu/~hanneman/.
Robins, Garry. 2015. Doing Social Network Research: Network-Based Research Design for Social Scientists. Sage.