r, igraph, distribution, sna, power-law

Generating a connected and directed network in R


I am trying to generate a directed, connected scale-free network in R whose in- and out-degrees follow a power law. I tried the following methods:

  1. g <- sample_fitness_pl(10000, 1000000, 2.7, 2.7) generates a network whose in-/out-degree distribution does not follow a power law.

  2. The following call generates power-law distributions for the in- and out-degrees of the nodes, but the resulting network is not even weakly connected:

    N <- 10000
    g <- sample_fitness(10*N, sample((1:50)^-2, N, replace=TRUE), sample((1:50)^-2, N, replace=TRUE))

I appreciate any comments that help me learn more about it.


Solution

  • As explained in the documentation, the function sample_fitness_pl() generates a non-growing random graph whose expected degree distributions follow a power law. We can check this by tabulating the in-degree distribution and fitting a power-law function f(x) = c * x^(-α) with a nonlinear least-squares (nls) model.

    library(igraph)
    g <- sample_fitness_pl(10000, 1000000, 2.7, 2.7)
    df <- as.data.frame(table(degree(g, mode='in')))
    names(df) <- c('in.deg', 'freq')
    df$in.deg <- as.integer(as.character(df$in.deg))
    head(df)
    #   in.deg freq
    # 1     31    3
    # 2     32    3
    # 3     33    3
    # 4     34    8
    # 5     35    9
    # 6     36   14
    model <- nls(freq ~ c * in.deg^(-alpha), data = df, start = list(c = 10, alpha = 0.1))
    summary(model)
    # Formula: freq ~ c * in.deg^(-alpha)
    
    # Parameters:
    #        Estimate Std. Error t value Pr(>|t|)    
    # c     3.281e+03  8.315e+02   3.946 9.12e-05 ***
    # alpha 9.471e-01  5.926e-02  15.981  < 2e-16 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    # Residual standard error: 31.54 on 487 degrees of freedom
    
    # Number of iterations to convergence: 35 
    # Achieved convergence tolerance: 8.192e-06
    

    The fit shows that the in-degree distribution can be approximated by the power law f(x) = 3281 * x^(-0.95), as shown below (the red curve is the fitted distribution):

    deg <- seq(min(df$in.deg), max(df$in.deg), 1)
    pow.law.pred <- predict(model, newdata = data.frame(in.deg=deg))
    pred.df <- data.frame(deg=deg, pred=pow.law.pred)
    library(ggplot2)
    ggplot(df, aes(in.deg, freq)) + geom_point() + 
      geom_line(data=pred.df, aes(deg, pred), col='red')
    

    [Figure: in-degree frequencies (points) with the fitted power-law curve in red]
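The nls fit above works on tabulated frequencies. As a cross-check, here is a minimal sketch that estimates the exponent by maximum likelihood with igraph's fit_power_law(). The smaller graph size, the seed, and the names g_small, in_deg and fit are assumptions made only to keep the example quick to run; they are not part of the original code.

```r
library(igraph)

# Hypothetical smaller graph so the example runs quickly (sizes are an assumption)
set.seed(1)
g_small <- sample_fitness_pl(2000, 40000, 2.7, 2.7)

# In-degrees of all nodes; drop zeros, since a power law is not defined at 0
in_deg <- degree(g_small, mode = "in")
in_deg <- in_deg[in_deg > 0]

# Maximum-likelihood fit; xmin is estimated from the data when not supplied
fit <- fit_power_law(in_deg)

fit$alpha    # estimated exponent of the tail
fit$xmin     # smallest degree from which the power law is fitted
fit$KS.stat  # Kolmogorov-Smirnov distance between the data and the fitted tail
```

A small KS.stat suggests the tail above xmin is reasonably close to the fitted power law; the estimate only describes degrees at or above xmin.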

    The log-log plot below also resembles a power-law distribution, except possibly for a few nodes with low in-degree values, which could be excluded from the fit.

    ggplot(df, aes(in.deg, freq)) + geom_point() + coord_trans(y = 'log10', x = 'log10')
    

    [Figure: log-log plot of in-degree frequencies]
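Raw frequency counts get noisy in the tail, so a log-log plot of the complementary cumulative distribution (the share of nodes with in-degree at least k) is often easier to read. The sketch below computes it by hand and uses base R plotting; the graph size, the seed, and the names g_small, in_deg, k and ccdf are assumptions for illustration only.

```r
library(igraph)

# Hypothetical smaller graph so the example runs quickly (sizes are an assumption)
set.seed(2)
g_small <- sample_fitness_pl(2000, 40000, 2.7, 2.7)

# Keep positive in-degrees only, since log axes cannot show 0
in_deg <- degree(g_small, mode = "in")
in_deg <- in_deg[in_deg > 0]

# Complementary cumulative distribution: P(in-degree >= k) for each observed k
k    <- sort(unique(in_deg))
ccdf <- sapply(k, function(x) mean(in_deg >= x))

# On log-log axes, an approximately straight tail is consistent with a power law
plot(k, ccdf, log = "xy", xlab = "in-degree k", ylab = "P(in-degree >= k)")
```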

    The graph is also connected: count_components() counts weakly connected components by default, and it returns 1.

    count_components(g)
    # [1] 1
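Since the graph is directed, it can help to check weak and strong connectivity separately. The following is a minimal sketch, assuming a smaller graph for speed; the names g_small, weak_ok, strong_ok, comp, keep and g_scc are hypothetical. If the graph turns out not to be strongly connected, one option is to keep only its largest strongly connected component.

```r
library(igraph)

# Hypothetical smaller graph so the example runs quickly (sizes are an assumption)
set.seed(42)
g_small <- sample_fitness_pl(1000, 20000, 2.7, 2.7)

# Weak connectivity ignores edge direction; strong connectivity respects it
weak_ok   <- is_connected(g_small, mode = "weak")
strong_ok <- is_connected(g_small, mode = "strong")

# Find the strongly connected components and keep the largest one
comp  <- components(g_small, mode = "strong")
keep  <- which(comp$membership == which.max(comp$csize))
g_scc <- induced_subgraph(g_small, keep)

# Number of nodes before and after restricting to the largest component
c(all_nodes = vcount(g_small), scc_nodes = vcount(g_scc))
```

Restricting to a component removes nodes and edges, so the degree distribution of g_scc should be checked again rather than assumed to match the original graph.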