I am fitting a power law to a vector with 45 million elements, using the poweRlaw package in R: https://arxiv.org/pdf/1407.3492.pdf
The most computationally intensive part of the process is estimating the lower bound, which is done with the estimate_xmin() function, and it is taking a very long time.
The code goes like this (w is the vector and c_pl comes from "continuous power-law"):
c_pl <- conpl$new(w)        # create a continuous power-law distribution object from the data
est <- estimate_xmin(c_pl)  # estimate the lower bound (and scaling exponent) -- the slow step
c_pl$setXmin(est)           # update the object with the estimated xmin
I am wondering how to use the estimate_xmin() function in a way that minimises processing time (maybe with parallel computation?). I am working on an AWS instance with 16 cores and 64 GB of RAM. Thanks.
The reason that estimate_xmin takes so long is that it tries every possible value of xmin. The function has an xmins argument that you can use to truncate this search; for example,
estimate_xmin(m, xmins = c(10, 100, 1000, 10000))
will find the optimal xmin out of 10, 100, 1000 and 10000.
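As a rough sketch of how you might apply this to your data, one option is to build the candidate xmins from a coarse grid of sample quantiles of w instead of hand-picking values (the quantile probabilities below are an arbitrary illustration, not part of the poweRlaw documentation):

library(poweRlaw)

# w is the 45-million-element vector from the question
c_pl <- conpl$new(w)

# Candidate lower bounds: a coarse grid of sample quantiles of w
# (the choice of probabilities here is only an example)
candidate_xmins <- unique(quantile(w, probs = seq(0.5, 0.99, by = 0.01)))

# Restrict the search over xmin to those candidates only
est <- estimate_xmin(c_pl, xmins = candidate_xmins)
c_pl$setXmin(est)

If the coarse search localises the optimum, you could repeat it on a finer grid of candidates around that value to refine the estimate.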