I am fitting a Pareto distribution to some data and have already estimated the maximum likelihood estimates for the data. Now I need to create a fitdist (fitdistrplus library) object from it, but I am not sure how to do this. I need a fitdist object because I would like to create qq, density etc. plots with the function such as denscomp. Could someone help?
The reason I calculated the MLEs first is because fitdist does not do this properly - the estimates always blow up to infinity, even if I give the correct MLEs as the starting values (see below). If the earlier option of manually giving fitdist my parameters is not possible, is there an optimization method in fitdist that would allow the pareto parameters to be properly estimated?
I don't have permission to post the original data, but here's a simulation using MLE estimates of a gamma distribution/pareto distribution on the original.
library(fitdistrplus)
library(actuar)
sim <- rgamma(1000, shape = 4.69, rate = 0.482)
fit.pareto <- fit.dist(sim, distr = "pareto", method = "mle",
start = list(scale = 0.862, shape = 0.00665))
#Estimates blow up to infinity
fit.pareto$estimate
If you look at the ?fitdist
help topic, it describes what fitdist
objects look like: they are lists with lots of components. If you can compute substitutes for all of those components, you should be able to create a fake fitdist
object using code like
fake <- structure(list(estimate = ..., method = ..., ...),
class = "fitdist")
For the second part of your question, you'll need to post some code and data for people to improve.
Edited to add:
I added set.seed(123)
before your simulation of random data. Then I get the MLE from fitdist
to be
scale shape
87220272 9244012
If I plot the log likelihood function near there, I get this:
loglik <- Vectorize(function(shape, scale) sum(dpareto(sim, shape, scale, log = TRUE)))
shape <- seq(1000000, 10000000, len=30)
scale <- seq(10000000, 100000000, len=30)
surface <- outer(shape, scale, loglik)
contour(shape, scale, surface)
points(9244012, 87220272, pch=16)
That looks as though fitdist
has made a somewhat reasonable choice, though there may not actually be a finite MLE. How did you find the MLE to be such small values? Are you sure you're using the same parameters as dpareto
uses?