I would like to implement a custom survival function in the flexsurv
package, similar to Section 4 (pg. 12) here:
https://cran.r-project.org/web/packages/flexsurv/vignettes/flexsurv.pdf
Say I have a dataframe, df
, where the status
determines if an item is failed. 1=failed
and 0=non-failed
:
> library(flexsurv)
> library(survival)
> df
equip_id age status
1 20.33812 0
2 28.44830 1
3 22.47019 0
4 47.21489 1
5 30.42093 1
Now say I have a density function:
# must be named 'dcustom.weibull' to denote density function
dcustom.weibull <- function(shape, scale, t)
{
f_t <- (shape/scale) * ((t/scale)^(shape - 1)) * exp(-(t/scale)^shape)
return(f_t)
}
I try creating a custom function for use in the flexsurv
package:
custom.weibull <- list(name = "custom.weibull", pars = c("shape", "scale"),
location = "scale",
transforms = c(log, log),
inv.transforms = c(exp, exp),
inits = function(t){ c(1, median(t)) })
I try calling this with the flexsurv
package:
flexsurvreg(Surv(age, status)~1, data = df, dist = custom.weibull)
I get the following error:
Forming integrated rmst function...
Forming integrated mean function...
Error in (function (shape, scale, t) :
unused arguments (x = c(28.4482952329068, 47.2148908312751, 30.4209324295688), log = TRUE)
Now I realize there are plenty of packages that can do this for a 2-parameter Weibull function (e.g.,fitdistplus
). I'm trying to understand how to set this up with a known density function so that I can eventually implement a different, less traditional density function with different parameters.
Generally one must build a proper family of functions for a new distribution. I only see a d-dist function. Furthermore, the "x" parameter of a density function is usually placed first, so to my (R-experienced) eyes, seeing it as "t" and in the third position is rather unexpected. So I built a custom dweibull and added a log parameter because my first version threw an error complaining about an unused log argument.
library(flexsurv)
# it's poor practice to name a data object with the same token as an existing density fun
df1 <- read.table(text=" equip_id age status
1 20.33812 0
2 28.44830 1
3 22.47019 0
4 47.21489 1
5 30.42093 1", header=TRUE)
dcustom.weibull <- function(x, shape, scale, log=FALSE)
{
f_x <- (shape/scale) * ((x/scale)^(shape - 1)) * exp(-(x/scale)^shape)
return(f_x)
}
flexsurvreg(Surv(age, status)~1, data = df1, dist = custom.weibull)
#-------------------------
Forming cumulative distribution function...
Forming integrated rmst function...
Forming integrated mean function...
Call:
flexsurvreg(formula = Surv(age, status) ~ 1, data = df1, dist = custom.weibull)
Estimates:
est L95% U95% se
shape 4.07 NA NA NA
scale 196.43 NA NA NA
N = 5, Events: 3, Censored: 2
Total time at risk: 148.8924
Log-likelihood = 0.0001369537, df = 2
AIC = 3.999726
Warning message:
In flexsurvreg(Surv(age, status) ~ 1, data = df1, dist = custom.weibull) :
Optimisation has probably not converged to the maximum likelihood - Hessian is not positive definite.
So the density function was apparently sufficient without either a p- or q- function to complete the custom family, so the rewrite and addition of a log parameter was sufficient to cure the issue.