r cluster-analysis function-calls hclust

How to use 'hclust' as a function call in R


I tried to wrap the clustering method in a function as follows:

mydata <- mtcars

# Here I construct hclust as a function
hclustfunc <- function(x) hclust(as.matrix(x),method="complete")

# Define distance metric
distfunc <- function(x) as.dist((1-cor(t(x)))/2)

# Obtain distance
d <- distfunc(mydata)

# Call that hclust function
fit<-hclustfunc(d)

# Later I'd do
# plot(fit)

But why does it give the following error?

Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") : 
  missing value where TRUE/FALSE needed

What's the right way to do it?


Solution

  • Do read the help for functions you use. ?hclust is pretty clear that the first argument d is a dissimilarity object, not a matrix:

    Arguments:
    
           d: a dissimilarity structure as produced by ‘dist’.
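
To see why the call in the question fails, it can help to compare the dist object with its as.matrix() conversion. The following is a minimal sketch that reuses distfunc from the question; the exact error text may vary between R versions.

```r
# Correlation-based distance from the question
distfunc <- function(x) as.dist((1 - cor(t(x))) / 2)
d <- distfunc(mtcars)

# A "dist" object carries a "Size" attribute that hclust() reads
is_dist <- inherits(d, "dist")
size_d <- attr(d, "Size")

# as.matrix() returns a plain square matrix without that attribute
m <- as.matrix(d)
size_m <- attr(m, "Size")

# Passing the matrix to hclust() therefore fails; capture the condition
res <- tryCatch(hclust(m, method = "complete"), error = function(e) e)
```

Here size_m is NULL, so hclust() cannot work out how many observations it was given.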
    

    Update

    As the OP has now updated their question, what is needed is

    hclustfunc <- function(x) hclust(x, method="complete")
    distfunc <- function(x) as.dist((1-cor(t(x)))/2)
    d <- distfunc(mydata)
    fit <- hclustfunc(d)
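
As a quick sanity check, the returned object can be inspected before plotting. This is only a sketch, using mtcars directly in place of mydata; the variable names cls, n_labels and rng are made up for illustration.

```r
hclustfunc <- function(x) hclust(x, method = "complete")
distfunc <- function(x) as.dist((1 - cor(t(x))) / 2)

d <- distfunc(mtcars)
fit <- hclustfunc(d)

# The result is an "hclust" object that keeps the row names as labels
cls <- class(fit)
n_labels <- length(fit$labels)

# The transformed correlations lie between 0 (r = 1) and 1 (r = -1)
rng <- range(d)

# plot(fit)  # draws the dendrogram
```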
    

    Original

    What you want is

    hclustfunc <- function(x, method = "complete", dmeth = "euclidean") {    
        hclust(dist(x, method = dmeth), method = method)
    }
    

    and then

    fit <- hclustfunc(mydata)
    

    works as expected. Note that you can now pass in the dissimilarity method as dmeth and the clustering method as method.
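
For example, the two arguments could be used like this. This is a sketch only: the choices of "manhattan", "average" and k = 3 are arbitrary, and fit_default, fit_avg and groups are illustrative names.

```r
hclustfunc <- function(x, method = "complete", dmeth = "euclidean") {
    hclust(dist(x, method = dmeth), method = method)
}

# Defaults: Euclidean distance, complete linkage
fit_default <- hclustfunc(mtcars)

# Manhattan distance with average linkage
fit_avg <- hclustfunc(mtcars, method = "average", dmeth = "manhattan")

# Cut the tree into three groups (k = 3 is an arbitrary choice)
groups <- cutree(fit_avg, k = 3)
```

Any distance measure accepted by dist() and any agglomeration method accepted by hclust() can be passed the same way.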