Consider several points:
A = (1, 2.5), B = (5, 10), C = (23, 34), D = (45, 47), E = (4, 17), F = (18, 4)
How can I perform hierarchical clustering on them with R?
I've read this example, Cluster Analysis, but I'm not sure how to enter these values as points rather than just regular numbers.
When I do
x <- c(...) #x values
y <- c(...) #y values
I can plot them using
plot(x,y)
But how can I specify those values like in the example:
mydata <- scale(mydata)
Doing
mydata <- scale(x,y)
I get the following error
Error in scale.default(x, y) :
length of 'center' must equal the number of columns of 'x'
Something like this?
A = c(1, 2.5); B = c(5, 10); C = c(23, 34)
D = c(45, 47); E = c(4, 17); F = c(18, 4)
df <- data.frame(rbind(A,B,C,D,E,F))
colnames(df) <- c("x","y")
hc <- hclust(dist(df))
plot(hc)
This puts the points into a data frame with two columns, x and y, then calculates the distance matrix (the pairwise distance between every point and every other point), and runs the hierarchical cluster analysis on that.
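If you would rather start from the separate x and y vectors in the question, here is a minimal sketch of the same idea. The error from scale(x, y) arises because the second argument of scale() is center, so y is taken as the centering values rather than as a second column. The names pts, pts_scaled, d and hc_scaled are just illustrative, and whether to scale at all is a judgment call for your data.

```r
# Coordinates of the six points from the question
x <- c(1, 5, 23, 45, 4, 18)
y <- c(2.5, 10, 34, 47, 17, 4)

# Bind the two vectors into a 6 x 2 matrix: one row per point
pts <- cbind(x, y)
rownames(pts) <- c("A", "B", "C", "D", "E", "F")

# scale() expects the whole matrix as its first argument;
# in scale(x, y) the y vector would be interpreted as 'center'
pts_scaled <- scale(pts)

d <- dist(pts_scaled)    # Euclidean distance by default
hc_scaled <- hclust(d)   # complete linkage by default
plot(hc_scaled)          # dendrogram of the scaled points
```

Scaling puts both coordinates on a comparable footing before distances are computed, so the tree may differ from the one built on the raw values.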
We can then plot the data with coloring by cluster.
df$cluster <- cutree(hc,k=2) # identify 2 clusters
plot(y~x,df,col=cluster)
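To check which point ended up in which cluster, you can also look at the result of cutree directly and mark the groups on the dendrogram. This is only a sketch that rebuilds the same data frame so it runs on its own; groups is an illustrative name, and k = 2 is the same arbitrary choice as above.

```r
# Rebuild the data frame of points (note: F masks the FALSE shorthand here)
A <- c(1, 2.5); B <- c(5, 10); C <- c(23, 34)
D <- c(45, 47); E <- c(4, 17); F <- c(18, 4)
df <- data.frame(rbind(A, B, C, D, E, F))
colnames(df) <- c("x", "y")
hc <- hclust(dist(df))

# cutree() returns a named integer vector: one cluster id per point
groups <- cutree(hc, k = 2)
print(groups)
print(table(groups))   # number of points in each cluster

# Draw the dendrogram, then outline the k = 2 clusters on it
plot(hc)
rect.hclust(hc, k = 2, border = "red")
```

Changing k, or passing h (a cut height) to cutree instead, gives a different partition of the same tree without rerunning hclust.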