I'm trying to follow this tutorial:
https://uc-r.github.io/kmeans_clustering
library(tidyverse) # data manipulation
library(cluster) # clustering algorithms
library(factoextra) # clustering algorithms & visualization
distance <- get_dist(df)
fviz_dist(distance, gradient = list(low = "#00AFBB", mid = "white", high = "#FC4E07"))
Which, as expected, works great.
It may be something really simple: why is there no column name for what is obviously the state field?
If I try to use the same approach with a dataset like this:
ipl <- read.csv("https://query.data.world/s/3kadbuzyj25jwe42k6tgij56gscept?dws=00000", header=TRUE, stringsAsFactors=FALSE)
ipl <- na.omit(ipl)
distanceipl <- get_dist(ipl)
fviz_dist(distanceipl, gradient = list(low = "#00AFBB", mid = "white", high = "#FC4E07"))
Instead of the player names on each axis, I get what I think are the row numbers. How do I get the player names in PLAYER on the axes?
There are two solutions here: either label the visualisation using ggplot2
+ scale_y_discrete(labels = FIELDFORLABELLING)
or set the player names as the row names:
rownames(dataframe) <- dataframe$FIELDFORLABELLING
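For anyone wanting to see the row-name approach end to end, here is a minimal sketch. The `players` data frame and its `RUNS` and `AVG` columns are made up for illustration (only `PLAYER` mirrors my data), and it uses base R's `dist()` so it runs without extra packages. I'd expect `get_dist()` and `fviz_dist()` to pick up the same labels, but treat that as an assumption.

```r
# Hypothetical toy data standing in for the IPL table; only PLAYER mirrors the question.
players <- data.frame(
  PLAYER = c("Player A", "Player B", "Player C"),
  RUNS   = c(450, 320, 510),
  AVG    = c(35.2, 28.1, 41.0),
  stringsAsFactors = FALSE
)

# Move the label column into the row names, then drop it so that only
# numeric columns go into the distance computation.
rownames(players) <- players$PLAYER
players_num <- players[, setdiff(names(players), "PLAYER"), drop = FALSE]

# dist() carries the row names through as the labels of the result.
d <- dist(scale(players_num))
labels(d)
```

Row names have to be unique, so this will raise an error if the same player appears in more than one row.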
Thanks for the answers!
From the docs:
fviz_dist(): returns a ggplot2
So you can just add labels the way you would with a normal ggplot2 object, i.e.:
fviz_dist(distanceipl, gradient = list(low = "#00AFBB", mid = "white", high = "#FC4E07")) + scale_y_discrete(labels = ipl$PLAYER)
One caveat: fviz_dist() reorders the observations by default (order = TRUE), so a plain vector of labels may not line up with the tiles unless you pass order = FALSE. Setting the row names before calling get_dist() avoids that.