I am currently self-studying multidimensional scaling. Among other sources, I am working through Borg & Groenen (2005), Modern Multidimensional Scaling: Theory and Applications.
On page 10, they present a real-life data set reported by Wish (1971). Wish (1971) asked 18 students to rate the global similarity of different pairs of nations such as France and China on a 9-point rating scale ranging from 1 = very different to 9 = very similar. Since the data set is publicly available, I wanted to replicate the result in R for practice purposes. As a first step, I wanted to replicate the following configuration also presented in Borg & Groenen (2005, p. 10).
I proceeded as follows:
library(smacof) ### this package contains the data set
library(MASS) ### this package provides isoMDS, used below
data(wish) ### that is the data set
Since the data set contains similarity ratings, I applied nonmetric multidimensional scaling using the isoMDS command of the MASS package. Even though the textbook authors speak of a "two-dimensional MDS configuration", I also tried higher-dimensional solutions. I therefore coded a loop that carries out multidimensional scaling for configurations with 2 to 9 dimensions.
X <- c()
for (i in 2:9) {
  MDS <- isoMDS(wish, k = i)
  X <- c(X, MDS$stress)
  plot(MDS$points[, c(1, 2)])
  text(MDS$points[, 1], MDS$points[, 2], colnames(as.matrix(wish)), cex = .6,
       pos = 1)
}
plot(X, type = "b") ### this allowed me to plot the stress levels associated with each configuration
None of the resulting plots resembled the one presented in Borg & Groenen (2005, p. 10). For example, the map for 2 dimensions is as follows:
I checked that the data set is identical to the one reported by Borg & Groenen (2005, p. 10). I also tried out metric scaling as follows:
for (i in 2:9) {
  plot(smacofSym(wish, ndim = i))
}
Again, I was not able to replicate the results reported by Borg & Groenen (2005, p. 10). However, I am not sure whether I made a mistake while trying to replicate the results.
Using the base R cmdscale, I get a similar result to Borg & Groenen. If you look at the structure of wish, you will see that it is simply a vector of 66 numbers. I interpret this as the lower triangular part of the similarity matrix. I convert this into a full dissimilarity matrix so that I can use cmdscale and plot. The positions roughly align with the positions from Borg & Groenen.
library(smacof)
data(wish)
## Construct distance matrix
SM = matrix(0, nrow=12, ncol=12)
SM[lower.tri(SM)] = wish
SM = SM + t(SM)
diag(SM) = 9
DM = 9-SM
## MDS & plotting
MDS = cmdscale(DM)
plot(MDS, pch=20, xlim=c(-4,4), ylim=c(-4,4))
text(MDS, labels = attr(wish, "Labels"), adj=c(0.5,-0.6), cex=0.8)
abline(0.5,0.3, lty=2)
abline(-1,-3.8, lty=2)
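The key step above is turning similarities into dissimilarities before scaling, since both cmdscale and isoMDS treat their input as distances. Here is a minimal sketch of that step as a reusable function, run on made-up ratings so that it works without smacof. Note that sim_to_diss and toy_sim are hypothetical names I made up for illustration; they are not part of smacof or MASS.

```r
library(MASS)  # provides isoMDS

# Hypothetical helper (not from smacof or MASS): turn a lower-triangle vector
# of similarity ratings into a full, symmetric dissimilarity matrix.
sim_to_diss <- function(sim, n, max_rating = 9) {
  stopifnot(length(sim) == n * (n - 1) / 2)
  S <- matrix(0, nrow = n, ncol = n)
  S[lower.tri(S)] <- sim   # filled column by column, like a dist object
  S <- S + t(S)            # mirror into the upper triangle
  D <- max_rating - S      # high similarity becomes small dissimilarity
  diag(D) <- 0             # each object has zero distance to itself
  D
}

# Made-up ratings for 5 objects (10 pairs), for illustration only
toy_sim <- c(7.1, 2.3, 3.0, 4.4, 2.8, 3.5, 5.0, 6.2, 3.9, 4.7)
D <- sim_to_diss(toy_sim, n = 5)

# Classical (metric) MDS and Kruskal's nonmetric MDS on the same dissimilarities
metric_fit <- cmdscale(D, k = 2)
nonmetric_fit <- isoMDS(as.dist(D), k = 2, trace = FALSE)

dim(metric_fit)        # coordinate matrix, one row per object
nonmetric_fit$stress   # stress of the nonmetric solution
```

By the same logic, as.dist(DM) from the code above could presumably be passed to isoMDS instead of the raw wish object, though I have not checked how closely that matches the textbook figure. I believe smacof also has a sim2diss function for this kind of conversion; see its help page for the available methods.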