I am trying to create a chord diagram graph such as the one below.
Here, you can see which values of TRAV (such as TRAV21, TRAV1-2, etc.) are matched with which values of TRBV (such as TRBV6-4, TRBV6-1, etc.), along with how often each pairing occurs.
Using the documentation provided (https://jokergoo.github.io/circlize_book/book/the-chorddiagram-function.html), I have to first create a matrix of my data in this format:
   E1 E2 E3 E4 E5 E6
S1  4 14 13 17  5  2
S2  7  1  6  8 12 15
S3  9 10  3 16 11 18
Then, convert the matrix to a dataframe in this format:
   from to value
1    S1 E1     4
2    S2 E1     7
3    S3 E1     9
4    S1 E2    14
5    S2 E2     1
6    S3 E2    10
7    S1 E3    13
8    S2 E3     6
9    S3 E3     3
10   S1 E4    17
So far, I have the code below:
df<- structure(list(TRAV = c("TRAV1-2", "TRAV1-2", "TRBV20-1",
"TRAV12-1", "TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV8-2", "TRAV1-2",
"TRBV20-1", "TRAV1-2", "TRAV1-2", "TRAV12-2", "TRAV12-3", "TRAV12-2",
"TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV19", "TRAV1-2", "TRAV1-2",
"TRAV19", "TRAV16", "TRAV27", "TRAV5", "TRAV1-2", "TRAV22", "TRAV1-2",
"TRAV27", "TRAV26-2", "TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV1-2",
"TRAV1-2", "TRAV41", "TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV1-2",
"TRAV1-2", "TRAV35", "TRAV1-2", "TRAV1-2", "TRBV19", "TRAV1-2",
"TRAV12-2", "TRAV1-2", "TRAV1-2", "TRAV16", "TRAV17", "TRAV35",
"TRAV1-2", "TRBV4-1", "TRAV1-2", "TRBV5-5", "TRAV1-2", "TRBV6-4",
"TRAV12-2", "TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV22",
"TRAV1-2", "TRAV1-2", "TRAV8-4", "TRAV1-2", "TRAV1-2", "TRAV8-3",
"TRBV5-1", "TRAV12-2", "TRAV1-2", "TRBV6-2", "TRAV19", "TRAV1-2",
"TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV1-2", "TRAV21",
"TRBV15", "TRAV24", "TRBV6-1", "TRAV1-2", "TRAV1-2", "TRAV1-2",
"TRAV1-2", "TRAV12-2", "TRAV12-2", "TRAV29/DV5", "TRAV8-2", "TRAV12-2",
"TRAV1-2", "TRAV1-2", "TRAV12-2", "TRAV1-2", "TRBV6-1", "TRAV1-2"
), TRBV = c("TRBV6-4", "TRBV6-2", NA, "TRBV4-3", "TRBV6-4",
"TRBV6-2", "TRAV5", "TRBV6-2", "TRBV6-4", NA, "TRBV3-1", "TRBV6-2",
"TRBV6-6", "TRBV10-2", "TRBV6-2", "TRBV6-4", "TRBV6-4", "TRBV6-4",
"TRBV6-6", "TRBV20-1", "TRBV4-2", "TRBV9", "TRBV6-2", "TRBV5-5",
"TRBV4-3", NA, "TRBV11-3", "TRBV4-2", "TRBV5-5", "TRBV9", "TRBV6-4",
"TRBV6-4", "TRBV4-2", "TRBV4-3", "TRBV6-1", "TRBV12-4", "TRBV6-4",
"TRBV6-4", "TRBV6-4", "TRBV6-4", "TRBV6-4", "TRBV12-4", "TRBV6-4",
"TRBV6-4", NA, "TRBV6-4", "TRBV6-2", "TRBV19", "TRBV28", "TRBV20-1",
"TRBV5-5", "TRAV41", "TRBV6-4", NA, "TRBV6-4", NA, "TRBV6-4",
NA, "TRAV5", "TRAV23/DV6", "TRBV28", "TRBV6-4", "TRBV4-2", "TRAV35",
"TRBV6-4", "TRBV6-4", "TRBV6-2", NA, NA, "TRBV3-1", NA, "TRBV3-1",
"TRBV6-4", NA, "TRBV9", "TRBV4-3", "TRBV20-1", "TRAV6", "TRBV6-4",
"TRBV15", "TRBV20-1", "TRAV30", NA, "TRBV19", NA, "TRBV6-4",
"TRBV25-1", "TRBV6-2", "TRBV6-1", "TRBV3-1", "TRBV6-6", "TRBV11-2",
"TRBV27", "TRBV6-6", "TRBV6-4", "TRBV4-2", "TRBV6-6", "TRBV6-1",
NA, "TRBV6-4")), row.names = c(NA, 100L), class = "data.frame")
df<- data.matrix(df)
xtabs( ~ TRAV+TRBV, data=df)
However, the output of xtabs removes the different levels of TRAV and TRBV columns and instead just provides the numbers. How can I create a matrix and then dataframe as described in the documentation so that I am able to create a chord diagram graph?
Many thanks in advance!
The issue is not xtabs; the issue is that you convert your data frame with data.matrix, which replaces the character values with integer codes. Instead, you can apply xtabs to your data frame directly and pass the output to chordDiagram:
library(circlize)
chordDiagram(xtabs(~ TRAV + TRBV, data = df))
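To see why the labels disappear, here is a minimal sketch on a small made-up data frame (the name toy and its values are illustrative only, not taken from the question's data). It compares what data.matrix returns with what xtabs returns when it is given the data frame directly:

```r
# Toy data standing in for the question's df (illustrative values only)
toy <- data.frame(
  TRAV = c("TRAV1-2", "TRAV1-2", "TRAV19", "TRAV1-2"),
  TRBV = c("TRBV6-4", "TRBV6-2", "TRBV9", "TRBV6-4"),
  stringsAsFactors = FALSE
)

# data.matrix() converts character columns to factors and then to their
# integer codes, so the gene names are replaced by plain numbers
m <- data.matrix(toy)
print(m)

# xtabs() on the untouched data frame keeps the names as row/column labels
tab <- xtabs(~ TRAV + TRBV, data = toy)
print(tab)
```

The contingency table keeps TRAV as row names and TRBV as column names, which is the labelled matrix layout that the circlize documentation describes.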
Or use e.g. aggregate to get an adjacency list as a data frame and pass that to chordDiagram:
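# Count how often each TRBV/TRAV pair occurs
# (aggregate omits rows with NA in a grouping variable)
result <- aggregate(
  rep(1, nrow(df)),
  by = list(TRBV = df$TRBV, TRAV = df$TRAV), FUN = length
)
# chordDiagram reads the first two columns as the link ends and the third as the value
names(result) <- c("from", "to", "value")
chordDiagram(result)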
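If you want to follow the documentation's two steps literally (matrix first, then a from/to/value data frame), one possible route is to convert the xtabs result with as.data.frame. This is only a sketch on made-up data; toy and long are hypothetical names, and it assumes the default NA handling of xtabs, which drops rows containing NA:

```r
# Toy data with one missing TRBV value (illustrative values only)
toy <- data.frame(
  TRAV = c("TRAV1-2", "TRAV1-2", "TRAV19", "TRAV1-2", "TRAV21"),
  TRBV = c("TRBV6-4", "TRBV6-2", "TRBV9", "TRBV6-4", NA),
  stringsAsFactors = FALSE
)

# Step 1: labelled contingency table (the "matrix" form)
tab <- xtabs(~ TRAV + TRBV, data = toy)

# Step 2: long format with one row per TRAV/TRBV combination;
# as.data.frame() on a table yields the columns TRAV, TRBV and Freq
long <- as.data.frame(tab, stringsAsFactors = FALSE)
names(long) <- c("from", "to", "value")

# Keep only the pairings that actually occur
long <- long[long$value > 0, ]
print(long)

# With circlize installed, the result could then be plotted with:
# circlize::chordDiagram(long)
```

Here from holds the TRAV values and to holds the TRBV values, matching the row-to-column direction of the matrix in the documentation.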