r matrix ggplot2 heatmap asymmetric

ggplot2 expecting square matrix even though matrix is not symmetric


Hi, I am trying to plot a heat map in ggplot2 using a matrix with 9 rows and 10 columns.

I melt the matrix with reshape2's melt, wrapping it in as.matrix, and get the following output:

A1 = melt(as.matrix(A))

  Var1  Var2        value
1     1 X0.05 8.690705e-01
2     2 X0.05 1.930320e-01
3     3 X0.05 1.474900e-02
4     4 X0.05 3.498176e-04
5     5 X0.05 2.451419e-06
6     6 X0.05 4.946808e-09
7     7 X0.05 2.832895e-12
8     8 X0.05 4.563140e-16
9     9 X0.05 2.055474e-20
10    1  X0.1 5.906241e-01
11    2  X0.1 7.416265e-01
12    3  X0.1 2.311771e-01
13    4  X0.1 3.892639e-02
14    5  X0.1 3.361408e-03
15    6  X0.1 1.445629e-04
16    7  X0.1 3.043528e-06
17    8  X0.1 3.103555e-08
18    9  X0.1 1.522292e-10

The output is correct, with each column represented by 9 values.

I then rescale the values within each column and get the following output:

A2 = ddply(A1, .(Var2), transform, rescale = rescale(value))

  Var1  Var2        value      rescale
1     1 X0.05 8.690705e-01 1.000000e+00
2     2 X0.05 1.930320e-01 2.221132e-01
3     3 X0.05 1.474900e-02 1.697101e-02
4     4 X0.05 3.498176e-04 4.025192e-04
5     5 X0.05 2.451419e-06 2.820737e-06
6     6 X0.05 4.946808e-09 5.692068e-09
7     7 X0.05 2.832895e-12 3.259684e-12
8     8 X0.05 4.563140e-16 5.250361e-16
9     9 X0.05 2.055474e-20 0.000000e+00
10    1  X0.1 5.906241e-01 7.963902e-01
11    2  X0.1 7.416265e-01 1.000000e+00
12    3  X0.1 2.311771e-01 3.117163e-01
13    4  X0.1 3.892639e-02 5.248786e-02
14    5  X0.1 3.361408e-03 4.532480e-03
15    6  X0.1 1.445629e-04 1.949266e-04
16    7  X0.1 3.043528e-06 4.103651e-06
17    8  X0.1 3.103555e-08 4.164269e-08
18    9  X0.1 1.522292e-10 0.000000e+00
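
For reference, the per-column rescaling can be sketched in base R. This assumes that rescale() above is scales::rescale() with its default target range of 0 to 1, which matches the output shown (each column's maximum maps to 1 and its minimum to 0). The data frame below simply copies the 18 rows printed above; the helper name rescale01 is made up for this sketch.

```r
# The 18 melted rows shown above, typed in by hand.
A1 <- data.frame(
  Var1  = rep(1:9, times = 2),
  Var2  = rep(c("X0.05", "X0.1"), each = 9),
  value = c(8.690705e-01, 1.930320e-01, 1.474900e-02, 3.498176e-04,
            2.451419e-06, 4.946808e-09, 2.832895e-12, 4.563140e-16,
            2.055474e-20,
            5.906241e-01, 7.416265e-01, 2.311771e-01, 3.892639e-02,
            3.361408e-03, 1.445629e-04, 3.043528e-06, 3.103555e-08,
            1.522292e-10)
)

# Hypothetical helper: min-max scaling to the range [0, 1].
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

# ave() applies the helper separately within each Var2 group,
# which is the role ddply(..., .(Var2), transform, ...) plays above.
A1$rescale <- ave(A1$value, A1$Var2, FUN = rescale01)

print(A1)
```

Because the scaling is done per column, every column uses the full colour range in the heat map, whatever its absolute values are.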

Everything still looks fine, and when I plot the heat map the output is correct. So far so good.

ggplot(A2, aes(Var2, Var1)) + geom_tile(aes(fill = rescale), colour = "white") + scale_fill_gradient(low = "light blue", high = "dark blue")

Heat_map1

The problem comes up when I add custom labels, where the y axis goes from 1 to 9 (the number of heterozygote individuals) and the x axis goes from 0.05 to 0.5 (the minor allele frequency).

x = c(0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50)
y = c(1, 2, 3, 4, 5, 6, 7, 8, 9)

ggplot(A4, aes(Var2, Var1)) + geom_tile(aes(fill = rescale), colour = "white") + scale_fill_gradient(low = "light blue", high = "dark blue") + scale_x_discrete(labels= x, expression("Minor allele frequency")) + scale_y_discrete(labels= y, expression("Number of Heterozygotes"))

However, this time the y axis is all messed up:

Heat_map2

It seems to me that ggplot automatically assumes a 10x10 matrix and tries to add the missing labels. I tried to find an option in reshape2 where I could declare the shape of the matrix, but I was unable to find one. Has anyone come across this problem? Any help would be much appreciated, thanks in advance.


Solution

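  • Here is one approach. You can change the tick mark labels with scale_x_discrete. As for y, I converted Var1 to a factor: melt leaves Var1 numeric, so ggplot2 treats it as continuous, which is likely why scale_y_discrete misbehaved.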

    ggplot(mydf, aes(x = Var2, y = as.factor(Var1), fill = rescale)) +
    geom_tile(color = "white") +
    scale_fill_gradient(low = "light blue", high = "dark blue") +
    scale_x_discrete(breaks=c("X0.05", "X0.1"), labels=c("0.05", "0.1")) +
    xlab("Minor allele frequency") +
    ylab("Number of Heterozygotes")
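
A fuller sketch of the same idea, covering all ten columns. Several things here are assumptions rather than the asker's actual setup: the matrix A is made-up random data standing in for the real 9 x 10 matrix, its column names are assumed to follow the X0.05 ... X0.5 pattern seen in the melted output, and the rescaling is written with base R's ave() instead of ddply().

```r
library(ggplot2)
library(reshape2)

# Made-up 9 x 10 matrix; column names mimic the "X0.05" style seen above.
set.seed(1)
maf <- seq(5, 50, by = 5) / 100
A <- matrix(runif(9 * 10), nrow = 9, ncol = 10,
            dimnames = list(NULL, paste0("X", maf)))

# Long format: Var1 is the (numeric) row index, Var2 the column name.
A1 <- melt(A)

# Min-max rescale within each column (stand-in for the ddply step).
A1$rescale <- ave(A1$value, A1$Var2,
                  FUN = function(v) (v - min(v)) / (max(v) - min(v)))

# factor(Var1) makes the y axis discrete, so it gets exactly 9 rows.
# The labels function strips the leading "X" from every column name,
# so there is no need to list the ten breaks by hand.
p <- ggplot(A1, aes(x = Var2, y = factor(Var1), fill = rescale)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "light blue", high = "dark blue") +
  scale_x_discrete(labels = function(l) sub("^X", "", l)) +
  labs(x = "Minor allele frequency", y = "Number of Heterozygotes")

print(p)
```

Passing a function to labels keeps the labels tied to the data, so the plot stays correct if the number of columns changes.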
    
