I would like to generate a plot depicting 14 linear chromosomes for the organism I work on, to scale, with coloured bars at specified locations along each chromosome. Ideally I'd like to use R as this is the only programming language I have experience with.
I have explored various ways of doing this, e.g. with GenomeGraphs, but they are all more complicated than what I want, display a lot more data than I have (e.g. cytogenetic bands), and are often specific to human chromosomes.
All I essentially want is 14 grey bars of the following sizes:
chromosome size
1 640851
2 947102
3 1067971
4 1200490
5 1343557
6 1418242
7 1445207
8 1472805
9 1541735
10 1687656
11 2038340
12 2271494
13 2925236
14 3291936
I would then like coloured marks at about 150 locations scattered along the chromosomes, e.g. marks at these loci:
Chromosome Position
3 817702
12 1556936
13 1131566
Ideally I would also like to be able to specify a few different colours depending on the locus type, e.g.
Chromosome Position Type
3 817702 A
12 1556936 A
13 1131566 A
5 1041685 B
11 488717 B
14 1776463 B
where 'A' is marked in blue and 'B' is marked in green, for example.
A very similar plot to what I would like to produce is pasted in this image (from Bopp et al. PLOS Genetics 2013;9(2):e1003293):
Can anyone recommend a way of doing this? It doesn't necessarily have to be a bioinformatics package; any way of using R to generate 14 bars of proportional sizes with markings at specified locations along the bars would do. For example, I've been thinking about modifying a simple bar chart from ggplot2, but I don't know how to put the markings along the bars at specific locations.
Just save your barplot call and then call segments to make the marks at an appropriate location. E.g.:
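# barplot() returns the x-midpoint of each bar, one row per chromosome
bp <- barplot(dat$size, border=NA, col="grey80", names.arg=dat$chromosome)
# map each Type to the colours asked for in the question
type_cols <- c(A="blue", B="green")
# indexing bp by Chromosome works here because chromosomes 1:14 are rows 1:14 of dat
with(marks,
  segments(
    bp[Chromosome,]-0.5,  # left edge of the bar (default bar width is 1)
    Position,
    bp[Chromosome,]+0.5,  # right edge of the bar
    Position,
    col=type_cols[as.character(Type)],
    lwd=2,
    lend=1                # butt line ends, so marks stop at the bar edges
  )
)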
Data used:
dat <- structure(list(chromosome = 1:14, size = c(640851L, 947102L,
1067971L, 1200490L, 1343557L, 1418242L, 1445207L, 1472805L, 1541735L,
1687656L, 2038340L, 2271494L, 2925236L, 3291936L)), .Names = c("chromosome",
"size"), class = "data.frame", row.names = c(NA, -14L))
marks <- structure(list(Chromosome = c(3L, 12L, 13L, 5L, 11L, 14L), Position = c(817702L,
1556936L, 1131566L, 1041685L, 488717L, 1776463L), Type = structure(c(1L,
1L, 1L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor")), .Names = c("Chromosome",
"Position", "Type"), class = "data.frame", row.names = c(NA,
-6L))
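If you would rather have the chromosomes running left to right, the same barplot/segments idea can be sketched as below. This is only a rough, self-contained variant, not the only way to do it: the horizontal layout, the legend, the blue/green mapping in type_cols, the match() lookup in bar_row and the out_file path are all my own assumptions for illustration.

```r
# Chromosome sizes and marked loci, copied from the question
dat <- data.frame(
  chromosome = 1:14,
  size = c(640851, 947102, 1067971, 1200490, 1343557, 1418242, 1445207,
           1472805, 1541735, 1687656, 2038340, 2271494, 2925236, 3291936)
)
marks <- data.frame(
  Chromosome = c(3, 12, 13, 5, 11, 14),
  Position   = c(817702, 1556936, 1131566, 1041685, 488717, 1776463),
  Type       = c("A", "A", "A", "B", "B", "B")
)

# Assumed colour scheme: type A in blue, type B in green
type_cols <- c(A = "blue", B = "green")

# Look up each mark's bar by chromosome value rather than by row position,
# so the code still works if dat is reordered
bar_row <- match(marks$Chromosome, dat$chromosome)

# Sanity checks: every mark maps to a known chromosome and lies within it
stopifnot(!anyNA(bar_row), all(marks$Position <= dat$size[bar_row]))

# Hypothetical output path; change to wherever you want the figure saved
out_file <- file.path(tempdir(), "chromosomes.pdf")
pdf(out_file, width = 8, height = 6)

# horiz = TRUE draws the bars left to right; barplot() returns their y-midpoints
bp <- as.vector(barplot(dat$size, horiz = TRUE, border = NA, col = "grey80",
                        names.arg = dat$chromosome, las = 1,
                        xlab = "Position (bp)", ylab = "Chromosome"))

# One vertical tick per locus, spanning the bar (default bar width is 1)
segments(x0 = marks$Position, y0 = bp[bar_row] - 0.5,
         x1 = marks$Position, y1 = bp[bar_row] + 0.5,
         col = type_cols[as.character(marks$Type)], lwd = 2, lend = 1)

legend("bottomright", legend = names(type_cols), col = type_cols,
       lwd = 2, bty = "n", title = "Type")
dev.off()
```

With the full set of about 150 loci you would only need to replace marks with your own data frame (same three columns) and add any extra types to type_cols.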