I am using the R package circlize to create a circos plot.
I am aiming to create something similar to Figure 2 in this paper: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004812.
I would like to custom specify where to shade parts of the chromosomes with different, manually entered colours, but I am struggling.
Reproducible code:
### load packages
library("tidyverse")
library("circlize")
### Generate mock data
# Chromosome sizes - genome with 5 chromosomes size 1-5kb
chrom <- c(1,2,3,4,5)
start <- c(0,0,0,0,0)
end <- c(1000,1700,2200,3100,5000)
chr_sizes_df <- data.frame(chrom,start,end)
# Areas of interest - where I want 'shade_col' shading
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
chr_regions_df <- data.frame(chr,start,end)
# Recombinations - to be depicted with lines connecting chromosomes
chr1 <- c(1,2,2,3,3,3,3,4,4,5,5,5,5)
chr1_pos <- c(100,150,170,20,2100,900,950,200,3000,100,3100,3300,4900)
chr2 <- c(1,4,2,1,3,3,5,5,4,3,5,4,2)
chr2_pos <- c(100,3000,170,100,100,900,3200,4800, 3050,10,3100,3300,40)
location <- c("Non coding", "Coding", "Non coding", "Non coding", "Coding", "Coding", "Coding", "Non coding", "Non coding", "Non coding", "Coding", "Coding", "Non coding")
sv_df <- data.frame(chr1,chr1_pos,chr2,chr2_pos,location)
# SNPs - to be depicted with dots or lines
chrom <- c(1,1,2,2,2,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5)
pos <- c(350,600,200,650,700,300,1100,1500,2000,400,1500,1800,2000,2700,200,1000,1050,2000,2500,4950)
snp_df <- data.frame(chrom,pos)
### Prepare for plot
# Generate colour scheme
sv_df$location_col <- ifelse(sv_df$location=="Coding", "#FB8072",
ifelse(sv_df$location=="Non coding", "#80B1D3",
"#e9e9e9")
)
# Specify chromosome block shading
shade_col <- "#3F75AB"
# Format rearrangement data
nuc1 <- sv_df %>% select(chr1,chr1_pos) # Start positions
nuc2 <- sv_df %>% select(chr2,chr2_pos) # End positions
### Generating plot
## Basic circos graphic parameters
circos.clear()
circos.par(cell.padding=c(0,0,0,0),
track.margin=c(0,0.05),
start.degree = 90,
gap.degree = 3,
clock.wise = TRUE)
## Sector details
circos.initialize(factors = chr_sizes_df$chrom,
xlim = cbind(chr_sizes_df$start, chr_sizes_df$end))
## Generate basic outline with chromosomes
circos.track(ylim=c(0, 1), panel.fun=function(x, y) {
chr=CELL_META$sector.index
xlim=CELL_META$xlim
ylim=CELL_META$ylim
circos.text(mean(xlim), mean(ylim), chr)
},bg.col="#cde3f9", bg.border=TRUE, track.height=0.1)
## Add recombinations - coloured by coding vs non-coding etc
circos.genomicLink(nuc1, nuc2,
col=sv_df$location_col,
h.ratio=0.6,
lwd=3)
The above code produces the plot shown below:
I want to use chr_regions_df to specify the chromosome areas for shading using shade_col. Have tried a few things - draw.sector doesn't work well because it requires to know the angles rather than positions, which is hard to work out. There are cytoband options using circos.initializeWithIdeogram() but this seems to use pre-specified cytoband formats for certain species, rather than custom made areas for shading as in my use case (also why I couldn't use supplying user defined color in r circlize package).
Many thanks for your help.
To draw custom colored areas within chromosomes, use circos.genomicTrackPlotRegion
, where you need to provide a bed-like data frame with an additional column specifying the color to be used for each area.
#the first column should match the chromosome names used in 'circos.initialize'
chrom_num <- c(1,1,2,2,3,3,3,4,4,5,5,5)
#chr <- c("chr1","chr1","chr2","chr2","chr3","chr3","chr3","chr4","chr4","chr5","chr5","chr5")
start <- c(0,900,0,1550,0,800,2000,0,2800,0,3000,4800)
end <- c(150,1000,185,1700,210,1000,2200,300,3100,400,3300,5000)
shade_col <- c("blue","red","blue","red","blue","red","blue","red","blue","red","blue","red")
chr_regions_df <- data.frame(chrom_num,start,end,shade_col)
After running circos.initialize
, draw the chromosomes with their shaded area. In panel.fun
, the first argument (region) contains the coordinates of each feature while the second (value) contains all but the first 3 columns of the data frame.
circos.genomicTrackPlotRegion(chr_regions_df, ylim = c(0, 1),
panel.fun = function(region, value, ...) {
col = value$shade_col
circos.genomicRect(region, value,
ybottom = 0, ytop = 1,
col = col, border = NA)
xlim = get.cell.meta.data("xlim")
circos.rect(xlim[1], 0, xlim[2], 1, border = "black")
ylim = get.cell.meta.data("ylim")
chr = get.current.sector.index()
circos.text(mean(xlim), mean(ylim), chr)
}, bg.col = "#cde3f9", bg.border=TRUE, track.height=0.1)