python r circlize circos

Software recommendation for circos plot with discrete axis


I would like to make a Circos-like plot to visualize SNPs only (with multiple tracks for SNP attributes). It could be done in Python or R, and I am happy to consider other languages.

So far, I have taken a look at the circlize R package. However, I get the error "Range of the sector ('C') cannot be 0" when initializing the circos plot. I believe this error arises because I have discrete data (SNPs) rather than data for all positions, or perhaps because some of my data points are repeated.

I have simplified my data below and show the code that I have tried so far:

Sample  Gene  Pos      read_depth  Freq
1       A     20394    43          99
1       B     56902    24          99
2       A     20394    50          99
2       B     56902    73          99
3       A     20394    67          50
3       B     56902    20          99
3       C     2100394  21          50

install.packages("circlize")
library(circlize)
data <- read.table("test_circos.txt", sep='\t', header=TRUE)
circos.par("track.height" = 0.1)
circos.initialize(factors = data$Gene, x = data$Pos)
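
One way to test the hypotheses about the error is to look at the position range within each gene. The following is a minimal Python sketch, assuming that circlize derives each sector's x-range from the minimum and maximum of `x` when no explicit limits are supplied (which is what the error message suggests). The helper name `sector_ranges` is hypothetical and the rows are simply the sample data copied in by hand.

```python
from collections import defaultdict

# Sample rows copied from the table above:
# (Sample, Gene, Pos, read_depth, Freq)
rows = [
    (1, "A", 20394, 43, 99),
    (1, "B", 56902, 24, 99),
    (2, "A", 20394, 50, 99),
    (2, "B", 56902, 73, 99),
    (3, "A", 20394, 67, 50),
    (3, "B", 56902, 20, 99),
    (3, "C", 2100394, 21, 50),
]


def sector_ranges(rows):
    """Return {gene: max(Pos) - min(Pos)} for every gene (hypothetical helper)."""
    positions = defaultdict(list)
    for _sample, gene, pos, _depth, _freq in rows:
        positions[gene].append(pos)
    return {gene: max(p) - min(p) for gene, p in positions.items()}


ranges = sector_ranges(rows)
for gene, width in ranges.items():
    print(gene, width)
```

A gene whose SNPs all share a single position (or that has only one SNP) ends up with a range of 0. In the simplified data this is true of every gene, which fits the "repeated data points" explanation better than the "discrete data" one.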

I would like to know whether it is possible to get a Circos-like plot where each of my data points (7 in my example) is plotted as an individual point, with nothing plotted in between, as on a discrete axis.


Solution

  • If it is of interest to anyone, I decided to do as follows:

    1. Number the data points within each category (= 'Gene') and store the result in a new column 'Number':
    Sample  Gene  Pos      read_depth  Freq  Number
    1       A     20394    43          99    1
    1       B     56902    24          99    1
    2       A     20394    50          99    2
    2       B     56902    73          99    2
    3       A     20394    67          50    3
    3       B     56902    20          99    3
    3       C     2100394  21          50    1
    
    2. Design the Circos karyotype (config) file as follows (the header line is not included in the real file):
    chr - ID  LABEL START END COLOUR
    chr - A   A     0     3   chr1
    chr - B   B     0     3   chr2
    chr - C   C     0     1   chr3
    

    This means that each gene has a length equal to the number of SNPs identified in it, and that each bp of a gene represents one line (= one SNP) in my SNP file.
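
    The two steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `number_per_gene` and `karyotype_lines` are hypothetical, the rows are the sample data typed in by hand, and the colour names `chr1`, `chr2`, ... simply continue the pattern used in the file above.

```python
# Sample rows from the question:
# (Sample, Gene, Pos, read_depth, Freq)
rows = [
    (1, "A", 20394, 43, 99),
    (1, "B", 56902, 24, 99),
    (2, "A", 20394, 50, 99),
    (2, "B", 56902, 73, 99),
    (3, "A", 20394, 67, 50),
    (3, "B", 56902, 20, 99),
    (3, "C", 2100394, 21, 50),
]


def number_per_gene(rows):
    """Append a 1-based running index per gene (the 'Number' column).

    Returns the numbered rows and the final SNP count for each gene.
    """
    counts = {}
    numbered = []
    for row in rows:
        gene = row[1]
        counts[gene] = counts.get(gene, 0) + 1
        numbered.append(row + (counts[gene],))
    return numbered, counts


def karyotype_lines(counts):
    """Build one 'chr - ID LABEL START END COLOUR' line per gene.

    END is the number of SNPs in the gene; the colour naming is an assumption.
    """
    return [
        f"chr - {gene} {gene} 0 {n} chr{i}"
        for i, (gene, n) in enumerate(counts.items(), start=1)
    ]


numbered, counts = number_per_gene(rows)
print("\n".join(karyotype_lines(counts)))
```

    Genes appear in the karyotype in the order they are first seen in the SNP file, so the file should be sorted beforehand if a particular order is wanted.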

    I can then use circos as normal.
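
    For the tracks themselves, the numbered SNPs have to be written out as data files. The sketch below assumes the usual four-column layout of Circos 2D data tracks (chromosome ID, start, end, value) and, as my own assumption, maps the SNP with 'Number' n to the interval from n-1 to n on its gene. The function name `track_lines` is hypothetical.

```python
# Rows already numbered per gene:
# (Sample, Gene, Pos, read_depth, Freq, Number)
numbered = [
    (1, "A", 20394, 43, 99, 1),
    (1, "B", 56902, 24, 99, 1),
    (2, "A", 20394, 50, 99, 2),
    (2, "B", 56902, 73, 99, 2),
    (3, "A", 20394, 67, 50, 3),
    (3, "B", 56902, 20, 99, 3),
    (3, "C", 2100394, 21, 50, 1),
]


def track_lines(numbered, value_index):
    """Format rows as 'ID START END VALUE' lines for one attribute.

    Assumption: SNP number n occupies the interval [n - 1, n] on its gene.
    """
    lines = []
    for row in numbered:
        gene, number = row[1], row[5]
        lines.append(f"{gene} {number - 1} {number} {row[value_index]}")
    return lines


depth_track = track_lines(numbered, value_index=3)  # read_depth column
freq_track = track_lines(numbered, value_index=4)   # Freq column
print("\n".join(depth_track))
```

    Each attribute then gets its own file and its own track in the plot.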

    In the end, I chose Circos because it seemed the best documented, and therefore the easiest to learn, and because it also appeared more flexible.