Tags: python, bioinformatics, vcf-variant-call-format, pyranges

Using the pyranges library, how do I check whether a chromosome position is contained in any interval?


I have a .vcf file containing variant information and a .bed file describing the regions studied. I am using the pyranges library to read the .bed file. I want to filter out all the variants in the .vcf file that lie within the studied intervals specified in the .bed file. Since pyranges exposes its data as a pandas DataFrame, I could iterate over each row and check whether my variant position is contained in it, but I am looking for an API that does this for me.

Example:

>>> import pandas as pd
>>> import pyranges as pr
>>> df = pd.DataFrame({"Chromosome": ["chr1", "chr2"], "Start": [100, 200],
...                    "End": [150, 201]})
>>> pr.PyRanges(df)

+--------------+-----------+-----------+
| Chromosome   |     Start |       End |
| (category)   |   (int32) |   (int32) |
|--------------+-----------+-----------|
| chr1         |       100 |       150 |
| chr2         |       200 |       201 |
+--------------+-----------+-----------+

Is there an API to check whether position 125 on chromosome "chr1" lies in any of the intervals? In the example above the answer would be True, because 125 lies in the interval 100 - 150.


Solution

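  • gr = pr.PyRanges(df)
    gr['chr1', 125:126]

    The lookup returns the intervals on chr1 that overlap the range 125:126, so a non-empty result means position 125 is contained in an interval.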
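
For readers who want to see what the containment check amounts to, here is a minimal library-independent sketch in plain Python. It is not the pyranges implementation. The helper names `build_index` and `contains` are hypothetical, and it assumes half-open, 0-based intervals as in BED files. VCF positions are 1-based, so a position taken directly from the POS column would presumably need 1 subtracted before the lookup.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(intervals):
    """Build a per-chromosome lookup from (chrom, start, end) tuples.

    Intervals are assumed to be half-open [start, end), as in BED files.
    Overlapping or touching intervals are merged so that a single binary
    search per query is enough.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    index = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        merged = []
        for start, end in spans:
            if merged and start <= merged[-1][1]:
                # Extend the previous interval instead of adding a new one.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def contains(index, chrom, pos):
    """Return True if pos falls inside any interval on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Rightmost interval whose start is <= pos, if any.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and pos < ends[i]


index = build_index([("chr1", 100, 150), ("chr2", 200, 201)])
print(contains(index, "chr1", 125))  # True: 125 lies in [100, 150)
print(contains(index, "chr1", 150))  # False: the end is exclusive
```

Applied to the intervals from the question, this agrees with the expected answer for position 125 on chr1. For whole-file filtering, the pyranges lookup above (or its overlap operations) is likely the better fit; the sketch only illustrates the interval semantics.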