I have several SNVs:
#CHROM POS ID REF ALT
chr1 1000 . A C
chr1 2000 . T A
Can I get their genotype data (REF coverage, ALT coverage) from some new bam files? for example, bam files from 2 tumor samples:
tumor1.bam
tumor2.bam
I expect a VCF format result like below, other formats work too.
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DR,Number=1,Type=Integer,Description="Number of reference reads">
##FORMAT=<ID=DV,Number=1,Type=Integer,Description="Number of variant reads">
#CHROM POS ID REF ALT FORMAT tumor1.bam tumor2.bam
chr1 1000 . A C GT:DR:DV 0/1:20:10 0/1:30:5
chr1 2000 . T A GT:DR:DV 0/1:80:10 0/1:20:8
Any suggestions?
Thanks to @Pierre, one solution to this question was found.
bcftools mpileup can output genotypes in a vcf format:
bcftools mpileup -f ref.fa -R input.vcf tumor1.bam tumor2.bam -a FORMAT/AD
The output will be like this:
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths (high-quality bases)">
#CHROM POS ID REF ALT FORMAT tumor1.bam tumor2.bam
chr1 1000 . A C,<*> PL:AD xx:20,10,0 xx:30,5,0
chr1 2000 . T A,<*> PL:AD xx:80,10,0 xx:20,8,0