
Splitting multiallelic variants into biallelic ones in a VCF with PLINK 1.9, and fixing the resulting variant names


I am trying to use PLINK 1.9 to split multiallelic variants into biallelic ones. The input is:

1       chr1:930939:G:A 0       930939  G       A
1       chr1:930947:G:A 0       930947  A       G
1       chr1:930952:G:A;chr1:930952:G:C 0       930952  A       G

What it produces is:

1       chr1:930939:G:A 0       930939  G       A
1       chr1:930947:G:A 0       930947  A       G
1       chr1:930952:G:A;chr1:930952:G:C 0       930952  A       G
1       chr1:930952:G:A;chr1:930952:G:C 0       930952  A       G

What I expect is:

1       chr1:930939:G:A 0       930939  G       A
1       chr1:930947:G:A 0       930947  A       G
1       chr1:930952:G:A 0       930952  A       G
1       chr1:930952:G:C 0       930952  A       G

Please help me produce a VCF, PED, or MAP file that looks like what I expect. Thank you.


Solution

  • I used bcftools to complete the task.

    https://github.com/samtools/bcftools/issues/1193
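
    If the duplicated rows are already in a PLINK-style variant file like the one in the question, the IDs could also be fixed afterwards. Below is a minimal Python sketch of that idea, not the bcftools approach from the linked issue. It assumes whitespace-separated rows with the variant ID in the second column, IDs joined by ";", and duplicated rows that are adjacent and in the same order as the joined IDs. The function name split_joined_ids is hypothetical and not part of PLINK or bcftools. It only renames the ID column and leaves the allele columns untouched.

```python
from itertools import groupby


def split_joined_ids(lines, sep=";"):
    """Give each row in a run of duplicated rows its own variant ID.

    Assumption: a joined ID such as "a;b" appears on as many adjacent
    rows as it has parts. Rows that do not fit this pattern are kept as-is.
    """
    # Split on any whitespace; skip blank lines.
    rows = [line.split() for line in lines if line.strip()]
    fixed = []
    # Group adjacent rows that share the same (possibly joined) ID.
    for joined_id, group in groupby(rows, key=lambda r: r[1]):
        group = list(group)
        parts = joined_id.split(sep)
        if len(parts) > 1 and len(parts) == len(group):
            # Assign the k-th part of the joined ID to the k-th row.
            for row, part in zip(group, parts):
                fixed.append([row[0], part] + row[2:])
        else:
            fixed.extend(group)
    # Output is tab-separated.
    return ["\t".join(row) for row in fixed]


# Rows taken from the question's "what it produces" listing.
bim = """\
1 chr1:930939:G:A 0 930939 G A
1 chr1:930947:G:A 0 930947 A G
1 chr1:930952:G:A;chr1:930952:G:C 0 930952 A G
1 chr1:930952:G:A;chr1:930952:G:C 0 930952 A G
"""
for line in split_joined_ids(bim.splitlines()):
    print(line)
```

    Rows whose joined ID does not occur exactly as many times as it has parts are passed through unchanged, so the sketch does not guess in ambiguous cases.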