pandas numpy scikit-learn python-3.6 vcftools

VCF file is missing mandatory header line ("#CHROM...")


I get an error when I try to read a VCF file with the scikit-allel library inside a Docker image running Ubuntu 18.04. The error is:

raise RuntimeError('VCF file is missing mandatory header line ("#CHROM...")')
RuntimeError: VCF file is missing mandatory header line ("#CHROM...")

But the VCF file is well-formatted.

Here is the code I used:

import pandas as pd
import os
import numpy as np
import allel
import tkinter as tk
from tkinter import filedialog
import matplotlib.pyplot as plt
from scipy.stats import norm

GenomeVariantsInput = allel.read_vcf(
    'quartet_variants_annotated.vcf',
    samples=['ISDBM322015'],
    fields=['variants/CHROM', 'variants/ID', 'variants/REF',
            'variants/ALT', 'calldata/GT'],
)

Installed versions: Python 3.6.9, NumPy 1.19.5, pandas 1.1.5, scikit-allel 1.3.5


Solution

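  • You need to add a column header line like this at the top of the file, after any "##" meta-information lines and before the first data record:

    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003

    The fields must be separated by tabs. The line is not the same for every file: the first eight columns are fixed, but the sample names after FORMAT (NA00001 and so on here) have to match the sample columns in your own file. Your code selects the sample ISDBM322015, so that name has to appear among them. (I suggest trying this header first and customizing it if you still get an error.)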
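As a rough illustration of the fix above, here is a minimal sketch that checks whether a plain-text (not gzipped) VCF file has the "#CHROM" line and, if not, writes a copy with one inserted. The function names `has_chrom_header` and `add_chrom_header` are made up for this example and are not part of scikit-allel. The sample names you pass in are an assumption you have to supply yourself, in the same order as the genotype columns in the file.

```python
# Hypothetical helpers for illustration; not part of scikit-allel.
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def has_chrom_header(path):
    """Return True if a '#CHROM' line appears before the first data record."""
    with open(path, "r") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                return True
            if not line.startswith("#"):
                # Reached a data record without seeing the header line.
                return False
    return False


def add_chrom_header(src, dst, samples):
    """Copy src to dst, inserting a tab-separated '#CHROM' line after the
    '##' meta lines and before the first data record.

    samples: sample names in the order of the genotype columns (assumed
    to be known by the caller); pass an empty list for a sites-only file.
    """
    columns = list(FIXED_COLUMNS)
    if samples:
        columns += ["FORMAT"] + list(samples)
    header = "\t".join(columns) + "\n"

    handled = False
    with open(src, "r") as fin, open(dst, "w") as fout:
        for line in fin:
            if not handled and not line.startswith("##"):
                # First line that is not meta-information: the header
                # belongs here, unless the file already has one.
                if not line.startswith("#CHROM"):
                    fout.write(header)
                handled = True
            fout.write(line)
        if not handled:
            # File contained only '##' lines (or was empty).
            fout.write(header)
```

If `has_chrom_header('quartet_variants_annotated.vcf')` returns False, you could call `add_chrom_header` with your real sample names and then point `allel.read_vcf` at the new file. If it returns True, the header is already there and the problem is more likely the path or the file that the container actually sees.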