I get an error when I try to read a VCF file with the scikit-allel library inside a Docker image running Ubuntu 18.04. The error is:
raise RuntimeError('VCF file is missing mandatory header line ("#CHROM...")')
RuntimeError: VCF file is missing mandatory header line ("#CHROM...")
But the VCF file is well-formatted.
Here is the code I used:
import pandas as pd
import os
import numpy as np
import allel
import tkinter as tk
from tkinter import filedialog
import matplotlib.pyplot as plt
from scipy.stats import norm
GenomeVariantsInput = allel.read_vcf(
    'quartet_variants_annotated.vcf',
    samples=['ISDBM322015'],
    fields=['variants/CHROM', 'variants/ID', 'variants/REF',
            'variants/ALT', 'calldata/GT'],
)
Installed versions: Python 3.6.9, NumPy 1.19.5, pandas 1.1.5, scikit-allel 1.3.5
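You need to add a header line like this to the file, after any ## meta-information lines and before the first variant record:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003
The columns must be tab-separated. This header is not the same for every file: the sample names after FORMAT (here NA00001, NA00002, NA00003) have to match the samples in your file, so you need to write a header
like the one above for your own file. (I suggest trying this header first and customizing it if you still get an error.)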
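If you would rather not edit the file by hand, here is a minimal sketch of the same fix in Python. It assumes the only problem is the missing `#CHROM` line. The helper names (`has_chrom_header`, `build_header`, `add_missing_header`) are made up for illustration and are not part of scikit-allel, and you have to supply the sample names that match your file yourself.

```python
# The eight fixed VCF columns; FORMAT and sample names follow when samples exist.
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def has_chrom_header(lines):
    """Return True if any line starts with the mandatory '#CHROM' header."""
    return any(line.startswith("#CHROM") for line in lines)


def build_header(sample_names):
    """Build a tab-separated '#CHROM' header line for the given samples."""
    columns = list(FIXED_COLUMNS)
    if sample_names:
        columns.append("FORMAT")
        columns.extend(sample_names)
    return "\t".join(columns)


def add_missing_header(vcf_text, sample_names):
    """Insert a '#CHROM' header after the leading '##' meta lines if absent.

    Hypothetical helper: sample_names is an assumption you must supply;
    it has to match the sample columns actually present in the records.
    """
    lines = vcf_text.splitlines()
    if has_chrom_header(lines):
        return vcf_text  # nothing to fix

    # Skip the leading '##' meta-information lines.
    insert_at = 0
    while insert_at < len(lines) and lines[insert_at].startswith("##"):
        insert_at += 1

    lines.insert(insert_at, build_header(sample_names))
    return "\n".join(lines) + "\n"
```

To use it, read the text of your VCF, pass it through `add_missing_header` with your own sample names (for example `['ISDBM322015']` if that is the only sample column), write the result to a new file, and point `allel.read_vcf` at that new file. If the header is already present and you still get the error, check that the header line really is tab-separated and that the file is not empty or truncated inside the container.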