I need to extract the string RS=368138379 from lines like the following in a VCF file with a few thousand million lines. How can I use grep -o with a regular expression to extract it quickly?
AF_ESP=0.0001;ALLELEID=359042;CLNDISDB=MedGen:C0678202,OMIM:266600;CLNDN=Inflammatory_bowel_disease_1;CLNHGVS=NC_000006.11:g.31779521C>T;CLNREVSTAT=no_assertion_criteria_provided;CLNSIG=association;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=HSPA1L:3305;MC=SO:0001583|missense_variant;ORIGIN=4;RS=368138379
Thanks very much indeed.
Let's say test.log contains your data. You can use:
grep -oE "RS=[0-9]+" test.log
If you also want to print the line numbers:
grep -noE "RS=[0-9]+" test.log
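On a file this large, two small variations may help. This is only a sketch under assumptions: sample.vcf is a hypothetical stand-in for your real file, and the speed-up from LC_ALL=C depends on your grep build and locale, so it is worth timing on a slice of the data first.

```shell
# Hypothetical two-line sample standing in for the real VCF
printf '%s\n' \
  'GENEINFO=HSPA1L:3305;MC=SO:0001583|missense_variant;ORIGIN=4;RS=368138379' \
  'AF_ESP=0.0001;ALLELEID=359042;ORIGIN=1' > sample.vcf

# LC_ALL=C makes grep treat the input as plain bytes, which is often
# faster than matching in a UTF-8 locale
LC_ALL=C grep -oE 'RS=[0-9]+' sample.vcf

# Keep only the number by cutting off everything up to the "="
LC_ALL=C grep -oE 'RS=[0-9]+' sample.vcf | cut -d= -f2
```

Lines without an RS field, like the second sample line, produce no output at all, so the result has one line per match rather than one line per input record.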