
Looping over an R function


I have a function in R which receives a .vcf file and parses it.

For instance

x <- "file1.vcf"

file1.parsed <- parse_vcf_alt1(x)

I have 177 .vcf files in a folder.

They look like this file

https://www.dropbox.com/s/lcdmk57sy3dxexp/file1.vcf?dl=0

I want to feed each of these .vcf files, one by one, into the parse_vcf_alt1 function and obtain a parsed file from that.

Manually doing this is a pain.

How can I automate this in R?

This code gives output

lapply(dir(), function(f) { if (grepl('vcf', f)) { parse_vcf_alt1(f) }})

but I don't know how to save or write output for each parsed vcf separately with its own name.

> dput(frame)

structure(list(), .Names = character(0), row.names = integer(0), class = "data.frame")

Solution

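  • parsed <- list()
    for (f in dir(pattern = '\\.vcf$')) {
      parsed[[f]] <- parse_vcf_alt1(f)            # keep each result under its file name
      write.csv(parsed[[f]], paste0(f, '.csv'))   # one output file per input file
    }

    dir returns the list of files, and its pattern argument keeps only the names ending in .vcf, so no separate grep check is needed. Each file is passed to parse_vcf_alt1 and the result is stored in the list parsed under the file name, after which it can be retrieved using, say, parsed[['file1.vcf']] or whatever the filename happens to be. Finally, write.csv writes each result to its own file in the current directory, for example file1.vcf.csv; this assumes parse_vcf_alt1 returns a data frame or something write.csv can coerce to one. A plain for loop is used instead of assigning to frame[f] inside the function passed to lapply, because an assignment made there only changes a local copy, which is why dput(frame) in the question shows an empty data frame.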
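    The same idea can also be wrapped in a reusable function. The following is only a sketch under stated assumptions: the real parse_vcf_alt1 is not shown in the question, so a hypothetical stand-in is defined here that reads the non-header lines of a file into a data frame. The names parse_all_vcf, in_dir, out_dir and the .parsed.csv suffix are made up for illustration, and the two demo files are simplified, invented data.

```r
# Hypothetical stand-in for the asker's parse_vcf_alt1(); the real function
# is not shown in the question. It reads the tab-separated body of a file
# and skips every line that starts with '#'.
parse_vcf_alt1 <- function(path) {
  read.table(path, sep = "\t", comment.char = "#", quote = "",
             stringsAsFactors = FALSE)
}

# Parse every .vcf file in in_dir and write one CSV per input to out_dir.
parse_all_vcf <- function(in_dir = ".", out_dir = in_dir) {
  # Full paths of all files whose names end in .vcf
  files <- list.files(in_dir, pattern = "\\.vcf$", full.names = TRUE)

  # Parse each file and name each result after its source file
  parsed <- lapply(files, parse_vcf_alt1)
  names(parsed) <- basename(files)

  # file1.vcf is written to file1.parsed.csv, and so on
  for (f in names(parsed)) {
    out <- file.path(out_dir,
                     paste0(tools::file_path_sans_ext(f), ".parsed.csv"))
    write.csv(parsed[[f]], out, row.names = FALSE)
  }
  invisible(parsed)
}

# Demo on two tiny invented files in a temporary directory
demo_dir <- tempfile("vcf_demo")
dir.create(demo_dir)
vcf_lines <- c("##fileformat=VCFv4.2",
               "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
               "1\t100\t.\tA\tG\t50\tPASS\t.",
               "1\t200\t.\tC\tT\t60\tPASS\t.")
writeLines(vcf_lines, file.path(demo_dir, "file1.vcf"))
writeLines(vcf_lines, file.path(demo_dir, "file2.vcf"))

parsed <- parse_all_vcf(demo_dir)
```

    With the real parser in place of the stand-in, calling parse_all_vcf() in the folder holding the 177 files should leave one CSV next to each input, and the returned list still gives access to every parsed result by file name, e.g. parsed[["file1.vcf"]].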