This is a follow-up of sorts to this question about using NamedTemporaryFile().
I have a function that creates and writes to a temporary file. I then want to use that file in a different function, which calls a terminal command that uses that file (the program, blastn, is from the Blast+ suite).
    from subprocess import Popen
    from tempfile import NamedTemporaryFile

    def db_cds_to_fna(collection="genes"):  # collection gets data from mongoDB
        tmp_file = NamedTemporaryFile()
        # write stuff to file
        return tmp_file

    def blast_all(blast_db, collection="genes"):
        tmp_results = NamedTemporaryFile()
        db_fna = db_cds_to_fna(collection)  # should return another file object
        Popen(
            ['blastn',
             '-query', db_fna.name,
             '-db', blast_db,
             '-out', tmp_results.name,
             '-outfmt', '5']  # xml output
        )
        return tmp_results
When I call blast_all, I get an error from the blastn command:
Command line argument error: Argument "query". File is not accessible: `/var/folders/mv/w3flyjvn7vnbllysvzrf9y480000gn/T/tmpAJVWoz'
But just prior to the Popen call, if I do os.path.isfile(db_fna.name), it evaluates to True. I can also do
    print Popen(['head', db_fna.name]).communicate(0)
and it properly spits out the first lines of the file. So the file exists, and it's readable. Further, I use the same strategy to call a different program from the same Blast+ suite (makeblastdb, see question linked at the top) and it works. Is there possibly some problem with permissions? FWIW, blastn returns the same error if the file doesn't exist, but it seems clear that I'm correctly creating the file and that it's readable when I make the Popen call, so I'm stumped.
I believe I figured out the things conspiring to cause this behavior. First, Popen() does not wait for the external command to finish before proceeding past it. Second, as user glibdud mentioned in his answer to my other question, NamedTemporaryFile acts like TemporaryFile in that
It will be destroyed as soon as it is closed (including an implicit close when the object is garbage collected).
Since blast_all() does not return the query temp file (db_fna), the last reference to it disappears when the function returns. The file object is then closed and garbage collected while the external blastn command is still running, so the file is deleted out from under it. The head test did not hit this problem, presumably because communicate() waits for the process to finish (and head is very quick anyway), whereas blastn can take up to a couple of minutes to run.
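To make the deletion visible, here is a minimal sketch written for Python 3 (the code in the question is Python 2). The helper name make_temp_and_drop and the sample FASTA content are made up for illustration. It only shows that the file exists while the object is alive and is gone once the last reference is dropped:

```python
import gc
import os
from tempfile import NamedTemporaryFile


def make_temp_and_drop():
    """Create a temp file, return only its path, and let the object go out of scope."""
    tmp = NamedTemporaryFile()
    tmp.write(b">seq1\nACGT\n")  # hypothetical sample content
    tmp.flush()
    existed_inside = os.path.isfile(tmp.name)
    # Only the path (a string) is returned; the file object itself is dropped.
    return tmp.name, existed_inside


path, existed_inside = make_temp_and_drop()
gc.collect()  # CPython already reclaimed the object via refcounting; this covers other runtimes
exists_after = os.path.isfile(path)
print(existed_inside, exists_after)
```

This is the same situation as db_fna in blast_all(): the name string outlives the function, but the file it points to does not.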
So the solution is to force Popen() to wait:
    Popen(
        ['blastn',
         '-query', db_fna.name,
         '-db', blast_db,
         '-out', tmp_results.name,
         '-outfmt', '5']  # xml output
    ).wait()
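An alternative that makes the file's lifetime explicit is to keep the temp file inside a with block and use a blocking call. The following is only a sketch under some assumptions: it is Python 3.7+, run_with_temp_query is a hypothetical helper, and a small Python child process stands in for blastn so the example runs without Blast+ installed.

```python
import subprocess
import sys
from tempfile import NamedTemporaryFile


def run_with_temp_query(records, command_prefix):
    """Write records to a temp file and run a command on it, blocking until done.

    `command_prefix` is an argument list; the temp file path is appended to it.
    """
    with NamedTemporaryFile(mode="w", suffix=".fna") as query:
        query.writelines(records)
        query.flush()  # push buffered data to disk before the child reads it
        # subprocess.run() blocks until the child exits, so the file still
        # exists for the whole lifetime of the external command.
        result = subprocess.run(
            command_prefix + [query.name],
            capture_output=True,
            text=True,
            check=True,
        )
    # The temp file is deleted here, after the child has finished.
    return result.stdout


# Stand-in for blastn: a child process that prints the first line of the file.
reader = [sys.executable, "-c",
          "import sys; print(open(sys.argv[1]).readline().strip())"]
first_line = run_with_temp_query([">seq1\n", "ACGT\n"], reader)
print(first_line)
```

The flush() call matters for the same reason as the wait: data still sitting in Python's write buffer is not visible to the external program. Note that on Windows a NamedTemporaryFile generally cannot be opened a second time while it is still open, so this sketch assumes a POSIX system such as the macOS machine in the question.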