I have a numerical simulation that spits out a text file with a sequence of square matrices with complex entries. They look like this:
(#,#) (#,#) (#,#)
(#,#) (#,#) (#,#)
(#,#) (#,#) (#,#)
-----
(#,#) (#,#) (#,#)
(#,#) (#,#) (#,#)
(#,#) (#,#) (#,#)
-----
...
I want a fast way of reading these files and storing the matrices as numpy arrays in a list. I wrote the following Python code to try to solve this problem:
import numpy as np

with open(f'fort.{n}', 'r') as f:
    # split each line into whitespace-separated '(re,im)' tokens
    l = [line.split() for line in f]
# drop the '-----' separator lines between matrices
l = [line for line in l if line != ['-----']]
# strip the parentheses and split each entry into real and imaginary parts
l = [[num.strip('()').split(',') for num in line] for line in l]
l = [[float(re) + 1j * float(im) for re, im in line] for line in l]
Solutions = []
# collect every block of 160 consecutive rows into one 160x160 numpy array
for i in range(0, len(l), 160):
    Solutions.append(np.array(l[i:i + 160]))
but the output files contain hundreds of 160x160 matrices, and my program spends most of its running time reading them. I want to optimize this process but don't know how.
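One way that might cut the parsing time is to pull every '(re,im)' pair out of the file in a single pass with a regular expression and let numpy build the complex array in bulk, instead of looping over lines in Python. This is only a sketch; the helper name read_matrices and the hard-coded size of 160 are my own choices, not part of the original code:

import re
import numpy as np

def read_matrices(path, size=160):
    # read the whole file at once instead of iterating line by line
    with open(path) as f:
        text = f.read()
    # grab every '(re,im)' pair in one pass; each match is a pair of strings
    pairs = np.array(re.findall(r'\(([^,]+),([^)]+)\)', text), dtype=float)
    # combine the two columns into complex values and cut into size x size blocks
    data = pairs[:, 0] + 1j * pairs[:, 1]
    return list(data.reshape(-1, size, size))

This assumes the total number of entries is an exact multiple of size*size, which holds if the file contains only complete matrices as shown above.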
As @dankal444 pointed out in his comment, the .npy format is much better suited to what I want, which is transferring data from a Fortran program to a Python script that analyses it. With that in mind, I found a module that lets you write Fortran arrays as .npy files. It seems to be as fast as writing the arrays to unformatted binary files.
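On the Python side nothing special is needed to read such files: np.load returns the array directly, so the per-matrix text parsing disappears entirely. A minimal sketch, with a made-up filename:

import numpy as np

# np.load maps a .npy file straight to an ndarray; no text parsing involved
matrix = np.load('fort_1.npy')  # hypothetical filename written by the Fortran side
print(matrix.shape, matrix.dtype)  # e.g. (160, 160) complex128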