I have a huge file of numbers in binary format, and only certain parts of it need to be parsed into an array. I looked into numpy.fromfile and open, but they don't seem to have an option to read from location A to location B in the file. Can this be done?
If you're dealing with "huge files", I would not read and discard everything up to the point where the data you need starts.
Instead: file objects in Python have a .seek() method, which you can use to jump straight to the position where you want to start parsing, efficiently bypassing everything before it.
with open('huge_file.dat', 'rb') as f:
    f.seek(1024 * 1024 * 1024)  # skip the first 1 GiB
    ...
See also: http://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects
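To also stop at location B, read only B - A bytes after seeking. Here is a minimal sketch using just the standard library; the helper name read_range, its parameters, and the assumption that the file is a raw dump of native-endian doubles (typecode "d") are mine for illustration, not something from the question.

```python
from array import array


def read_range(path, start, stop, typecode="d"):
    """Return the items stored in bytes [start, stop) of a raw binary file.

    Hypothetical helper: assumes the file is a flat dump of fixed-size
    numbers in native byte order, with `typecode` naming their type.
    """
    values = array(typecode)
    span = stop - start
    if start < 0 or span < 0 or span % values.itemsize:
        raise ValueError("byte range must be non-negative and hold whole items")
    with open(path, "rb") as f:
        f.seek(start)  # jump to location A without reading what precedes it
        values.fromfile(f, span // values.itemsize)  # read up to location B only
    return values
```

For example, read_range('huge_file.dat', 16, 40) would return the three doubles stored in bytes 16 through 39. If you want a NumPy array instead, the same pattern should work by seeking on the open file object and then calling numpy.fromfile(f, dtype=..., count=...), where count is the number of items between A and B.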