I am developing code with a compiled language (Fortran 95) that does certain calculations on a huge galaxy catalog. Each time I implement some change, I compile and run the code, and it takes about 3 minutes just reading the ASCII file with the galaxy data from disk. This is a waste of time.
Had I started this project in IDL or Matlab, then it would be different, because the variables containing the array data would be kept in memory between different compilations.
However, I think something could be done to speed up that unnerving reading from disk, like having the files in a fake RAM partition or something.
Instead of going into details on RAM disks I propose you switch from ASCII databases to Binary ones. here is a very simplistic example... An array of random numbers, stored as ASCII (ASCII.txt) and as binary date (binary.bin):
program writeArr
use,intrinsic :: ISO_Fortran_env, only: REAL64
implicit none
real(REAL64),allocatable :: tmp(:,:)
integer :: uFile, i
allocate( tmp(10000,10000) )
! Formatted read
open(unit=uFile, file='ASCII.txt',form='formatted', &
status='replace',action='write')
do i=1,size(tmp,1)
write(uFile,*) tmp(:,i)
enddo !i
close(uFile)
! Unformatted read
open(unit=uFile, file='binary.bin',form='unformatted', &
status='replace',action='write')
write(uFile) tmp
close(uFile)
end program
Here is the result in terms of sizes:
:> ls -lah ASCII.txt binary.bin
-rw-rw-r--. 1 elias elias 2.5G Feb 20 20:59 ASCII.txt
-rw-rw-r--. 1 elias elias 763M Feb 20 20:59 binary.bin
So, you save a factor of ~3.35 in terms of storage. Now comes the fun part: reading it back in...
program readArr
use,intrinsic :: ISO_Fortran_env, only: REAL64
implicit none
real(REAL64),allocatable :: tmp(:,:)
integer :: uFile, i
integer :: count_rate, iTime1, iTime2
allocate( tmp(10000,10000) )
! Get the count rate
call system_clock(count_rate=count_rate)
! Formatted write
open(unit=uFile, file='ASCII.txt',form='formatted', &
status='old',action='read')
call system_clock(iTime1)
do i=1,size(tmp,1)
read(uFile,*) tmp(:,i)
enddo !i
call system_clock(iTime2)
close(uFile)
print *,'ASCII read ',real(iTime2-iTime1,REAL64)/real(count_rate,REAL64)
! Unformatted write
open(unit=uFile, file='binary.bin',form='unformatted', &
status='old',action='read')
call system_clock(iTime1)
read(uFile) tmp
call system_clock(iTime2)
close(uFile)
print *,'Binary read ',real(iTime2-iTime1,REAL64)/real(count_rate,REAL64)
end program
The result is
ASCII read 37.250999999999998
Binary read 1.5460000000000000
So, a factor of >24!
So instead of thinking of anything else, please switch to a binary file format first.