python, excel, large-data, large-data-volumes, data-management

Splitting large data file in python


I output a large array into a text file in Python, then read it in Excel to plot the data. Currently, the file I am writing is too large for Excel to open.

I open the file, write the array data, and close it (please refer to the code):

with open("abc.txt", "w") as file:
    file.write(str(abc_value))

Question: How can I split the data file so that after approximately 1,000,000 values the current file is closed and writing continues in a new file?

At the end, there should be multiple data files that I can open in Excel separately.

Any leads much appreciated!


Solution

  • I am not sure what `type(abc_value)` is originally, but if you can supply it as an array (or list), this code should work:

    counter = 1
    for i in range(0, len(abc_value), 1000000):
        with open(f"abc{counter}.txt", "w") as file:
            for val in abc_value[i:i + 1000000]:
                file.write(str(val) + "\n")  # one value per line so Excel reads rows
        counter += 1
    

    The main idea is simply to slice your original data into chunks of 1,000,000 values and open a different file for each slice inside the for loop. Note that the `with` statement closes each file automatically, so no explicit `file.close()` is needed.

    Output files should be "abc1.txt","abc2.txt",...

    Hope I understood your question correctly and that this answers it.
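    For completeness, here is a minimal sketch of the same chunked-writing idea wrapped in a reusable function. It works with any iterable (not just a list), uses `itertools.islice` to take fixed-size chunks, and returns the filenames it created. The function name `write_in_chunks` and the `prefix` parameter are my own naming choices, not anything from the original post:

    ```python
    import itertools

    def write_in_chunks(values, chunk_size, prefix="abc"):
        """Write `values` (any iterable) to numbered text files,
        `chunk_size` values per file, one value per line.
        Returns the list of filenames created."""
        it = iter(values)
        filenames = []
        for counter in itertools.count(1):
            # Take the next chunk_size values; an empty chunk means we're done.
            chunk = list(itertools.islice(it, chunk_size))
            if not chunk:
                break
            name = f"{prefix}{counter}.txt"
            with open(name, "w") as f:
                for val in chunk:
                    f.write(f"{val}\n")  # newline-separated so Excel sees rows
            filenames.append(name)
        return filenames
    ```

    Because it consumes an iterator chunk by chunk, this variant also works when the data is generated on the fly and never held in memory all at once, e.g. `write_in_chunks(simulation_results(), 1000000)`.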