python, pandas, memory, memory-management

How to concatenate multiple pandas.DataFrames without running into MemoryError


I have three DataFrames that I'm trying to concatenate.

concat_df = pd.concat([df1, df2, df3])

This results in a MemoryError. How can I resolve this?

Note that most of the existing similar questions are about MemoryErrors occurring when reading large files. I don't have that problem: I have already read my files into DataFrames. I just can't concatenate that data.
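One way to sidestep holding the full result in memory is to stream each frame to a single file on disk and read the combined data back (optionally in chunks). This is a minimal sketch, assuming three small stand-in DataFrames (`df1`, `df2`, `df3`) and a hypothetical output path `combined.csv`; with genuinely large data you would read the result back with `chunksize` instead of all at once.

```python
import pandas as pd

# Hypothetical small frames standing in for the three large DataFrames.
df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3, 4]})
df3 = pd.DataFrame({"a": [5, 6]})

out = "combined.csv"

# Append each frame to one CSV on disk instead of concatenating in memory.
# Only the first write includes the header; later writes append rows.
for i, df in enumerate([df1, df2, df3]):
    df.to_csv(out, mode="w" if i == 0 else "a", header=(i == 0), index=False)

# Read the combined data back (use chunksize=... here for very large files).
concat_df = pd.read_csv(out)
```

At any point during the loop only one DataFrame's rows are being written, so peak memory stays close to the size of the largest individual frame rather than the sum of all three.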


Solution

  • I'm grateful to the community for their answers. However, in my case, the problem turned out to be that I was using 32-bit Python.

    Windows defines per-process memory limits for 32-bit and 64-bit processes. For a 32-bit process, the limit is only 2 GB. So even if your machine has more than 2 GB of RAM, and even if you're running a 64-bit OS, a 32-bit process is still limited to about 2 GB of address space. In my case, that process was Python.

    I upgraded to 64-bit Python, and haven't had a MemoryError since!
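    If you're unsure which interpreter you're running, a quick sketch to check the bitness of the current Python process (pointer size and `sys.maxsize` are both standard ways to tell):

    ```python
    import struct
    import sys

    # Size of a C pointer in this interpreter, in bits: 32 or 64.
    bits = struct.calcsize("P") * 8

    # Equivalent check: sys.maxsize exceeds 2**32 only on 64-bit builds.
    is_64bit = sys.maxsize > 2**32

    print(f"Running {bits}-bit Python")
    ```

    If this reports 32-bit, the ~2 GB limit applies regardless of how much RAM is installed.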

    Other relevant questions are: Python 32-bit memory limits on 64bit windows, Should I use Python 32bit or Python 64bit, Why is this numpy array too big to load?