windows pandas sum overflow int64

Pandas sum of column overflow on Windows


I'm developing an app that must run on a Windows Server (2012 R2) machine. When I run it locally (Windows 7) the results look fine, but on the server I get negative results where they should be positive when calling:

DataFrame.column.sum()

I read that this is caused by a bug involving Python 2.7 and some Windows versions.
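To illustrate how a positive total can come back negative, here is a minimal sketch that uses only the standard library. It assumes the sum is being squeezed into a signed 32-bit integer somewhere along the way, which matches the values shown in the solution below but is not confirmed as the exact mechanism. `wrap_int32` is a hypothetical helper name, not part of pandas.

```python
import ctypes


def wrap_int32(value):
    """Reinterpret an integer as a signed 32-bit value, discarding higher bits.

    Hypothetical helper for illustration only; it mimics what happens when
    a result is stored in a 32-bit signed integer.
    """
    return ctypes.c_int32(value).value


# 2**31 is one more than the largest signed 32-bit integer (2**31 - 1),
# so it wraps around to the most negative 32-bit value.
print(wrap_int32(2**31))      # -2147483648
print(wrap_int32(2**31 - 1))  # 2147483647 (still fits, unchanged)
```

If the column totals stay below 2**31 on the local machine's data but exceed it on the server's data, that alone could explain why only the server shows negative results.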

The problem is that I call pandas.col.sum() in many places in the code, and a few of the columns have dtype int64. Is there a way to solve this? Maybe by changing the dtype when I read the DataFrame?


Solution

  • I found a workaround based on this answer: turn off bottleneck in pandas.core.nanops, after which sum() returns the correct value:

    In [1]: import pandas as pd
    
    In [2]: s = pd.Series([2**31])
    
    In [3]: s.sum()
    Out[3]: -2147483648
    
    In [4]: from pandas.core import nanops
    
    In [5]: nanops._USE_BOTTLENECK
    Out[5]: True
    
    In [6]: nanops._USE_BOTTLENECK = False
    
    In [7]: s.sum()
    Out[7]: 2147483648
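
The session above flips a private flag (`nanops._USE_BOTTLENECK`). Newer pandas versions expose the same switch as a public option, `compute.use_bottleneck`. The sketch below assumes your installed pandas has that option, so check your version first. Setting it once at program start should cover every later `.sum()` call, so the individual call sites would not need to change.

```python
import pandas as pd

# Assumption: the installed pandas provides the public option
# "compute.use_bottleneck"; on older versions fall back to the
# private nanops flag shown in the session above.
pd.set_option("compute.use_bottleneck", False)

# Same data as in the session above, with the dtype made explicit.
s = pd.Series([2**31], dtype="int64")

total = s.sum()
print(total)
```

With bottleneck disabled, pandas falls back to its own NumPy-based reduction for the sum.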