python pandas pytables

Filtering PyTables PerformanceWarning with warnings.filterwarnings() fails


There are a number of answers on this website detailing how to ignore specific warnings in Python (either by category or by providing a regex that matches the warning message).

However, none of these seem to work when I try to suppress PerformanceWarnings coming from PyTables.

Here's an MWE:

import pandas as pd 
import warnings 
from tables import NaturalNameWarning, PerformanceWarning

data = {
    'a' : 1,
    'b' : 'two'
} 
df = pd.DataFrame.from_dict(data, orient = 'index') # mixed types will trigger PerformanceWarning

dest = pd.HDFStore('warnings.h5', 'w') 

#dest.put('data', df) # mixed type will produce a PerformanceWarning
#dest.put('data 1', df) # space in 'data 1' will trigger NaturalNameWarning in addition to the PerformanceWarning

warnings.filterwarnings('ignore', category = NaturalNameWarning) # NaturalNameWarnings ignored 
warnings.filterwarnings('ignore', category = PerformanceWarning) # no effect
warnings.filterwarnings('ignore', message='.*PyTables will pickle') # no effect
#warnings.filterwarnings('ignore') # kills all warnings, not what I want

dest.put('data 2', df) # PerformanceWarning

dest.close()

Using a context manager doesn't help either:

with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=PerformanceWarning) # no effect
    warnings.filterwarnings('ignore', message='.*PyTables') # no effect
    dest.put('data 6', df)

Nor does using warnings.simplefilter() instead of warnings.filterwarnings().

Perhaps relevant, here is the PerformanceWarning:

test.py:21: PerformanceWarning: 
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed-integer,key->block0_values] [items->Int64Index([0], dtype='int64')]

  dest.put('data 2', df) # PerformanceWarning

Contrast this with the NaturalNameWarning, which doesn't come from the offending line in test.py, but from tables/path.py:

/home/user/.local/lib/python3.8/site-packages/tables/path.py:137: NaturalNameWarning: object name is not a valid Python identifier: 'data 2'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though
  check_attribute_name(name)

This is with tables 3.7.0 and Python 3.8.10. Any ideas?


Solution

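  • This may be confusing, but the PerformanceWarning here is emitted by pandas, not by the tables package. PyTables defines its own class with the same name, so the import in the question succeeds, but a filter on tables.PerformanceWarning does not match the pandas.errors.PerformanceWarning raised by dest.put().

    Import the category from pandas instead: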

    from pandas.errors import PerformanceWarning
    

    Example:

    import pandas as pd 
    import warnings 
    from tables import NaturalNameWarning
    from pandas.errors import PerformanceWarning
    
    data = {
        'a' : 1,
        'b' : 'two'
    } 
    df = pd.DataFrame.from_dict(data, orient = 'index')
    
    dest = pd.HDFStore('warnings.h5', 'w') 
    
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=PerformanceWarning)
        dest.put('data', df) # mixed types would trigger PerformanceWarning, now suppressed
        dest.put('data 1', df) # space in 'data 1' still triggers NaturalNameWarning
    
    dest.close()
    

    Only the NaturalNameWarning should remain in the above example.
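
As for the message-based filter in the question: a likely reason it had no effect is that the warning text begins with a newline (visible in the output quoted above), while the message regex is matched from the start of the text and '.' does not match a newline by default. The sketch below reproduces this with the standard library only. DemoPerformanceWarning, MESSAGE and count_emitted are hypothetical stand-ins made up for illustration, not pandas or PyTables names, and the message is only modelled on the one in the question.

```python
import warnings


class DemoPerformanceWarning(Warning):
    """Hypothetical stand-in for pandas.errors.PerformanceWarning."""


# Text modelled on the warning quoted in the question; note the leading newline.
MESSAGE = (
    "\nyour performance may suffer as PyTables will pickle object types "
    "that it cannot\nmap directly to c-types"
)


def count_emitted(pattern):
    """Return how many warnings get past an 'ignore' filter built from pattern."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")                     # record whatever is not ignored
        warnings.filterwarnings("ignore", message=pattern)  # checked before "always"
        warnings.warn(MESSAGE, DemoPerformanceWarning)
    return len(caught)


# '.' cannot cross the leading newline, so the filter never matches.
plain = count_emitted(".*PyTables will pickle")

# The inline (?s) flag lets '.' match newlines, so the warning is ignored.
dotall = count_emitted("(?s).*PyTables will pickle")

print(plain, dotall)
```

If this is indeed the cause, a pattern such as '(?s).*PyTables will pickle' should work with the real warning as well, although filtering on the pandas category as shown above is the simpler fix.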