python pandas doctest

Doctest: How can I identify differences between expected and actual output?


In the doctest below, the expected and actual output look identical. I'm fairly sure the cause is a whitespace difference, because the test passes when I enable NORMALIZE_WHITESPACE. Is there another flag that helps pinpoint where such a whitespace difference occurs?

"""
>>> df  # doctest:-NORMALIZE_WHITESPACE
              Open
Date
2021-05-17  126.82
"""

import pandas as pd

df = pd.DataFrame({'Date': ['2021-05-17'], 'Open': [126.82]}).set_index('Date')

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Failed example:
    df  # doctest:-NORMALIZE_WHITESPACE
Expected:
                  Open
    Date
    2021-05-17  126.82
Got:
                  Open
    Date
    2021-05-17  126.82

Solution

  • Try the REPORT_NDIFF flag. The docs say, "This is the only method that marks differences within lines as well as across lines", so it helps locate whitespace problems. For example:

    """
    >>> print('Hello goodbye')
    Hello goodbye\r
    """
    
    Failed example:
        print('Hello goodbye')
    Differences (ndiff with -expected +actual):
        - Hello goodbye
        ?              -
        + Hello goodbye
    

    Note that the carriage return itself is not visible here; only the - on the ? line marks its position.
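
As a minimal sketch of the same idea, the snippet below runs a one-line doctest with REPORT_NDIFF enabled and captures the report as a string. The expected output is built in code with a trailing space (rather than a carriage return) so that an editor cannot strip it. The test name "trailing_space" and the file label "<example>" are made up for illustration.

```python
import doctest

# Expected output deliberately ends with one trailing space after "goodbye".
docstring = ">>> print('Hello goodbye')\nHello goodbye \n"

# Build a DocTest object directly from the string instead of from a module.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    docstring, globs={}, name="trailing_space", filename="<example>", lineno=0
)

# REPORT_NDIFF makes the failure report mark differences within each line.
runner = doctest.DocTestRunner(verbose=False, optionflags=doctest.REPORT_NDIFF)

# Collect the report pieces instead of writing them to stdout.
chunks = []
results = runner.run(test, out=chunks.append)
report = "".join(chunks)

print(report)
```

In the printed report, the line starting with ? carries a - under the extra trailing character. In a real module the flag can be passed as doctest.testmod(optionflags=doctest.REPORT_NDIFF) or set per example with # doctest: +REPORT_NDIFF.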


    In this particular case, the problem is that the repr of a DataFrame with a named index pads the index-name row with trailing whitespace, so the comparison fails on Date vs Date [...]. The trailing whitespace in the docstring was probably stripped by your editor (which is good, and NORMALIZE_WHITESPACE is the right tool to account for that). I'm not sure why it was also stripped from the "Got" output; maybe your terminal did that (which is not good), or maybe you pasted the output into your editor and then copied it from there to Stack Overflow.
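
One way to make such padding visible is to replace trailing whitespace with a printable marker before looking at the text. The helper below is a hypothetical sketch (the name mark_trailing_whitespace and the ~ marker are not part of doctest or pandas). The sample string is a hand-written stand-in for repr(df), assuming the index-name row is padded to the full table width.

```python
def mark_trailing_whitespace(text, marker="~"):
    """Return text with each trailing space or tab replaced by a visible marker."""
    marked = []
    for line in text.splitlines():
        stripped = line.rstrip(" \t")
        # One marker per stripped character keeps the original line width.
        marked.append(stripped + marker * (len(line) - len(stripped)))
    return "\n".join(marked)


# Hand-written stand-in for repr(df); the "Date" row carries trailing spaces.
sample = "              Open\nDate              \n2021-05-17  126.82"

print(mark_trailing_whitespace(sample))
```

With pandas installed, passing repr(df) instead of the stand-in string should show whether the real output carries the same padding.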