python pytest seaborn

Testing the output of seaborn figure-level plots


I'm writing code that displays data with seaborn. I would like to write unit tests that check that the output of figure-level plots is generated properly.

I can test that the figure file is created, but how can I check that the output is the correct one? My idea would be to use a standard dataset and compare the output figure with a reference one. But how can I compare the two plots? I could calculate the checksum of the output figure and of the reference one, but would that test be reliable?

Thanks for your help!


Solution

  • Your code uses seaborn as a library, so in principle you shouldn't write tests for the functioning of seaborn itself.

    My idea would be to use a standard dataset and compare the output figure with a reference one. But how can I compare the two plots? I can calculate the checksum of the output figure and of the reference one, is this test accurate?

    I wouldn't recommend this. It could work in theory, but it makes for a brittle test: as far as I know, seaborn doesn't promise pixel-for-pixel identical image output that stays consistent across new releases. Also, what happens if there is some "randomness" in seaborn's implementation of a function, for example if it picked a random colour for a histogram when none is specified? Your tests would then fail unpredictably.
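    To illustrate why a checksum is stricter than it looks, here is a minimal sketch that uses only the standard library. It builds two tiny 1x1 PNG files by hand that hold the same pixel but a different `Software` text chunk, which, as far as I know, is the kind of metadata matplotlib writes into PNG files by default. The helper names (`png_chunk`, `tiny_png`, `pixel_data`) and the version strings are made up for the illustration; they are not part of seaborn or matplotlib.

```python
import hashlib
import struct
import zlib


def png_chunk(kind: bytes, payload: bytes) -> bytes:
    """Assemble one PNG chunk: length, type, payload, CRC."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def tiny_png(software: str) -> bytes:
    """Build a 1x1 grayscale PNG carrying a 'Software' text chunk."""
    signature = b"\x89PNG\r\n\x1a\n"
    # width=1, height=1, 8-bit depth, grayscale, default compression/filter/interlace
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixels = zlib.compress(b"\x00\x80")  # filter byte + one mid-grey pixel
    text = b"Software\x00" + software.encode("latin-1")
    return (signature + png_chunk(b"IHDR", ihdr) + png_chunk(b"tEXt", text)
            + png_chunk(b"IDAT", pixels) + png_chunk(b"IEND", b""))


def pixel_data(png: bytes) -> bytes:
    """Return the decompressed IDAT payload, ignoring every other chunk."""
    pos, data = 8, b""  # skip the 8-byte PNG signature
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        if kind == b"IDAT":
            data += png[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC
    return zlib.decompress(data)


# Hypothetical version strings, standing in for a library upgrade
old = tiny_png("Matplotlib version 3.7.0")
new = tiny_png("Matplotlib version 3.8.0")

same_pixels = pixel_data(old) == pixel_data(new)
same_checksum = hashlib.sha256(old).hexdigest() == hashlib.sha256(new).hexdigest()
```

    Here `same_pixels` is true while `same_checksum` is false, so a checksum test would report a failure even though the rendered image has not changed.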

    Generally, the approach is to trust that seaborn works properly and to write tests that check that you are calling seaborn correctly.

    Here is an example. Suppose you want to test this function:

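    import matplotlib.pyplot as plt
    import seaborn as sns

    def create_tip_distribution_plot(data):
        # Open a new figure so the histogram gets its own canvas
        plt.figure(figsize=(8, 6))
        # histplot is axes-level and returns the matplotlib Axes it drew on
        ax = sns.histplot(data['tip'], kde=True)
        ax.set_title('Tip Distribution')
        ax.set_xlabel('Tip Amount')
        ax.set_ylabel('Frequency')
        return ax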

    The test would just make sure that histplot is called with the expected parameters when your function is called.

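    import unittest
    from unittest.mock import patch

    import pandas as pd

    class TestTipDistributionPlot(unittest.TestCase):
        def test_seaborn_method_calls(self):
            test_data = pd.DataFrame({'tip': [1, 2, 3, 4, 5]})

            # Works when the code under test calls sns.histplot via the seaborn module
            with patch('seaborn.histplot') as mock_histplot:
                create_tip_distribution_plot(test_data)

            mock_histplot.assert_called_once()
            args, kwargs = mock_histplot.call_args
            # Compare the Series explicitly: == on a Series is element-wise,
            # so assert_called_once_with can raise when the objects differ
            pd.testing.assert_series_equal(args[0], test_data['tip'])
            self.assertEqual(kwargs, {'kde': True})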