python pytest conftest

How to best structure conftest and fixtures across multiple pytest files


Let's say I have 3 lists of DataFrames containing different data that I want to run the same test cases on. How do I best structure my files and code so that I have one conftest.py (or some sort of parent class) that contains all the test cases that each list needs to run on, and 3 child classes that have different ways of generating each list of DataFrames but run the same test cases?

This is how I am currently constructing it.

from typing import Dict

import pandas as pd
import pytest

class TestOne:
    
    # this method usually takes 10 mins to run
    # so we want this to run once and use the same Dict for all test cases
    dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("one")

    def test_dfs_type(self):
        assert isinstance(self.dfs, dict)

    def test_another_one(self):
        assert ...

dfs will not be modified throughout the test suite, so I want to treat this like a setup.

TestTwo and TestThree are the same, except that they call get_list_of_dfs_somewhere("two") and get_list_of_dfs_somewhere("three").

Any tips on how to efficiently structure this would be appreciated!


Solution

  • If you need to run the same test case with different data, you can use pytest's parametrize marker. So, let's say this is your test:

    def test_dfs_type(data_frame):
        assert isinstance(data_frame, dict)
    

    And you need to run it 3 times, once for each set of DataFrames you have.
    To do that, you can put all the data you need into a list.
    But first, let's create the classes (I've simplified them a bit):

    # classes.py
    class ClassOne:
        # this method usually takes 10 mins to run
        # so we want this to run once and use the same Dict for all test cases
        # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("one")
        dfs: dict[str, str] = {'one': 'class one value'}
    
    
    class ClassTwo:
        # this method usually takes 10 mins to run
        # so we want this to run once and use the same Dict for all test cases
        # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("two")
        dfs: dict[str, str] = {'two': 'class two value'}
    
    
    class ClassThree:
        # this method usually takes 10 mins to run
        # so we want this to run once and use the same Dict for all test cases
        # dfs: Dict[str, pd.DataFrame] = get_list_of_dfs_somewhere("three")
        dfs: dict[str, str] = {'three': 'class three value'}
    

    Now, let's create the file with tests:

    # test_classes.py
    
    import pytest
    from classes import ClassOne, ClassTwo, ClassThree
    
    
    DATA_FRAMES = [ClassOne.dfs, ClassTwo.dfs, ClassThree.dfs]
    
    
    # Create a parameter "data_frame" that receives one object from the list on each test run.
    @pytest.mark.parametrize('data_frame', DATA_FRAMES)
    def test_dfs_type(data_frame):  # The test requests that parameter by naming it as an argument.
        print(data_frame)  # Print the data just to see what each run receives
        assert isinstance(data_frame, dict)
    

    The result is:

    >> pytest -v -s
    test_classes.py::test_dfs_type[data_frame0] {'one': 'class one value'}
    PASSED
    test_classes.py::test_dfs_type[data_frame1] {'two': 'class two value'}
    PASSED
    test_classes.py::test_dfs_type[data_frame2] {'three': 'class three value'}
    PASSED