bddgherkinacceptance-testing

Should BDD scenarios include actual test data, or just describe it?


We've come to a point where we've realised that there are two options for specifying test data when defining a typical CRUD scenario:

Option 1: Describe the data to use, and let the implementation define the data

Scenario: Create a region
    Given I have navigated to the "Create Region" page
      And I have typed in a valid name
      And I have typed in a valid code
    When I click the "Save" button
    Then I should be on the "Regions" page
     And the page should show the created region details

Option 2: Explicitly state the test data to use

Scenario: Create a region
    Given I have navigated to the "Create Region" page
      And I have filled out the form as follows
        | Label | Value  |
        | Name  | Europe |
        | Code  | EUR    |
    When I click the "Save" button
    Then I should be on the "Regions" page
     And the page should show the following fields
        | Name   | Code |
        | Europe | EUR  |

In terms of benefits and drawbacks, what we've established is that:

Option 1 nicely covers the case when the definition of say a "valid name" changes. This could be more difficult to deal with if we went with Option 2 where the test data is in several places. Option 1 explicitly describes what's important about the data for this test, especially if it were a scenario where we were saying something like "has typed in an invalid credit card number". It also "feels" more abstract and BDD somehow, being more concerned with description than implementation.

However, Option 1 uses very specific steps which would be hard to re-use. For example "the page should show the created region details" will probably only ever be used by this scenario. Conversely we could implement Option 2's "the page should show the following fields" in a way that it could be re-used many times by other scenarios.

I also think Option 2 seems more client-friendly, as they can see by example what's happening rather than having to interpret more abstract terms such as "valid". Would Option 2 be more brittle though? Refactoring the model might mean breaking these tests, whereas if the test data is defined in code the compiler will help us with model changes.

I appreciate that there won't be a right or wrong answer here, but would like to hear people's opinions on how they would decide which to use.

Thanks!


Solution

  • I would say it depends. There are times when a Scenario might require a large amount of data to complete a successful run. Often the majority of that data is not important to the thing we are actually testing and therefore becomes noise distracting from the understanding we are trying to achieve with the Scenario. I started using something I call a Default Data pattern to provide default data that can be merged with data specific to the Scenario. I have written about it here:

    I hope this helps.