My application, among other things, uses some crawlers to read information exposed by another application (which we're not responsible for) through a remote XML feed. Crawled data is later displayed to the user. The XML might contain simple data and links, which we follow when we need additional data.
The tests in our system are both unit tests, which verify that we parse the XML documents correctly, and acceptance tests, which verify what we display in our UI.
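For illustration, one of our parser unit tests looks roughly like this (`parse_feed` and the record fields are placeholder names, not our real API):

```python
import unittest
from myapp.feed import parse_feed  # hypothetical module and function

SAMPLE_XML = """\
<feed>
  <item>
    <title>First item</title>
    <link href="http://example.com/items/1/details"/>
  </item>
</feed>
"""

class ParseFeedTest(unittest.TestCase):
    def test_extracts_titles_and_links(self):
        items = parse_feed(SAMPLE_XML)
        self.assertEqual(items[0].title, "First item")
        self.assertEqual(items[0].links,
                         ["http://example.com/items/1/details"])

if __name__ == "__main__":
    unittest.main()
```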
I was reasoning about the acceptance tests, and that's what this question is about. Right now, for each acceptance test, we bring up an embedded HTTP server that serves test data specific to that test. We then start our application, crawl the test data, and verify the test's criteria. While this approach has the advantage of testing the whole system end to end, it also has the side effect of considerably increasing the build time every time we add a new acceptance test.
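To make that concrete, the per-test setup is roughly like this (sketched with Python's stdlib `http.server` standing in for our embedded server; the fixture path is made up):

```python
import threading
from functools import partial
from http.server import ThreadingHTTPServer, SimpleHTTPRequestHandler

def start_test_feed_server(data_dir):
    """Serve this test's fixture XML files from data_dir on a free port."""
    handler = partial(SimpleHTTPRequestHandler, directory=data_dir)
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)  # port 0: OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

# In a test:
# server = start_test_feed_server("tests/fixtures/feed_with_links")
# feed_url = "http://127.0.0.1:%d/feed.xml" % server.server_port
# ...start the application, point the crawler at feed_url, assert on the UI...
# server.shutdown()
```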
Is this the right approach for the acceptance tests? I was wondering: since the system that provides the feeds is an external one, wouldn't it be better to test the network communication layer and the crawlers at the unit level, and run the acceptance tests assuming the data has already been crawled?
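A minimal sketch of what I mean, with made-up names (`FeedRepository` and `render_feed_page` stand for whatever storage and UI layer the application really has): the acceptance test seeds the store with already-"crawled" records and exercises only the display logic, with no server and no network.

```python
import unittest
from myapp.repository import FeedRepository  # hypothetical
from myapp.ui import render_feed_page        # hypothetical

class FeedDisplayAcceptanceTest(unittest.TestCase):
    def test_displays_crawled_items(self):
        repo = FeedRepository()
        # Pretend the crawler already ran and stored this record.
        repo.save(title="First item", details="fetched via the item's link")
        page = render_feed_page(repo)
        self.assertIn("First item", page)
        self.assertIn("fetched via the item's link", page)
```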
I'd like to hear some thoughts from somebody else. :-)
Thanks!
Acceptance tests do tend to run slowly, require more setup, and be far more brittle than unit or integration tests. If you search the web for "test pyramid" you will find plenty of information on this. The general consensus is that you should have tests at the unit, integration, and acceptance levels, with most tests being unit tests and just a few acceptance tests covering the end-to-end scenarios. Development teams will often set up their CI servers to run long-running acceptance tests only during their nightly builds, so that they don't slow down the regular unit test runs.
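For example, if you use pytest, one common way to do that split is with a marker (the marker name here is just an example; register it in your pytest.ini under `markers =`):

```python
import pytest

@pytest.mark.acceptance  # tag the slow end-to-end tests
def test_crawl_and_display_end_to_end():
    ...  # bring up the embedded server, crawl, assert on the UI
```

The regular build then runs `pytest -m "not acceptance"` to skip them, and the nightly job runs `pytest -m acceptance` on its own.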