Consider the following piece of code:
data Slice = Slice
  { text :: String,
    color :: Color
  }
newtype Color = Color
  { string :: String
  }
mainList :: [FilePath] -> [FilePath] -> [String] -> [[Slice]]
mainList somethingA somethingB codedLines = ...
The Slice record represents the result of decoding a string that contains ANSI escape codes. Inside the mainList function I use another function, let's call it categorize, that parses the codedLines parameter (a list of strings containing ANSI escape codes) into [[Slice]].
How would I write unit tests for the mainList function using the QuickCheck unit testing framework?
I am fairly new to Haskell. If this were an OOP language, I would know what to do: pass the class that has the categorize method as a parameter to the constructor of the class that has the mainList method, and then mock it to my heart's content using some mocking library, or even write the mock by hand. But what is one to do in Haskell, where there is no notion of classes?
Maybe I can change the mainList function as follows:
mainList :: [FilePath] -> [FilePath] -> ([String] -> [[Slice]]) -> [[Slice]]
But then I would have to pass in a mock as the third argument. Since Haskell is not an OOP language, is there even a notion of a mock? Is this idiomatic Haskell? Or am I mistakenly projecting OOP principles onto a functional language?
Thanks in advance.
You have correctly noticed that if you want to use different implementations of functions that are called by a function that you're testing, you can simply turn them into parameters instead of calling them directly:
mainList :: TypeOfCategorize -> [FilePath] -> [FilePath] -> [String] -> [[Slice]]
mainList categorize ... =
  ... categorize ...
mainListForTesting = mainList categorizeMock
mainListForProd = mainList categorizeReal
This is in fact the exact same idea of which "dependency injection frameworks" are a massively more complicated version. If you would like it to be more complicated in Haskell too you can apply any of the standard techniques used for managing additional pieces of information you need to pass around and don't want to manage so explicitly; the fact that you're doing this for testing purposes isn't hugely different from many other contexts in which this concern arises. A few thoughts off the top of my head:
Use ReaderT to make a record of such functions accessible without needing to pass it explicitly all the time. (This only saves you much if you have a lot of functions calling each other with the same set of implementations, rather than every function needing a different set of functions that could be mocks or real.)
But fundamentally it all boils down to this: if you want to use different values on some invocations than on others, then those values are some form of parameter. And in Haskell, an implementation is a function is a value. So you use the same tools for getting different implementations into your mainList calls for testing purposes that you would for getting any other value into different mainList calls for any other purpose.
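As a rough illustration of the record-of-functions idea, here is a minimal sketch using Reader from the transformers package. The Deps record, its depCategorize field, plainCategorize, prodDeps and stubDeps are all hypothetical names, the body of mainList is invented because the question elides it, and the deriving clauses are added only so that values can be compared.

```haskell
import Control.Monad.Trans.Reader (Reader, asks, runReader)

data Slice = Slice
  { text :: String,
    color :: Color
  }
  deriving (Eq, Show)

newtype Color = Color
  { string :: String
  }
  deriving (Eq, Show)

-- Hypothetical record bundling the implementations that can be swapped.
data Deps = Deps
  { depCategorize :: [String] -> [[Slice]]
  }

-- Hypothetical body: the real mainList is elided in the question.
-- It looks up categorize in the environment instead of taking it as an argument.
mainList :: [FilePath] -> [FilePath] -> [String] -> Reader Deps [[Slice]]
mainList _ _ codedLines = do
  categorize <- asks depCategorize
  pure (categorize codedLines)

-- Toy stand-in for the real ANSI parser (an assumption, not the real one).
plainCategorize :: [String] -> [[Slice]]
plainCategorize = map (\line -> [Slice line (Color "default")])

-- Environment used in production.
prodDeps :: Deps
prodDeps = Deps {depCategorize = plainCategorize}

-- Environment with a stub that ignores its input.
stubDeps :: Deps
stubDeps = Deps {depCategorize = const []}
```

A caller picks the implementation by choosing the environment, for example runReader (mainList [] [] codedLines) prodDeps. For a single injected function this is more machinery than a plain extra parameter, which is the trade-off described above.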
You also ask whether "mocking" is idiomatic Haskell, and I would say the answer is "it depends, but largely no" (full disclaimer: I would also say that mocking is massively over-used in OOP testing too, so maybe my preferred ways of thinking about testing are just not to your taste).
mainList is pure (and is constrained to be so by its type, as long as we ignore unsafe* functions). That means that everything it calls is also pure. Almost all of the time, that means that there's no need to mock anything.
mainList will call categorize with some particular arguments, perhaps multiple times. To test that mainList does the correct thing, you need return values for those calls. Since those calls are pure, each of them has a single correct answer that is completely independent of anything else that is going on in your program; there's no state to set up, no side effects you might not want, no external environment required. If mainList uses anything other than the single correct return value for each call to categorize, then you're not testing mainList's real behaviour, you're testing hypothetical behaviour when categorize returns something other than what it should; it's not going to do that in production, so testing what it does in those conditions isn't very useful. One way to get mainList to use the correct values so you can test the behaviour of mainList is to very carefully consider the test case you're running and work out what the correct answer(s) should be, and pass in a mock categorize that returns the correct answer(s) for this test case. But a far easier way is to just pass in the real categorize and let mainList get the correct answer from the function you wrote to produce the correct answers.
This does mean you might get a test failure from mainList that only happens because categorize returned the wrong result. But the tests for mainList aren't what guards against that vulnerability; the tests for categorize do that job! If you get test failures for mainList and categorize at the same time, just start debugging categorize first (which is a good idea anyway, in case your changes there require flow-on changes in the things that call it, including mainList).
You're far more likely to get a misleading test result for mainList from a bad mock than you are from the real categorize. After all, categorize is the thing you are specifically engineering to return the correct result for categorize calls; no mock is ever going to be as likely to succeed at that job. And if the requirements for categorize change what the correct result is (either from external change or because something was misunderstood originally), you're much better off having the test for mainList pick up the changed categorize and detect whether that causes mainList to fail. If categorize is mocked then the tests for mainList will continue to pass against the old definition of what categorize is supposed to return, and then fail in production. The danger of mocks getting out of sync with real implementations that change is much worse than the danger of getting two test failures caused by one defect, in my opinion.
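To make the "just pass in the real categorize" option concrete, here is a minimal sketch of an example-based check. Both function bodies are assumptions: categorize is a toy stand-in for the real ANSI parser, the body of mainList is invented because the question elides it, and the deriving clauses are added only so that results can be compared.

```haskell
data Slice = Slice
  { text :: String,
    color :: Color
  }
  deriving (Eq, Show)

newtype Color = Color
  { string :: String
  }
  deriving (Eq, Show)

-- Toy stand-in for the real parser (assumption): one slice per non-empty
-- line, and no slices for an empty line.
categorize :: [String] -> [[Slice]]
categorize = map toSlices
  where
    toSlices "" = []
    toSlices line = [Slice line (Color "default")]

-- Hypothetical body: calls the real categorize directly and drops empty rows.
mainList :: [FilePath] -> [FilePath] -> [String] -> [[Slice]]
mainList _ _ codedLines = filter (not . null) (categorize codedLines)

-- Example-based check that exercises mainList together with categorize.
-- No mock is involved; a bug in either function makes this False.
mainListDropsEmptyRows :: Bool
mainListDropsEmptyRows =
  mainList [] [] ["red", "", "blue"]
    == [ [Slice "red" (Color "default")],
         [Slice "blue" (Color "default")]
       ]
```

If this check ever fails together with a test of categorize itself, the categorize failure is the one to investigate first, as argued above.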
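You might need to mock if the function being called interacts with the external environment, or if it takes too long to run in a test.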
In Haskell that's generally a pretty small minority of functions we want to test. Purity and the strong type system mean we're encouraged to have most of our code in pure functions, that can be tested independently of any external environment. Only a relatively small fraction of the code base directly deals with the external environment (rather than being parameterised on things that have been computed from it). And functions that take a long time are usually scaling with the size of some inputs, so they can be tested with smaller inputs that take less time.
So in Haskell mocking is not generally considered a core testing technique that should be applied to everything.
The mocking that we do do often isn't called or thought of as mocking. Using things like monad transformers or effect systems, we often write code that depends on an environment or has side effects in a way that is polymorphic in the specific implementation of the effects. The motivation for this is usually talked about more in terms of getting better type safety and composability, but as a side effect (pun intended) it also means we can swap in different implementations for testing. But this isn't used to mock out arbitrary pure functions called by pure functions under test.
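One way such effect-polymorphic code can look is sketched below, using only base. The MonadFiles class, the FakeFiles interpreter and countLines are hypothetical names made up for this illustration; real projects often reach for mtl-style classes or an effect-system library instead.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical capability class: the only effect the code under test needs.
class Monad m => MonadFiles m where
  readFileM :: FilePath -> m String

-- Production interpreter: really reads from disk.
instance MonadFiles IO where
  readFileM = readFile

-- Pure test interpreter: a computation over an in-memory "file system".
newtype FakeFiles a = FakeFiles {runFakeFiles :: [(FilePath, String)] -> a}

instance Functor FakeFiles where
  fmap f (FakeFiles g) = FakeFiles (f . g)

instance Applicative FakeFiles where
  pure x = FakeFiles (const x)
  FakeFiles f <*> FakeFiles g = FakeFiles (\fs -> f fs (g fs))

instance Monad FakeFiles where
  FakeFiles g >>= k = FakeFiles (\fs -> runFakeFiles (k (g fs)) fs)

-- In this sketch a missing file simply reads as the empty string.
instance MonadFiles FakeFiles where
  readFileM path = FakeFiles (\fs -> fromMaybe "" (lookup path fs))

-- Code under test: polymorphic in the effect implementation.
countLines :: MonadFiles m => [FilePath] -> m Int
countLines paths = do
  contents <- mapM readFileM paths
  pure (sum (map (length . lines) contents))
```

In a test, runFakeFiles (countLines ["a", "b"]) [("a", "x\ny\n"), ("b", "z")] runs without touching the disk, while the same countLines runs in IO in production.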
You mention "the QuickCheck unit testing framework". QuickCheck can indeed be regarded as a unit testing framework, but it usually isn't because it's a library for property-testing, which is a very specific kind of unit test (or isn't unit testing at all, depending on your definitions).
For property tests, you don't just write one specific test case and check that you get the expected output. Instead you write a property which is parameterised on some inputs and tests if the property holds; this needs to be a general test of the property regardless of the specific value of the input (so very different from a typical unit test where you check features of the output for specific known inputs). QuickCheck will then randomly generate lots of values trying to find one that falsifies your property; the test passes if it can't find one.
The standard trivial example (from the QuickCheck docs) is:
import Test.QuickCheck
-- Reversing a list twice results in the original input list
prop_reverse :: [Int] -> Bool
prop_reverse xs = reverse (reverse xs) == xs
We didn't test this by checking that reverse (reverse [1, 2, 3]) == [1, 2, 3]; we wrote a property that will be tested for any list of integers.
It's not my intent to talk a lot about property based testing, but I'm hoping that you might see why I bring it up in the context of your questions about mocks and testing with QuickCheck.
If you were using QuickCheck to test mainList, you would be writing general properties that should hold for any random input that QuickCheck generates. This does not go very well with the idea of mocking the functions that mainList calls. We don't have a specific test case with known inputs and therefore known calls that should be made to categorize, so we can't just make a mock categorize that ignores its inputs and returns the values we expect mainList to get. In an ideal world, the generated inputs for mainList will do a decent job of representing every possible input and code path that mainList could face in production, so we'd need a mock categorize that can return the correct result for any possible call. We already have a function with that specification: categorize. Writing categorize a second time just to test mainList is a bad idea.
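A property test of mainList that uses the real categorize might look roughly like the sketch below. The bodies of categorize and mainList are assumptions (a toy parser and an invented implementation, since the question elides both), and the two properties are only examples of the kind of statement one could check, not properties the real mainList is known to satisfy.

```haskell
import Test.QuickCheck (quickCheck)

data Slice = Slice
  { text :: String,
    color :: Color
  }
  deriving (Eq, Show)

newtype Color = Color
  { string :: String
  }
  deriving (Eq, Show)

-- Toy stand-in for the real parser (assumption).
categorize :: [String] -> [[Slice]]
categorize = map toSlices
  where
    toSlices "" = []
    toSlices line = [Slice line (Color "default")]

-- Hypothetical body: drops rows that contain no slices.
mainList :: [FilePath] -> [FilePath] -> [String] -> [[Slice]]
mainList _ _ codedLines = filter (not . null) (categorize codedLines)

-- Property: whatever the input, no row of the result is empty.
prop_noEmptyRows :: [FilePath] -> [FilePath] -> [String] -> Bool
prop_noEmptyRows pathsA pathsB codedLines =
  all (not . null) (mainList pathsA pathsB codedLines)

-- Property: the text of the slices, concatenated, is the concatenated input.
prop_textPreserved :: [FilePath] -> [FilePath] -> [String] -> Bool
prop_textPreserved pathsA pathsB codedLines =
  concatMap (concatMap text) (mainList pathsA pathsB codedLines)
    == concat codedLines

-- Let QuickCheck generate random inputs for both properties.
checkMainList :: IO ()
checkMainList = do
  quickCheck prop_noEmptyRows
  quickCheck prop_textPreserved
```

Because the inputs are random, a mock categorize would have to compute the right answer for every generated list of lines, which is exactly what the real function already does.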
(If categorize has a tricky/fragile implementation for performance reasons, writing a more straightforward and obviously-correct implementation would be an excellent way to test categorize itself; just write a property test that the obviously-correct-but-slow implementation always returns the same thing as the fragile-but-fast implementation, for all inputs! But there's still no point using the second implementation in property tests of things that call categorize.)
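The "compare against an obviously-correct implementation" idea can be sketched as follows. The two functions are hypothetical stand-ins, far simpler than a real categorize: both count escape characters in a string, one written in the most direct way and one as a strict fold.

```haskell
import Data.List (foldl')
import Test.QuickCheck (quickCheck)

-- Obviously-correct reference implementation (hypothetical stand-in).
countEscapesReference :: String -> Int
countEscapesReference = length . filter (== '\ESC')

-- "Optimised" implementation under test (hypothetical stand-in).
countEscapesFast :: String -> Int
countEscapesFast = foldl' step 0
  where
    step n c = if c == '\ESC' then n + 1 else n

-- Property: the two implementations agree on every input.
prop_fastMatchesReference :: String -> Bool
prop_fastMatchesReference s = countEscapesFast s == countEscapesReference s

-- Let QuickCheck search for a counterexample.
checkFastMatchesReference :: IO ()
checkFastMatchesReference = quickCheck prop_fastMatchesReference
```

The reference implementation is used only in the tests of the fast one; callers of the fast function are still tested against the fast function itself.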
The popularity of property-based testing in Haskell is another reason why mocking is not such a common technique. If you have properties that actually do a reasonable job characterising what correct behaviour of a function looks like, then you generally can't test those properties without real implementations