I'm reading Learn You a Haskell, and in the monad chapters, it seems to me that ()
is being treated as a sort of "null" for every type. When I check the type of ()
in GHCi, I get
>> :t ()
() :: ()
which is an extremely confusing statement. It seems that ()
is a type all to itself. I'm confused as to how it fits into the language, and how it seems to be able to stand for any type.
tl;dr ()
does not add a "null" value to every type, hell no; ()
is a "dull" value in a type of its own: ()
.
Let me step back from the question a moment and address a common source of confusion. A key thing to absorb when learning Haskell is the distinction between its expression language and its type language. You're probably aware that the two are kept separate. But that allows the same symbol to be used in both, and that is what is going on here. There are simple textual cues to tell you which language you're looking at. You don't need to parse the whole language to detect these cues.
The top level of a Haskell module lives, by default, in the expression language. You define functions by writing equations between expressions. But when you see foo :: bar in the expression language, it means that foo is an expression and bar is its type. So when you read () :: ()
, you're seeing a statement which relates the ()
in the expression language with the ()
in the type language. The two ()
symbols mean different things, because they are not in the same language. This replication often causes confusion for beginners, until the expression/type language separation installs itself in their subconscious, at which point it becomes helpfully mnemonic.
The keyword data
introduces a new datatype declaration, involving a careful mixture of the expression and type languages, as it says first what the new type is, and secondly what its values are.
data TyCon tyvar ... tyvar = ValCon1 type ... type | ... | ValConn type ... type
In such a declaration, type constructor TyCon is being added to the type language and the ValCon value constructors are being added to the expression language (and its pattern sublanguage). In a data
declaration, the things which stand in argument places for the ValCons tell you the types given to the arguments when that ValCon is used in expressions. For example,
data Tree a = Leaf | Node (Tree a) a (Tree a)
declares a type constructor Tree
for binary tree types storing a
elements at nodes, whose values are given by value constructors Leaf
and Node
. I like to colour type constructors (Tree
) blue and value constructors (Leaf
, Node
) red. There should be no blue in expressions and (unless you're using advanced features) no red in types. The built-in type Bool
could be declared,
data Bool = True | False
adding blue Bool
to the type language, and red True
and False
to the expression language. Sadly, my markdown-fu is inadequate to the task of adding the colours to this post, so you'll just have to learn to add the colours in your head.
The "unit" type uses ()
as a special symbol, but it works as if declared
data () = () -- the left () is blue; the right () is red
meaning that a notionally blue ()
is a type constructor in the type language, but that a notionally red ()
is a value constructor in the expression language, and indeed () :: ()
. [ It is not the only example of such a pun. The types of larger tuples follow the same pattern: pair syntax is as if given by
data (a, b) = (a, b)
adding (,)
to both type and expression languages. But I digress.]
So the type ()
, often pronounced "Unit", is a type containing one value worth speaking of: that value is also written ()
but in the expression language, and is sometimes pronounced "void". A type with only one value is not very interesting. A value of type ()
contributes zero bits of information: you already know what it must be. So, while there is nothing special about type ()
to indicate side effects, it often shows up as the value component in a monadic type. Monadic operations tend to have types which look like
val-in-type-1 -> ... -> val-in-type-n -> effect-monad val-out-type
where the return type is a type application: the (type) function tells you which effects are possible and the (type) argument tells you what sort of value is produced by the operation. For example
put :: s -> State s ()
which is read (because application associates to the left ["as we all did in the sixties", Roger Hindley]) as
put :: s -> (State s) ()
has one value input type s
, the effect-monad State s
, and the value output type ()
. When you see ()
as a value output type, that just means "this operation is used only for its effect; the value delivered is uninteresting". Similarly
putStr :: String -> IO ()
delivers a string to stdout
but does not return anything exciting.
The ()
type is also useful as an element type for container-like structures, where it indicates that the data consists just of a shape, with no interesting payload. For example, if Tree
is declared as above, then Tree ()
is the type of binary tree shapes, storing nothing of interest at nodes. Similarly [()]
is the type of lists of dull elements, and if there is nothing of interest in a list's elements, then the only information it contributes is its length.
To sum up, ()
is a type. Its one value, ()
, happens to have the same name, but that's ok because the type and expression languages are separate. It's useful to have a type representing "no information" because, in context (e.g., of a monad or a container), it tells you that only the context is interesting.