What is the difference when I write this?
data Book = Book Int Int
versus
newtype Book = Book (Int, Int) -- "Book Int Int" is syntactically invalid
Great question!
There are several key differences.
Representation
newtype guarantees that your data will have exactly the same representation at runtime as the type that you wrap.
data declares a brand new data structure at runtime.
So the key point here is that the constructor for the newtype is guaranteed to be erased at compile time.
Examples:
data Book = Book Int Int
newtype Book = Book (Int, Int)
Note how it has exactly the same representation as a (Int, Int), since the Book constructor is erased.
data Book = Book (Int, Int)
Has an additional Book constructor not present in the newtype.
data Book = Book {-# UNPACK #-}!Int {-# UNPACK #-}!Int
No pointers! The two Int fields are unboxed word-sized fields in the Book constructor.
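One way to see the shared representation from inside the language is GHC's Data.Coerce. The sketch below assumes GHC (coerce is not part of Haskell 2010), and the helper names toPairs and fromPairs are made up for illustration. It converts whole lists between Book and (Int, Int) without traversing them, which is only sound because the newtype constructor does not exist at runtime.

```haskell
import Data.Coerce (coerce)

newtype Book = Book (Int, Int) deriving (Eq, Show)

-- Hypothetical helper: reinterpret a list of Books as a list of pairs.
-- No per-element unwrapping happens; coerce is a zero-cost cast.
toPairs :: [Book] -> [(Int, Int)]
toPairs = coerce

-- Hypothetical helper: the same cast in the other direction.
fromPairs :: [(Int, Int)] -> [Book]
fromPairs = coerce
```

The same definitions would be rejected if Book were declared with data, since GHC only provides the Coercible relationship between a newtype and the type it wraps.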
Algebraic data types
Because of this need to erase the constructor, a newtype must have exactly one constructor, and that constructor must have exactly one field. There's no notion of "algebraic" newtypes. That is, you can't write a newtype equivalent of, say,
data Maybe a = Nothing
| Just a
since it has more than one constructor. Nor can you write
newtype Book = Book Int Int
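As a rough sketch of the two usual ways around that restriction (the names BookT, BookD and toData are invented here to keep both declarations in one module): either keep the newtype and wrap a single tuple field, or switch to data and accept the real constructor.

```haskell
-- Rejected by the compiler: a newtype constructor takes exactly one field.
-- newtype Book = Book Int Int

-- Option 1: keep the newtype, wrap one tuple-typed field.
newtype BookT = BookT (Int, Int) deriving (Eq, Show)

-- Option 2: use data, which allows several fields (and several constructors).
data BookD = BookD Int Int deriving (Eq, Show)

-- Hypothetical conversion between the two shapes.
toData :: BookT -> BookD
toData (BookT (a, b)) = BookD a b
```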
Strictness
The fact that the constructor is erased leads to some very subtle differences in strictness between data and newtype. In particular, data introduces a type that is "lifted", meaning, essentially, that it has an additional way to evaluate to a bottom value. Since there's no additional constructor at runtime with newtype, this property doesn't hold.
With data Book = Book (Int, Int), the extra pointer from the Book constructor to the (,) constructor allows us to put a bottom value in.
As a result, newtype and data have slightly different strictness properties, as explained in the Haskell wiki article.
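A small sketch of that difference, using pattern matching on undefined. The type names BookD and BookN and the helpers pagesD, pagesN and diverges are made up for this illustration. Matching on a data constructor has to evaluate the scrutinee to find the constructor, while matching on a newtype constructor evaluates nothing, because there is no constructor to find.

```haskell
import Control.Exception (SomeException, evaluate, handle)

data    BookD = BookD (Int, Int)   -- lifted: bottom and (BookD bottom) differ
newtype BookN = BookN (Int, Int)   -- bottom and (BookN bottom) are the same

-- The match forces the argument to weak head normal form.
pagesD :: BookD -> Int
pagesD (BookD _) = 0

-- The match is a no-op at runtime, so the argument is never forced.
pagesN :: BookN -> Int
pagesN (BookN _) = 0

-- Hypothetical helper: True if forcing the value to WHNF throws.
diverges :: a -> IO Bool
diverges x = handle onErr (evaluate x >> pure False)
  where
    onErr :: SomeException -> IO Bool
    onErr _ = pure True
```

Under these definitions, diverges (pagesD undefined) yields True, while diverges (pagesN undefined) and diverges (pagesD (BookD undefined)) both yield False.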
Unboxing
It doesn't make sense to unbox the components of a newtype, since there's no constructor. With data, however, it is perfectly reasonable to write:
data T = T {-# UNPACK #-}!Int
yielding a runtime object with a T constructor and an Int# component. With newtype, you just get a bare Int.
References: