Algebraic data types make it easy to allow one item from a set to be selected: just use an appropriate sum type.
I'm curious how one would implement the option of selecting strictly n items or fewer from a set. I can see how it might be possible to implement this with Data.Set, but I wonder if there's a pattern or an algebraic structure that would be more robust.
Say a burger can have three toppings from the set Pickle | Onion | Lettuce | Tomato etc. It's reasonable to want to display all the options in a UI, so if we use Data.Set, right away I notice that Data.Set.all is missing, so there's no easy way to print all the values that can be selected. But Data.Set.powerSet is available, so I'm thinking maybe I should be getting the user to choose an element of the subset of the power set with cardinality <= 3.
Using an element of the power set to represent a selection from the base set seems like a good idea. It doesn't look much like any UI I can think of, though. The power set functor is a monad, though I'm not sure if this is relevant (See discussion about power set as a functor here Sets, Functors and Eq confusion).
Maybe someone who has solved a similar problem before will have advice on how to best make this work. I'm really looking for conceptual insights into what an "n choose k" type looks like, if that makes sense.
Here's an answer, of sorts.
Here are a few extensions and imports needed for the rest of the code below.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
import GHC.TypeNats
import Data.Proxy
import Data.Set (Set)
import qualified Data.Set as Set
As you're probably aware, the nice thing about representing a single item as an ADT:
data Topping = Pickle | Onion | Lettuce | Tomato deriving (Show, Eq, Ord, Enum, Bounded)
is that the definition of the type makes it impossible to represent anything other than exactly one choice of Topping
, provided we ignore the possibility of having an undefined
value. It's equally easy to represent "up to one choice of Topping
" in this foolproof way, using a Maybe
type:
type UpToOneTopping = Maybe Topping
We can also represent zero or more (including a possibly infinite number of) toppings, with possible duplicates, using a list:
type ManyToppings = [Topping]
However, it's difficult to represent a set of toppings without duplicates this same way. We can use the Set Topping
type, but we're no longer relying on the type system to make sets with duplicates unrepresentable. Instead, we're relying on the implementation of the Set
type in the containers
package, and the restricted interface it provides, to guarantee that the set won't contain any duplicates. If we dug into the guts of the Set
implementation, we'd probably discover that the internal representation can allow sets with duplicates, and we're relying on the correctness of the code written by the Set
author to make sure this can't happen.
In contrast, the Topping
type above makes a stronger guarantee than this, making anything other than a single type unrepresentable by construction. (Well, I guess you could say we're relying on the correctness of the compiler, but it's common to think of this kind of guarantee as "stronger" than one that relies on the correctness of the Set
implementation.)
In a similar vein to the Set
type, it would be possible to use an UpToThree Topping
type to represent a set of zero to three distinct toppings, but the guarantee would be provided by the implementation and its sanctioned interface, not by the data representation itself. That is, we could write:
newtype UpToThree a = UpToThree (Set a)
upToThree :: (Ord a) => [a] -> UpToThree a
upToThree xs | length xs' > 3 = error "no more than three allowed"
| otherwise = UpToThree xs'
where xs' = Set.fromList xs
If we only create UpToThree
values via this upToThree
"smart constructor" (either through programmer discipline or using Haskell's package system to limit the interface exported to users of the data type), then we can provide a guarantee of sorts that any UpToThree
value will represent no more than three distinct a
values.
In practical Haskell code, this is probably the most reasonable approach. One limitation is that it requires a separate new type for each up-to-k
number. We could make the number a field in the data type:
data UpToK a = UpToK Int (Set a)
upToK :: (Ord a) => Int -> [a] -> UpToK a
upToK k xs | length xs' > k = error "too many"
| otherwise = UpToK k xs'
where xs' = Set.fromList xs
but this UpToK
type is much less useful than it might first appear. The problem is that the smart constructor ensures that a particular UpToK
represents a number of distinct values of type a
that's no more than than the count given by the integer field k
encoded in the value, so it ensures consistency between the two fields in the value, but the type system doesn't enforce anything about k
, so if we have a function that requires no more than three toppings using the original UpToThree
:
data Burger = Burger
placeOnBurger :: UpToThree Topping -> Burger
placeOnBurger u = undefined
we can ensure through a type signature using UpToThree
that we receive no more than three toppings, provided the implementation of the UpToThree
type is sound, but if we try to write an UpToK
version, we end up having to inspect the field anyway to ensure our precondition is met:
placeOnBurger' :: UpToK Topping -> Burger
placeOnBurger' (UpToK k _) | k > 3 = error "no one can fit so many toppings!"
placeOnBurger' u = undefined
which means we could have just checked the length of a Set
directly and didn't need the UpToK
type at all:
placeOnBurger'' :: Set Topping -> Burger
placeOnBurger'' s | length s > 3 = error "too many toppings for a burger"
placeOnBurger'' s = undefined
Fortunately, there is a way to use the DataKinds
extension, to index a collection of types by a non-negative number. With a few other extensions, this allows us to write:
newtype UpTo (k :: Nat) a = UpTo (Set a)
upTo :: forall k a. (KnownNat k, Ord a) => [a] -> UpTo k a
upTo xs | length xs' > fromIntegral (natVal (Proxy @k)) = error "too many"
| otherwise = UpTo xs'
where xs' = Set.fromList xs
This UpTo
type is much more useful than UpToK
. It still relies on a protected upTo
smart constructor to ensure that only valid UpTo
values are produced, but the upper limit is now part of the type, so we can write:
placeOnBurger''' :: UpTo 3 Topping -> Burger
placeOnBurger''' s = undefined
and we have a strong, type system guarantee that this version of placeOnBurger'''
will only be called with an UpTo 3 Topping
value, which implies a (weaker) smart constructor guarantee that the number of toppings passed will be no more than three.
If you want the stronger guarantee of encoding the topping limit in the type so that it's enforced by the type system itself, the same way Maybe Topping
enforces a zero-or-one topping rule, then it's possible, but it can quickly get unwieldy.
One method is to use a GADT in combination with a Peano natural that's automatically promoted to the type level by the DataKinds
extension:
data Peano = Z | S Peano
It's also helpful to have a type-level function for converting from plain type-level naturals to these Peano type-level naturals:
type family P n where
P 0 = Z
P n = S (P (n-1))
Then we can easily create a GADT representing a list of "no more than n
" elements:
-- a list of no more than `n` elements of type `a`
data LimitedList (n :: Peano) a where
Empty :: LimitedList n a
Cons :: a -> LimitedList n a -> LimitedList (S n) a
infixr 5 `Cons`
deriving instance (Show a) => Show (LimitedList n a)
It takes a little study to understand why this representation works, but observe what happens in GHCi when we try to create a 3-element list whose type-level limits are 3, 5, and 2:
λ> 1 `Cons` 2 `Cons` 3 `Cons` Empty :: LimitedList (P 3) Int
...works...
λ> 1 `Cons` 2 `Cons` 3 `Cons` Empty :: LimitedList (P 5) Int
...works...
λ> 1 `Cons` 2 `Cons` 3 `Cons` Empty :: LimitedList (P 2) Int
...generates a type error...
The downside of this GADT is that it allows duplicates and doesn't provide a unique representation, so if we had multiple representations of pickles and onions as a no-more-than-3 list of toppings:
top1 = Pickle `Cons` Onion `Cons` Empty :: LimitedList (P 3) Topping
top2 = Onion `Cons` Pickle `Cons` Empty :: LimitedList (P 3) Topping
top3 = Onion `Cons` Pickle `Cons` Onion `Cons` Empty :: LimitedList (P 3) Topping
but wanted them to be treated uniformly (compare as equal, generate the same list of toppings in the same order when we run toList
on them, etc.), we'd end up having to write some more code to provide these (weak) guarantees.
To answer another part of your question, it's possible to write a function to generate all valid LimitedList
s of a certain size for any enumerable type, though the output will include these duplicates and alternative representations. Writing such a function requires the use of a so-called "singleton" representation for the Peano naturals:
data SPeano n where
SZ :: SPeano Z
SS :: SPeano n -> SPeano (S n)
allLimitedLists :: (Enum a, Bounded a) => SPeano n -> [LimitedList n a]
allLimitedLists SZ = [Empty]
allLimitedLists (SS n) = [x `Cons` xs | x <- [minBound..maxBound]
, xs <- rest]
++ map levelUp rest
where rest = allLimitedLists n
levelUp :: LimitedList n a -> LimitedList (S n) a
levelUp Empty = Empty
levelUp (x `Cons` xs) = x `Cons` levelUp xs
all3Toppings :: [LimitedList (P 3) Topping]
all3Toppings = allLimitedLists (SS (SS (SS SZ)))
main = print all3Toppings
The rather strange levelUp
function here is an expensive identity function, at least at the term level. (At the type level, it's not the identity, which is kind of the point.) Anyway, it's an unfortunate artifact of the choice of representation of LimitedList
. Ironically, it can be safely replaced with levelUp = unsafeCoerce
. At least, I think it can. It seems to work, at any rate.
Before I give one last approach, here's all the code so far in case you want to play with it:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
import GHC.TypeNats
import Data.Proxy
import Data.Set (Set)
import qualified Data.Set as Set
data Topping = Pickle | Onion | Lettuce | Tomato deriving (Show, Eq, Ord, Enum, Bounded)
type UpToOneTopping = Maybe Topping
type ManyToppings = [Topping]
newtype UpToThree a = UpToThree (Set a)
upToThree :: (Ord a) => [a] -> UpToThree a
upToThree xs | length xs' > 3 = error "no more than three allowed"
| otherwise = UpToThree xs'
where xs' = Set.fromList xs
data UpToK a = UpToK Int (Set a)
upToK :: (Ord a) => Int -> [a] -> UpToK a
upToK k xs | length xs' > k = error "too many"
| otherwise = UpToK k xs'
where xs' = Set.fromList xs
data Burger = Burger
placeOnBurger :: UpToThree Topping -> Burger
placeOnBurger u = undefined
placeOnBurger' :: UpToK Topping -> Burger
placeOnBurger' (UpToK k _) | k > 3 = error "no one can fit so many toppings!"
placeOnBurger' u = undefined
placeOnBurger'' :: Set Topping -> Burger
placeOnBurger'' s | length s > 3 = error "too many toppings for a burger"
placeOnBurger'' s = undefined
newtype UpTo (k :: Nat) a = UpTo (Set a)
upTo :: forall k a. (KnownNat k, Ord a) => [a] -> UpTo k a
upTo xs | length xs' > fromIntegral (natVal (Proxy @k)) = error "too many"
| otherwise = UpTo xs'
where xs' = Set.fromList xs
placeOnBurger''' :: UpTo 3 Topping -> Burger
placeOnBurger''' s = undefined
data Peano = Z | S Peano
type family P n where
P 0 = Z
P n = S (P (n-1))
-- a list of no more than `n` elements of type `a`
data LimitedList (n :: Peano) a where
Empty :: LimitedList n a
Cons :: a -> LimitedList n a -> LimitedList (S n) a
infixr 5 `Cons`
deriving instance (Show a) => Show (LimitedList n a)
top1 = Pickle `Cons` Onion `Cons` Empty :: LimitedList (P 3) Topping
top2 = Onion `Cons` Pickle `Cons` Empty :: LimitedList (P 3) Topping
top3 = Onion `Cons` Pickle `Cons` Onion `Cons` Empty :: LimitedList (P 3) Topping
data SPeano n where
SZ :: SPeano Z
SS :: SPeano n -> SPeano (S n)
allLimitedLists :: (Enum a, Bounded a) => SPeano n -> [LimitedList n a]
allLimitedLists SZ = [Empty]
allLimitedLists (SS n) = [x `Cons` xs | x <- [minBound..maxBound]
, xs <- rest]
++ map levelUp rest
where rest = allLimitedLists n
levelUp :: LimitedList n a -> LimitedList (S n) a
levelUp Empty = Empty
levelUp (x `Cons` xs) = x `Cons` levelUp xs
all3Toppings :: [LimitedList (P 3) Topping]
all3Toppings = allLimitedLists (SS (SS (SS SZ)))
main = print all3Toppings
Finally, it is possible to enforce a unique, duplicate-free representation at the type level, but it adds another layer of complexity and makes the type much more difficult to use. Here's one way to do it by encoding the minimum element in the type and using a constraint to permit only smaller elements to be prepended. I've used the singletons
package to take care of some of the boilerplate, so we'll require quite a few extensions plus some imports:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE EmptyCase #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE UndecidableInstances #-}
import Data.Singletons
import Data.Singletons.Prelude
import Data.Singletons.TH
We'll need singletons for both our Peano naturals and our Topping
type, as we'll see below.
$(singletons [d|
data Peano = Z | S Peano
data Topping = Pickle | Onion | Lettuce | Tomato deriving (Show, Eq, Ord, Enum, Bounded)
|])
Here, we use promote
to define type functions P
and Before
. These could have been defined using type families, but promoting plain Haskell functions gives nicer syntax.
$(promote [d|
p 0 = Z
p n = S (p (n-1))
before _ Nothing = True
before x (Just y) = x < y
|])
Our limited set will be represented by the following GADT, which represents a (strictly) ascending list of elements of type a
whose head (smallest) element is h
, and which contains no more than n
elements. We use the Before
type function here to permit an x
value to be prepended to a list only if it's strictly less than the list's head element.
data MinLimitedSet (n :: Peano) (h :: Maybe a) where
Empty' :: MinLimitedSet n Nothing
Cons' :: (Before x min ~ True) => Sing x -> MinLimitedSet n min -> MinLimitedSet (S n) (Just x)
infixr 5 `Cons'`
Using this MinLimitedSet
type in code would require specifying the minimum element of the list in the type signatures. That's not very convenient, so we provide an existential type LimitedSet
that represents a MinLimitedSet
with an unspecified minimum element:
data LimitedSet (n :: Peano) a where
LimitedSet :: MinLimitedSet n (h :: Maybe a) -> LimitedSet n a
With these GADTs in place, we can define an example value:
top1' :: LimitedSet (P 3) Topping
top1' = LimitedSet $ sing @Pickle `Cons'` sing @Onion `Cons'` Empty'
Note that this definition is valid only because Pickle
and Onion
are in strictly ascending order (according to the Ord
instance for Topping
) and there are no more than three elements being combined. If you try to create a list of too many toppings, or a list with duplicates or out of strictly ascending order, you'll get a compile-time type error. To use such lists in programs, you'll typically want to convert them to plain term-level lists, and that's what the toList
function does.
toList :: forall n a. (SingKind a) => LimitedSet n a -> [Demote a]
toList (LimitedSet asc) = go asc
where go :: forall n lst. MinLimitedSet n lst -> [Demote a]
go (Empty') = []
go (x `Cons'` xs) = fromSing x : go xs
main = print $ toList top1'
The full code:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE EmptyCase #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE UndecidableInstances #-}
import Data.Singletons
import Data.Singletons.Prelude
import Data.Singletons.TH
$(singletons [d|
data Peano = Z | S Peano
data Topping = Pickle | Onion | Lettuce | Tomato deriving (Show, Eq, Ord, Enum, Bounded)
|])
$(promote [d|
p 0 = Z
p n = S (p (n-1))
before _ Nothing = True
before x (Just y) = x < y
|])
data MinLimitedSet (n :: Peano) (h :: Maybe a) where
Empty' :: MinLimitedSet n Nothing
Cons' :: (Before x min ~ True) => Sing x -> MinLimitedSet n min -> MinLimitedSet (S n) (Just x)
infixr 5 `Cons'`
data LimitedSet (n :: Peano) a where
LimitedSet :: MinLimitedSet n (h :: Maybe a) -> LimitedSet n a
top1' :: LimitedSet (P 3) Topping
top1' = LimitedSet $ sing @Pickle `Cons'` sing @Onion `Cons'` Empty'
toList :: forall n a. (SingKind a) => LimitedSet n a -> [Demote a]
toList (LimitedSet asc) = go asc
where go :: forall n lst. MinLimitedSet n lst -> [Demote a]
go (Empty') = []
go (x `Cons'` xs) = fromSing x : go xs
main = print $ toList top1'