haskellgenericstypestypeclasstype-level-computation

Haskell GHC.Generics exercise: counting all `f :: Type -> Type`s in a type


I'm learning about structural polymorphism using GHC.Generics (thank you Sandy Maguire for your Thinking in Types book!). As an exercise, I thought I'd write a function that counts the number of fs (of one's choosing) in a type definition [so, just to be clear, I'm not saying this is something I really want to do, it's just an exercise]:

{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind
import GHC.Generics

data Example f = Example { a :: f Int, b :: f String, c :: Bool } deriving (Generic)

countFs :: forall (f :: Type -> Type) (a :: Type). (Generic a, GCount f (Rep a)) => Int
countFs = gcount @f @(Rep a)

class GCount (f :: Type -> Type) (a :: Type -> Type) where
  gcount :: Int

instance GCount f U1 where
  gcount = 0

instance GCount f V1 where
  gcount = 0

instance {-# OVERLAPPING #-} GCount f (K1 i (f x)) where
  gcount = 1 -- + recursive call?

instance GCount f (K1 i c) where
  gcount = 0 -- + recursive call?

instance (GCount f a) => GCount f (M1 i c a) where
  gcount = gcount @f @a

instance (GCount f a, GCount f b) => GCount f (a :+: b) where
  gcount = gcount @f @a + gcount @f @b

instance (GCount f a, GCount f b) => GCount f (a :*: b) where
  gcount = gcount @f @a + gcount @f @b

This does kinda work, e.g. countFs @Maybe (Example Maybe) is 2, whereas countFs @Maybe (Example []) is 0. Neat. However, this is only counting top-level fs in the Example structure; if I were to add a new subfield with its own fs, they would go uncounted:

newtype Other = Other (Maybe Int) deriving (Generic)
data Example f = Example
  { a :: f Int
  , b :: f String
  , c :: Bool
  , d :: Other
  } deriving (Generic)

Here countFs @Maybe @(Example Maybe) is still 2, not 3. (Maybe that actually kinda makes sense, but bear with the exercise.)

My question: I can't seem to figure out how to recursively count subfields! Conceptually, I think I want to be able to say something like "if the K1 i (f x) thing has an x that is itself Generic, then use gcount on its Rep recursively; otherwise give up and add nothing", etc., but I can't figure out how to express that with typeclass instances. Is there a nicer way to think about this?


Solution

  • Since this counting operation is a type-level operation rather than a term-level operation, it probably makes more sense to implement it as a type function (i.e., a type family) operating on the representations Rep a. This gives you a little more flexibility in selecting the right "branch" without overlapping instances or other weirdness.

    A pair of type families, operating on types (Count) and their Reps (GCount) is a good start:

    type Count :: (Type -> Type) -> Type -> Natural
    type family Count f a where
      Count f (f a) = 1 + GCount f (Rep (f a))
      Count f a = GCount f (Rep a)
    
    type GCount :: (Type -> Type) -> (Type -> Type) -> Natural
    type family GCount f r where
      GCount f U1 = 0
      GCount f V1 = 0
      GCount f (K1 i g) = Count f g
      GCount f (M1 i c a) = GCount f a
      GCount f (a :+: b) = GCount f a + GCount f b
      GCount f (a :*: b) = GCount f a + GCount f b
    

    Unfortunately, this will fail (become "stuck") even for a trivial example:

    λ> :kind! Count Maybe (Maybe Int)
    Count Maybe (Maybe Int) :: Natural
    = 1 + GCount Maybe (Rep Int)
    

    because Int doesn't have a Generic instance by default. But, if you want to write functions (whether term level or type level) that recurse on K1s, you can't really avoid either ensuring that all field types have a Rep (i.e., have a Generic instance) or else handle special cases in the Count type family.

    For the former approach, you can introduce orphan instances for the primitive types you use:

    deriving instance Generic Int
    deriving instance Generic Char
    

    This will produce Reps in terms of URecs (for the unlifted contents):

    λ> :kind! Rep Int
    Rep Int :: * -> *
    = ...
        D
        ...
           C
           ...
              S
              ('MetaSel ...)
              (URec Int)))
    
    

    so you'll need to handle URecs as well.

    Here's an example program illustrating the approach. The count function in this example is just a simple runtime interface to the compile-time, type-level calculation, so we can display the results in a main program:

    {-# LANGUAGE AllowAmbiguousTypes #-}
    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE ScopedTypeVariables #-}
    {-# LANGUAGE TypeFamilies #-}
    {-# LANGUAGE UndecidableInstances #-}
    
    import Data.Proxy
    import Data.Kind
    import GHC.TypeNats
    import GHC.Generics
    
    type Count :: (Type -> Type) -> Type -> Natural
    type family Count f a where
      Count f (f a) = 1 + GCount f (Rep (f a))
      Count f a = GCount f (Rep a)
    
    type GCount :: (Type -> Type) -> (Type -> Type) -> Natural
    type family GCount f r where
      GCount f U1 = 0
      GCount f V1 = 0
      GCount f (URec x) = 0                             -- **ADDED**
      GCount f (K1 i g) = Count f g
      GCount f (M1 i c a) = GCount f a
      GCount f (a :+: b) = GCount f a + GCount f b
      GCount f (a :*: b) = GCount f a + GCount f b
    
    deriving instance Generic Int                       -- **ADDED**
    deriving instance Generic Char                      -- **ADDED**
    
    data Example1 f = Example1 { a :: f Int, b :: f Char, c :: Bool } deriving (Generic)
    newtype Other = Other (Maybe Int) deriving (Generic)
    data Example2 f = Example2
      { a' :: f Int
      , b' :: f Char
      , c' :: Bool
      , d' :: Other
      } deriving (Generic)
    
    count :: forall (f :: Type -> Type) (a :: Type). (KnownNat (Count f a)) => Natural
    count = natVal (Proxy @(Count f a))
    
    main = do
      print $ count @Maybe @(Maybe Int)
      print $ count @Maybe @(Example1 Maybe)
      print $ count @Maybe @(Example2 Maybe)
    

    Do note that I replaced String with Char in your examples. You can't run this counting function on Strings because type String = [Char], and the list type is recursive:

    data [a] = [] | (:) a [a]
    

    Your implementation tries to count f on both sides of the :+: and across all fields in the :*:, so you are basically evaluating the equivalent of:

    countMaybes list_of_a = countMaybes empty_list + (countMaybes a_alone + countMaybes list_of_a)
    

    which is an infinite loop. If you don't want an infinite loop, you need to rethink what you're trying to count here.

    Anyway, getting back to the problem at hand, that's how to get it working with orphan instances. Alternatively, you could handle primitive types in Count instead, as in the following example:

    {-# LANGUAGE AllowAmbiguousTypes #-}
    {-# LANGUAGE DataKinds #-}
    {-# LANGUAGE ScopedTypeVariables #-}
    {-# LANGUAGE TypeFamilies #-}
    {-# LANGUAGE UndecidableInstances #-}
    
    import Data.Proxy
    import Data.Kind
    import GHC.TypeNats
    import GHC.Generics
    
    type Count :: (Type -> Type) -> Type -> Natural
    type family Count f a where
      Count f (f a) = 1 + GCount f (Rep (f a))
      Count f Int = 0                                          -- **ADDED**
      Count f Char = 0                                         -- **ADDED**
      Count f a = GCount f (Rep a)
    
    type GCount :: (Type -> Type) -> (Type -> Type) -> Natural
    type family GCount f r where
      GCount f U1 = 0
      GCount f V1 = 0
      GCount f (K1 i g) = Count f g
      GCount f (M1 i c a) = GCount f a
      GCount f (a :+: b) = GCount f a + GCount f b
      GCount f (a :*: b) = GCount f a + GCount f b
    
    data Example1 f = Example1 { a :: f Int, b :: f Char, c :: Bool } deriving (Generic)
    newtype Other = Other (Maybe Int) deriving (Generic)
    data Example2 f = Example2
      { a' :: f Int
      , b' :: f Char
      , c' :: Bool
      , d' :: Other
      } deriving (Generic)
    
    count :: forall (f :: Type -> Type) (a :: Type). (KnownNat (Count f a)) => Natural
    count = natVal (Proxy @(Count f a))
    
    main = do
      print $ count @Maybe @(Maybe Int)
      print $ count @Maybe @(Example1 Maybe)
      print $ count @Maybe @(Example2 Maybe)
    

    You'll need a specific case in Count for every primitive that doesn't have a Generic instance. If you miss one, you'll get a "stuck" result for evaluation of the type level function, as in:

    λ> :kind! Count Maybe (Maybe Double)
    Count Maybe (Maybe Double) :: Natural
    = 1 + GCount Maybe (Rep Double)
    

    or a missing KnownNat instance for the term level count call:

    λ> count @Maybe @(Maybe Double)
    
    <interactive>:2:1: error:
        • No instance for (KnownNat (1 + GCount Maybe (Rep Double)))
            arising from a use of ‘count’
        • In the expression: count @Maybe @(Maybe Double)
          In an equation for ‘it’: it = count @Maybe @(Maybe Double)
    

    If you want to stick with your instance-based approach, you can do it with a pair of type classes (Count and GCount, analogous to the type families above). There'll be an overlapping instance in Count, and You'll need to use eqT from Data.Typeable to identify the Maybe types in the Count type class.

    Here's a full example. Note the similarities to the second type-family based example above. You still need to define specific instances for Int and Char and any other primitives you use, and trying to apply count to a list type will lead to an infinite loop.

    Also, this is generally inferior to the type family solution because it unnecessarily introduces a bunch of runtime computation (via type class dictionaries) for what is fundamentally a compile time calculation.

    {-# LANGUAGE AllowAmbiguousTypes #-}
    {-# LANGUAGE GADTs #-}
    {-# LANGUAGE UndecidableInstances #-}
    
    import Data.Kind
    import GHC.Generics
    import Data.Typeable
    
    type Count :: (Type -> Type) -> Type -> Constraint
    class Count f a where count :: Int
    
    instance forall (f :: Type -> Type) (g :: Type -> Type) (x :: Type).
      (Typeable f, Typeable g, GCount f (Rep (g x))) => Count f (g x) where
      count = gcount @f @(Rep (g x)) + case eqT @f @g of
        Just Refl -> 1
        Nothing   -> 0
    instance Count f Int where
      count = 0
    instance Count f Char where
      count = 0
    instance {-# OVERLAPPABLE #-} (GCount f (Rep y)) => Count f y where
      count = gcount @f @(Rep y)
    
    type GCount :: (Type -> Type) -> (Type -> Type) -> Constraint
    class GCount f a where gcount :: Int
    
    instance GCount f U1 where
      gcount = 0
    
    instance GCount f V1 where
      gcount = 0
    
    instance (Count f c) => GCount f (K1 i c) where
      gcount = count @f @c
    
    instance (GCount f a) => GCount f (M1 i c a) where
      gcount = gcount @f @a
    
    instance (GCount f a, GCount f b) => GCount f (a :+: b) where
      gcount = gcount @f @a + gcount @f @b
    
    instance (GCount f a, GCount f b) => GCount f (a :*: b) where
      gcount = gcount @f @a + gcount @f @b
    
    data Example1 f = Example1 { a :: f Int, b :: f Char, c :: Bool } deriving (Generic)
    newtype Other = Other (Maybe Int) deriving (Generic)
    data Example2 f = Example2
      { a' :: f Int
      , b' :: f Char
      , c' :: Bool
      , d' :: Other
      } deriving (Generic)
    
    main = do
      print $ count @Maybe @(Maybe Int)
      print $ count @Maybe @(Example1 Maybe)
      print $ count @Maybe @(Example2 Maybe)