haskellghcrepresentationcoerce

Haskell types with coercible representations identical to their C counterparts?


How can I determine if a Haskell type has an equivalent Coercible instance on a given platform?

I've just been told about Coercible in GHC 7.8, which seems great. In that context, I guess an equally good question to solve my specific problem is: Is there a way to interrogate GHC about which pairs of types a, b there is a Coercible a b instance for (on the current platform, say)?

It seems to me that for coerce :: Coercible a b => a -> b to be useful in a compiler- and platform-agnostic program, one would need to know – preferably only at compile-time, but possibly also explicitly when writing the code – whether a given Coercible a b instance exists on the given platform and use a slower non-noop fallback otherwise (by means of CPP, I guess).

Follow-up question: Would it make sense for GHC to provide a function

coerceOrConvert :: (a -> b) -> a -> b

with the property that coerceOrConvert f is

I realize this makes little sense for ordinary typeclasses, but Coercible seems far from ordinary, so it's hard for me to tell…


Solution

  • Typically the kind of coercion handled in Haskell comes in two flavors: representational equality (via newtype and Coercible) and new information about type variables (via Typeable). The second type has little to do with runtime representation, so I'll just describe the Coercible/newtype mechanism.

    It's a guarantee that that newtype changes only the type information and not the underlying representation, thus if we have (the standard example)

    newtype Age = Age { unAge :: Int }
    

    then we should be able to feel confident that something like

    instance Num Age where
      Age a + Age b = Age (a + b)
      ...
    

    is exactly as quick as (+) on Int is---i.e. there is no pointer indirection going on behind the scenes. In fact, GHC eliminates the Age constructor here with no difficulty. The challenge comes about when we want to do something like

    map Age :: [Int] -> [Age]
    

    since Int and Age are structurally identical this should be a no-op as well---all we must do is satisfy the type system at compile time and then throw away map Age operationally at runtime. Sadly, this isn't the case since the map will still traverse our list even if it does nothing at each stage.

    In situations where a lot of newtypes get thrown around but we also want GHC to produce the tightest compiled code you might see (dangerous, careful) uses of unsafeCoerce

    unsafeCoerce :: [Int] -> [Age]
    

    In this case unsafeCoerce is "safe" as we know that these two types are runtime identical. Also, since unsafeCoerce operates purely at the type level and is a true no-op at runtime we know that unlike map Age, unsafeCoerce is truly an O(0) coercion.

    But it's pretty dangerous.

    Coercible hopes to fix this by allowing instantiations like

    instance Coercible a b => Coercible [a] [b] where coerce = unsafeCoerce
    

    so that the Haskell typeclass machinery allows for coerce to be used only when safe, unlike unsafeCoerce. To ensure this is the case it must not be possible for malicious instances of Coercible to be built. To this end all Coercible instances are built by the compiler based off uses of newtype.

    As a final note, when you truly dive into how Coercible works you'll have to understand the new Haskell Role system which allows developers to annotate whether or not a newtype should allow for a coercion. This is clearly outlined in [the documentation for the Coercible class] (http://www.haskell.org/ghc/docs/7.8.1-rc2/html/libraries/base-4.7.0.0/Data-Coerce.html).