haskellsmlml

(ML) Modules vs (Haskell) Type Classes


According to Harper (https://existentialtype.wordpress.com/2011/04/16/modules-matter-most/), it seems that Type Classes simply do not offer the same level of abstraction that Modules offer and I'm having a hard time exactly figuring out why. And there are no examples in that link, so it's hard for me to see the key differences. There are also other papers on how to translate between Modules and Type Classes (http://www.cse.unsw.edu.au/~chak/papers/modules-classes.pdf), but this doesn't really have anything to do with the implementation in the programmer's perspective (it just says that there isn't something one can do that the other can't emulate).

Specifically, in the first link:

The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference.

I don't understand either. A type can implement a type class in more than 1 way in ML? How would you have the integers ordered by divisibility by example without creating a new type? In Haskell, you would have to do something like use data and have the instance Ord to offer an alternative ordering.

And the second one, aren't the two are distinct in Haskell? Specifying "when such a specification should be used during type inference" can be done by something like this:

blah :: BlahType b => ...

where BlahType is the class being used during the type inference and NOT the implementing class. Whereas, "how a type implements a type class" is done using instance.

Can some one explain what the link is really trying to say? I'm just not quite understanding why Modules would be less restrictive than Type Classes.


Solution

  • To understand what the article is saying, take a moment to consider the Monoid typeclass in Haskell. A monoid is any type, T, which has a function mappend :: T -> T -> T and identity element mempty :: T for which the following holds.

    a `mappend` (b `mappend` c) == (a `mappend` b) `mappend` c
    a `mappend` mempty == mempty `mappend` a == a
    

    There are many Haskell types which fit this definition. One example that springs immediately to mind are the integers, for which we can define the following.

    instance Monoid Integer where
        mappend = (+)
        mempty = 0
    

    You can confirm that all of the requirements hold.

    a + (b + c) == (a + b) + c
    a + 0 == 0 + a == a
    

    Indeed, the those conditions hold for all numbers over addition, so we can define the following as well.

    instance Num a => Monoid a where
        mappend = (+)
        mempty = 0
    

    So now, in GHCi, we can do the following.

    > mappend 3 5
    8
    > mempty
    0
    

    Particularly observant readers (or those with a background in mathemetics) will probably have noticed by now that we can also define a Monoid instance for numbers over multiplication.

    instance Num a => Monoid a where
        mappend = (*)
        mempty = 1
    
    a * (b * c) == (a * b) * c
    a * 1 == 1 * a == a
    

    But now the compiler encounters a problem. Which definiton of mappend should it use for numbers? Does mappend 3 5 equal 8 or 15? There is no way for it to decide. This is why Haskell does not allow multiple instances of a single typeclass. However, the issue still stands. Which Monoid instance of Num should we use? Both are perfectly valid and make sense for certain circumstances. The solution is to use neither. If you look Monoid in Hackage, you will see that there is no Monoid instance of Num, or Integer, Int, Float, or Double for that matter. Instead, there are Monoid instances of Sum and Product. Sum and Product are defined as follows.

    newtype Sum a = Sum { getSum :: a }
    newtype Product a = Product { getProduct :: a }
    
    instance Num a => Monoid (Sum a) where
        mappend (Sum a) (Sum b) = Sum $ a + b
        mempty = Sum 0
    
    instance Num a => Monoid (Product a) where
        mappend (Product a) (Product b) = Product $ a * b
        mempty = Product 1
    

    Now, if you want to use a number as a Monoid you have to wrap it in either a Sum or Product type. Which type you use determines which Monoid instance is used. This is the essence of what the article was trying to describe. There is no system built into Haskell's typeclass system which allows you to choose between multiple intances. Instead you have to jump through hoops by wrapping and unwrapping them in skeleton types. Now whether or not you consider this a problem is a large part of what determines whether you prefer Haskell or ML.

    ML gets around this by allowing multiple "instances" of the same class and type to be defined in different modules. Then, which module you import determines which "instance" you use. (Strictly speaking, ML doesn't have classes and instances, but it does have signatures and structures, which can act almost the same. For amore in depth comparison, read this paper).