Title:  Some interesting features of Haskell’s type system
Author: Wolfgang Jeltsch
Date:   9 February 2013
Tags:   containers (Haskell package), FRP, functional
        programming, GHC, Haskell, kind, literate programming,
        Markdown, multi-parameter type class, natural-numbers
        (Haskell package), Pandoc, parametricity, Stream
        (Haskell package), Theory Lunch, TTÜ Küberneetika
        Instituut, type class, type family


One of the most important ingredients of Haskell is its type
system. [Standard Haskell][haskell-report] already provides a
lot of useful mechanisms for having things checked at compile
time, and the language extensions provided by the [Glasgow
Haskell Compiler (GHC)][ghc] improve heavily on this.

[haskell-report]:
    http://www.haskell.org/onlinereport/haskell2010/
    "Haskell 2010 Language Report"
[ghc]:
    http://haskell.org/ghc/
    "Glasgow Haskell Compiler"

In this article, I will present several of Haskell’s type
system features. Some of them belong to the standard, others
are only available as extensions. This is a write-up of a talk
I gave on 31\ January\ 2013 during the [Theory
Lunch][theory-lunch] of the [Institute of Cybernetics][ioc].
This talk provided the basics for another Theory Lunch talk,
which was about the `Constraint` kind.<!--more-->

[theory-lunch]:
    http://theorylunch.wordpress.com/
    "Theory Lunch"
[ioc]:
    http://www.ioc.ee/wiki/doku.php?id=en:start
    "Institute of Cybernetics"

This whole article was written as a literate Haskell file with
ordinary text written in [Markdown][markdown]. You can download
[this literate Haskell file][article-source], read it, and load
it into GHCi to play with the code. The HTML for the blog post
was generated using [Pandoc][pandoc].

[markdown]:
    http://daringfireball.net/projects/markdown/
    "Markdown"
[article-source]:
    http://darcs.wolfgang.jeltsch.info/blog/2013/02/09/some-interesting-features-of-haskells-type-system.lhs
    "Some interesting features of Haskell’s type system"
[pandoc]:
    http://johnmacfarlane.net/pandoc/
    "Pandoc"


Prerequisites
=============

We first enable some language extensions that we will use in
this article:

> {-# LANGUAGE MultiParamTypeClasses, TypeFamilies #-}

We will reimplement some bits of the Prelude for illustration
purposes, and we will use functions from other modules whose
names clash with those of certain Prelude functions. Therefore,
we have to hide parts of the Prelude:

> import Prelude hiding (Eq (..), Functor (..), lookup, (!!))

Furthermore, we need some additional modules for example code:

> import Data.Char
> import Data.Natural
> import Data.Stream  hiding (map)
> import Data.Set     hiding (map)
> import Data.IntSet  hiding (map)

These imports require the packages `Stream`, `natural-numbers`,
and `containers` to be installed.


Kinds
=====

Types typically denote sets of values. For example, `Integer`
denotes the set of all integers, `Char` denotes the set of all
characters, and `[Bool]` denotes the set of all truth value
lists.

However, Haskell uses a more general notion of type, which also
covers functions on types. So for example, the unapplied list
type constructor\ `[]` is also considered a type, as is a
partial application of the function type constructor like `(->)
Integer`. Clearly, these types do not contain values.

To distinguish between ordinary types and functions on types,
Haskell uses a system of kinds. Kinds are the “types of types”,
so to say. A kind is either\ `*` or has the form `kind1 ->
kind2`, where `kind1` and `kind2` are again kinds. The
kind\ `*` is the kind of all ordinary types (that is, types
that contain values), and a `kind1 -> kind2` is the kind of all
type-level functions from `kind1` to `kind2`. Following are
some examples of types and their kinds:

  * `Integer, Char, [Bool], [[Bool]], [val] :: *`
  * `[], Maybe :: * -> *`

Note that in a kind `kind1 -> kind2`, `kind1` and `kind2` can
be kinds of function types again. So higher-order types are
possible. As a result, type functions with several arguments
can be implemented by means of Currying. For example, Haskell’s
pair type and function type constructors are kinded as follows:

  * `(,), (->) :: * -> * -> *`

Furthermore, we can have type constructors with type function
arguments. Take the following generic types of trees and
forests as an example:

> data    Tree   func label = Tree label (Forest func label)
>
> newtype Forest func label = Forest (func (Tree func label))

From these types, we can get various more concrete types by
partial application:

> type RoseTree          = Tree   []
>
> type RoseForest        = Forest []
>
> type NonEmptyList      = Tree   Maybe
>
> type PossiblyEmptyList = Forest Maybe
>
> type InfiniteList      = Tree   Identity

The `Identity` type used in the definition of `Stream` is
defined as follows:

> newtype Identity val = Identity val

I also want to mention that if we have a type `Event` in
functional reactive programming, `Tree Event` is the type of
behaviors that change only at discrete times, and `Forest
Event` is the type of event streams.


Type classes
============

A type class denotes a set of types, which are called the
instances of the class. Each class declares a set of methods
that its instances have to implement.


Simple type classes
-------------------

As an example of a type class, let us partially reimplement the
`Eq` class from the Prelude, whose methods are `(==)` and
`(/=)`:

> class Eq val where
>
>     (==), (/=) :: val -> val -> Bool

`Eq` is supposed to cover all types whose values can be checked
for equality. Here is an `instance`{.haskell} declaration for
the `Bool` type:

> instance Eq Bool where
>
>     False == bool2 = not bool2
>     True  == bool2 = bool2
>
>     bool1 /= bool2 = not (bool1 == bool2)


Constructor classes
-------------------

It is possible to define classes whose instances have a kind
other than\ `*`. These are sometimes called constructor
classes. An example of such a class is the `Functor` class from
the Prelude, whose instances have kind `* -> *`. Here is a
reimplementation:

> class Functor func where
>
>     fmap :: (val -> val') -> (func val -> func val')

Typical instances of `Functor` are `[]` and `Maybe`:

> instance Functor [] where
>
>     fmap = map
>
> instance Functor Maybe where
>
>     fmap _   Nothing    = Nothing
>     fmap fun (Just val) = Just (fun val)

Types `Tree func` and `Forest func` can also be made instances
of `Functor`, provided that `func` is a `Functor` instance
itself.

> instance Functor func => Functor (Tree func) where
>
>     fmap fun (Tree root subtrees) = Tree (fun root)
>                                          (fmap fun subtrees)
>
> instance Functor func => Functor (Forest func) where
>
>     fmap fun (Forest trees) = Forest (fmap (fmap fun) trees)

Note that these `instance`{.haskell} declarations make the
specialized versions of `Tree` and `Forest` that we have
defined above automatically instances of `Functor`.


Multi-parameter type classes
----------------------------

GHC allows classes to have multiple parameters. While
single-parameter classes denote sets of types, multi-parameter
classes denote relations between types. An example of a class
with two parameters is the class that relates types for which
there is a conversion function:

> class Convertible val val' where
>
>     convert :: val -> val'

We can convert from type `Int` to type `Integer`, but also
between types `Int` and `Char`:

> instance Convertible Int Integer where
>
>     convert = toInteger
>
> instance Convertible Int Char where
>
>     convert = chr
>
> instance Convertible Char Int where
>
>     convert = ord


Type families
=============

Haskell allows us to define new types using `data`{.haskell}
declarations. An example of such a declaration is the following
one, which introduces a type of lists:

> data List el = Nil | Cons el (List el)

Furthermore, we can use `type`{.haskell} declarations for
defining aliases of existing types. For example, we can use the
following `type`{.haskell} declaration to define types of
functions whose domain and codomain are the same:

> type Endo val = val -> val

Both `data`{.haskell} and `type`{.haskell} declarations have in
common that the types they define are parametric. Informally
speaking, this means that the basic structure of the defined
types is independent of type parameters. For example, lists are
always either empty or pairs of an element and another list, no
matter what the element type is. The choice of an `el`
parameter only determines the structure of elements. Likewise,
values of a type `Endo val` are always functions whose domain
and codomain are the same. The `val` parameter just determines
the concrete domain and codomain type in use.

There are situations, however, where we want to define
type-level functions that yield completely differently
structured types for different arguments. This is possible with
the type family extension that GHC provides.

There exist two flavors of type families: data families and
type synonym families. Data families introduce new types and
use the `data`{.haskell} keyword, while type synonym families
define aliases for types and use the `type`{.haskell} keyword.
This is analogous to `data`{.haskell} and `type`{.haskell}
declarations, respectively. Type families can be stand-alone or
associated. The former variant is analogous to top-level
functions, while the latter is analogous to class methods. We
will only deal with the latter in this post.


Data families
-------------

As an example of a data family, we define a type of total maps,
that is, maps that assign values to every value of a chosen key
type. Essential to our definition is that different key types
lead to differently structured maps. We declare a class of key
types, which contains the data family for total maps:

> class Key key where
>
>     data TotalMap key :: * -> *
>
>     lookup :: key -> TotalMap key val -> val

Let us now give an `instance`{.haskell} declaration for `Bool`.
Total maps with boolean keys are essentially pairs of values,
consisting of one value for the `False` key and one value for
the `True` key. Our `instance`{.haskell} declaration reflects
this:

> instance Key Bool where
>
>     data TotalMap Bool val = BoolMap val val
>
>     lookup False (BoolMap falseVal _)       = falseVal
>     lookup True  (BoolMap _        trueVal) = trueVal

Total maps whose keys are natural numbers correspond to
infinite lists, that is, streams of values:

> instance Key Natural where
>
>     data TotalMap Natural val = NaturalMap (Stream val)
>
>     lookup nat (NaturalMap str) = str !! fromIntegral nat

More advanced things are possible. For example, pairs of keys
can again serve as keys. A total map of a type `TotalMap
(key1,key2) val` corresponds to a function of type `(key1,key2)
-> val`, which in turn corresponds to a function of type `key1
-> key2 -> val`. This suggests how to implement total maps with
pair keys:

> instance (Key key1, Key key2) => Key (key1,key2) where
>
>     data TotalMap (key1,key2) val
>         = PairMap (TotalMap key1 (TotalMap key2 val))
>
>     lookup (key1,key2) (PairMap curriedMap)
>         = lookup key2 (lookup key1 curriedMap)


Type synonym families
---------------------

Let us now look at an example of a type synonym family. We
define a class of collection types where a collection is
basically anything that contains elements. Here are two
examples of collection types:

* `Set el` for any type `el` that is an instance of `Ord`
* `IntSet`

Our class contains a type synonym family that tells for every
collection type what the corresponding type of collection
elements is. The `class`{.haskell} declaration is as follows:

> class Collection coll where
>
>     type Element coll :: *
>
>     member :: Element coll -> coll -> Bool

Now we can form `instance`{.haskell} declarations for the above
example collection types:

> instance Ord el => Collection (Set el) where
>
>     type Element (Set el) = el
>
>     member = Data.Set.member
>
> instance Collection IntSet where
>
>     type Element IntSet = Int
>
>     member = Data.IntSet.member
