Type safe Polyvariadic functions in Haskell

Polyvariadic functions can be useful to improve the user experience when writing an EDSL in Haskell. Instead of making your users write doSomething ["With", "these", "arguments"], you could have them write doSomething "With" "these" "arguments".

Your intuition will probably tell you that making a polyvariadic function will be unsafe, but Haskell allows you to make it safe. Of course it is not as easy as in Python, but then again, polyvariadic functions in Python will never be type safe.

Background: Currying

The concept of currying needs to be very clear before we can dive into polyvariadic functions.

A function with two arguments: f :: a -> b -> c really only has one argument. The -> operator is right-associative, so we can rewrite this type signature as f :: a -> (b -> c). This means that f takes a value x :: a and gives you a function f x :: b -> c.

When we write down functions with 'multiple arguments', the parentheses are usually omitted, but in reality f x y z is parsed as ((f x) y) z.

The problem of polyvariadic functions

What we want to build is a function of type a -> (a -> (a -> (... -> b ))) where the length of this chain of -> operators is determined by how many arguments are given to the function.

Meanwhile, we would like to implement this function as if it were of type [a] -> b.

Keep in mind that, with what I am about to show you, it is also possible to implement more complicated polyvariadic functions (for example with different types as arguments).

A first draft of a solution

The trick that we will employ to make this happen will involve a typeclass. Let's say we would like to build the polyvariadic equivalent of a function sumOf :: [Int] -> Int. We would then write a typeclass SumArgs as follows:

class SumArgs a where
  sumArgs :: [Int] -> a

Instances of this class, 'know how to make a value of their type from a list of integers'.

Next, we instantiate Int -> r where SumArgs r. (We will need {-# LANGUAGE FlexibleInstances #-} and {-# LANGUAGE FlexibleContexts #-}.)

instance (SumArgs r) => SumArgs (Int -> r) where
    sumArgs is i = sumArgs (i:is)

This may look like black magic at first. I will explain how this works right now. Why it is useful should become clear later.

Note that the sumArgs function in the second instance is of type SumArgs r => [Int] -> (Int -> r). That is, given a list of integers is :: [Int] and another integer i :: Int, we have to build a value of type r where r has a sumArgs instance. r, having a sumArgs instance, can be built using sumArgs :: SumArgs r => [Int] -> r if we have a list of integers. We will build this list by adding i :: Int to the existing list is :: [Int].

Now we are ready to add the functionality that will use sumOf :: [Int] -> Int in a polyvariadic context. We will do this with another instance:

instance SumArgs Int where
    sumArgs = sumOf

This instance explains how to build the result (of type Int) from a list of Int arguments.

We could now use sumArgs as let f = 5 :: Int in sumArgs [] f f :: Int and we would already have a polyvariadic function, but the empty list n there is still ugly.

The final piece of the puzzle is to abbreviate this:

sumOf' :: SumArgs args => args
sumOf' = sumArgs []

There we have it, a type safe polyvariadic function in Haskell.

*> let f = 5 :: Int
*> let g = 4 :: Int
*> sumOf' f g :: Int
9
*> sumOf' f g f f g :: Int
23

Under the hood of the black magic

If you did not already get how this works, here is an example.

sumOf' f g is declared to be of type Int, which means we can write it as sumArgs [] f g :: Int.

If we add the parentheses in the right places, we get ((sumArgs []) f) g :: Int. Now we will examine what Haskell's type inference does:

Peeling off one layer, Haskell figures out that (sumArgs []) f must be of some type c -> Int where g is of that type c.
Peeling off the next layer, Haskell figures out that sumArgs [] must be of some type b -> c -> Int where f is of that type b.
It knows that sumArgs is of type [Int] -> a.
It identifies a with b -> c -> Int to come to the conclusion that sumArgs is of type [Int] -> b -> c -> Int where b and c are the types of f and g, respectively.
Because it knows that f and g are both of type Int, it specializes the type of sumArgs to [Int] -> Int -> Int -> Int, which is exactly what we want.

The reason that sumArgs :: [Int] -> Int -> Int -> Int is a valid type signature, is because Int -> Int -> Int has an instance of SumArgs. Int -> Int -> Int has an instance of SumArgs because Int -> Int does. Finally, Int -> Int has an instance of SumArgs because Int does.

Backward compatibility

I motivated polyvariadic functions by arguing that they look nice in EDSLs. If you are interested in improving the user experience of your EDSL, chances are that you have already implemented it. This means that there is probably already a bunch of code that uses sumOf :: [Int] -> Int directly. Now, there is a way to change the type of sumOf to SumArgs args => a without breaking any of your users' code and it is by adding one more instance to the above code:

instance (SumArgs r) => SumArgs ([Int] -> r) where
    sumArgs is n = sumArgs (n ++ is)

Now the old code will still work as expected:

*> sumOf' [f, g] :: Int
9

As a nice bonus, the following now also works:

sumOf' f g [f, g] :: Int
18