I’m writing a book!

If you learned something from my blog or enjoy my style of writing you should check out the book I’m writing. It’s called Developer’s Guide to Database Indexing and will be stock full of all the things you should know about database indexing as a developer.

Sign up for the mailing list and be the first to know when the first chapters go online!

Check it out

# Understanding filterM

At the recent Haskell Meetup I went to, someone posed this interesting challenge:

Given a list of integers, produce a list of all possible sums that can be created using the integers from the list.

Turns out, in Haskell this is a one-liner:

``````map sum \$ filterM (const [True, False]) [1..5]
``````

Which returns

``````[15,10,11,6,12,7,8,3,13,8,9,4,10,5,6,1,14,9,10,5,11,6,7,2,12,7,8,3,9,4,5,0]
``````

To me as a Haskell beginner, this is straight up witchcraft. How can this line possibly produce something this complex. So a couple of us decided to dive into the implementation of `filterM` to see how it all works. What follows are our findings (a bit cleaned up).

## First impressions

The first thing throwing me for a loop, is that a function called `filterM` is returning more elements than in the initial list. That’s not what filtering means in my mind. I should always end up with less, or at least with the same amount of elements than before.

Secondly, I assume that the `M` in `filterM` stands for Monad. So that instantly makes it a bit more scary to me. While I (sort of) understand what Monads are – at least from a theoretical point of view – I don’t yet have intuition for them. It’s not immediately obvious to me, why Monads are needed in this particular case.

## Understanding the type signature

Looking at the type signature for `filterM` gives us this:

``````filterM :: Monad m => (a -> m Bool) -> [a] -> m [a]
``````

Note that the type signature `ghci` gives you actually differs from the one on Hoogle. I have chosen to use the type signature that `ghci` provides me.

As mentioned before, monads are still a bit scary to me. What I always end up doing, is pluck in the actual Monad I’m dealing with, and see what the type signature looks like. So let’s do that.

We’re passing in `const [True, False]` into `filterM`. So our monad in this case would be the list monad `[]`. Let’s see what that gives us.

``````filterM :: (a -> [Bool]) -> [a] -> [[a]]
``````

Ok that makes a bit more sense. One thing that strikes me as interesting, is that this is returning a nested list.

Let’s continue.

## Playing human interpreter

Since I still don’t have a clue what this function is supposed to do, I decided to just evaluate the function myself by plucking in actual values. Let’s take a look at our `filterM` call again.

I’m leaving off the map call and only focusing on the filterM expression here.

``````filterM (const [True, False]) [1..5]
``````

`filterM` can be implemented something like this (credits to this guy ):

``````filterM :: Monad m => (a -> m Bool) -> [a] -> m [a]
filterM p []     = return []
filterM p (x:xs) =
let rest = filterM p xs in
do b <- p x
if p then liftM (x:) rest
else                 rest
``````

This is not actually how it is implemented, but this implementation makes it much easier to understand.

Since we’re not passing in an empty list, let’s just skip that part for now and instead focus on the second part of the pattern match.

Immediately we have a recursive call on the tail of the list. Let’s ignore that for now and move on.

``````do b <- p x
if b then liftM (x:) rest
else                 rest
``````

Ok, so what’s going on here. Looks like we’re applying `p` to `x`, which is the head of our input list `[1..5]`. Substituting the variables for their actual values gives us

``````do b <- const [True, False] 1
-- ...
``````

`const` is a function of type `a -> b -> a`. So in other words, it takes two parameters and simply returns the first one. The first argument in this case is `[True, False]`. So `const [True, False] 1` returns `[True, False]`. In fact, `const [True, False] anything` will return `[True, False]`. This expression will always evaluate to `[True, False]`.

Moving on, that leaves us with

``````do b <- [True, False]
if b then liftM (1:) rest
else                 rest
``````

What does `b <- [True, False]` do? We’re in a do-expression, so this is just syntactic sugar around bind (`>>=`) calls. Let’s desugar and see what we end up with.

``````[True, False] >>= \b -> if b then liftM (1:) rest else rest
``````

Ok, so depending on what `b` is, we either execute `liftM (x:) rest` or simply `rest`. `rest` is the result of our initial recursive call `filterM p xs` which has type `m [a]` since that is the return type of `filterM`.

Right so what does `liftM (x:) rest` do? `liftM` has this type signature

``````liftM :: Monad m => (a -> b) -> m a -> m b
``````

That looks familiar… What’s the type signature of `fmap` again?

``````fmap :: Functor f => (a -> b) -> f a -> f b
``````

Swap out Monad with Functor and `m` with `f` and you have the same function. So `liftM` is basically `fmap`, but for monads. In other words, it takes a function that works on “unwrapped” values, and makes it work on values that are in a monad.

So back to our example. We had this expression

``````if b then liftM (1:) rest else rest
``````

If b is `True` we want to cons `1` onto `rest`. Since `rest` is of type `[[Int]]` and `x` is of type `Int` we need to lift the cons function `(:)` into the monad before we can apply it. Makes sense.

Depending on the value of `b` we either return a list with `1` added to the front, or one without the `1`. There’s our filter!

That still leaves the question what `b` actually is. We had the following piece of code

``````[True, False] >>= \x -> if x then liftM (1:) rest else rest
``````

It seems like we need to take a look at how `(>>=)` is implemented for the list monad in order for this snippet to make sense. Here it is.

``````(>>=) :: Monad m => m a -> (a -> m b) ->  m b
xs >>= f = concat \$ fmap f xs
``````

Aha! Bind for lists is implemented in terms of `fmap`. So what’s really going on is that we’re feeding both `True` and `False` into our little if-expression one after the other. But what does it mean? Let’s see what the expression above evaluates to:

``````[True, False] >>= \x -> if x then liftM (1:) rest else rest

[,[]]
``````

What if we were to simulate the recursive call, and pass in `2` as our value. In this case `rest` would be equal to `[, []]`.

``````[True, False] >>= \x -> if x then liftM (2:) [,[]] else [,[]]

[[2,1],,,[]]
``````

I think I’m starting to see it. One more.

``````[True, False] >>= \x -> if x then liftM (3:) [[2,1],,,[]] else [[2,1],,,[]]

[[3,2,1],[3,2],[3,1],,[2,1],,,[]]
``````

Note, that this not actually how the evaluation goes. We’re evaluating from left to right for simplicities sake. However, `filterM` is actually implemented in terms of `foldr`. So evaluate it from right to left, starting with the `5`.

For every element in our input list, we return a list with that element and one without the element. Due to the recursive nature of this algorithm this gives us back all possible subsets of our initial list.

After some research, I found out, that this is called the power set of a list (or set, in set theory).

## Wrapping it up

Going back to the original problem we can now see, that

``````filterM (const [True, False]) [1..5]
``````

is returning all possible subsets that can be formed using the numbers 1 to 5 (including the empty set).

``````[[1,2,3,4,5],[1,2,3,4],[1,2,3,5],[1,2,3],[1,2,4,5],[1,2,4],[1,2,5],[1,2],[1,3,4,5],[1,3,4],[1,3,5],[1,3],[1,4,5],[1,4],[1,5],,[2,3,4,5],[2,3,4],[2,3,5],[2,3],[2,4,5],[2,4],[2,5],,[3,4,5],[3,4],[3,5],,[4,5],,,[]]
``````

All that’s now left to do in order to answer the original problem is calculating the sum of each subset. Which leaves us with the initial solution.

``````map sum \$ filterM (const [True, False]) [1..5]
``````
``````[15,10,11,6,12,7,8,3,13,8,9,4,10,5,6,1,14,9,10,5,11,6,7,2,12,7,8,3,9,4,5,0]
``````

Neat.