## A Scheme parser in Haskell: Part 1

From Write Yourself a Scheme in 48 hours:

Basically, a monad is a way of saying "there's some extra information attached to this value, which most functions don't need to worry about". In this example, the "extra information" is the fact that this action performs IO, and the basic value is nothing, represented as "()". Monadic values are often called "actions", because the easiest way to think about the IO monad is a sequencing of actions that each might affect the outside world.

I really like this tutorial. I'm only on part 3.3 of 12, parsing, but I'm new to Haskell so I'm learning left, right & centre. The exercises are taking me hours of reading and experimenting, and it's lots of fun! ghc's errors are usually quite helpful and of course ghci is a big help as well.

I'm going to explain one of the exercises because converting between the various syntax for dealing with monads wasn't plainly obvious to me. Perhaps I wasn't paying enough attention to the docs I read. In any case if you're interested in Haskell at all, I recommend the tutorial and if you're stuck on exercise 3.3.1 like I was then come on back here. Whether you're following the tutorial or not the point of this post should stand on its own with a basic knowledge of Haskell.

Last night I rewrote `parseNumber`

using `do`

and `>>=`

(bind) notations (ex. 3.3.1). Here's `parseNumber`

using the `liftM`

method given in the tutorial:

```
parseNumber :: Parser LispVal
parseNumber :: liftM (Number . read) $ many1 digit
```

Okay that's pretty simple right? Let's break it down, first looking at the right-hand side of the `$`

operator, then the left.
* `many1 digit`

reads as many decimal digits as it can.
* `Number . read`

is a function composition just like we're used to using in math. It applies `read`

to its argument, then applies `Number`

to that result.
* `liftM`

is concisely and effectively defined elsewhere, and I'll borrow their description:

`liftM f m`

lets a non-monadic function`f`

operate on the contents of monad`m`

`liftM`

's type is also quite telling: `liftM :: (Monad m) => (a -> b) -> (m a -> m b)`

In a nutshell `liftM`

turns a function from `a`

to `b`

to a function from a monad containing `a`

to a monad containing `b`

.
That results in a function on the left-hand side of `$`

, which operates on and outputs a monad. The content of the input monad is a `String`

. The content of the output monad is a `LispVal`

(defined earlier in the tutorial). Specifically it is a `Number`

.
The `$`

acts similar to a pipe in `$FAVOURITE_SHELL`

, and is right associative which means the expression on the right is passed to the expression (function) on the left. It's exactly the same as `(liftM (Number . read)) (many1 digit)`

except it looks cleaner. If you know LISP or Scheme (sadly I do not) then it's analogous to the `apply`

function.
So how does a Haskell newbie go about re-writing that using other notations which haven't even been explained in the tutorial? Clearly one must search the web and read as much as they can until they understand enough to figure it out (which is one thing I like about the tutorial). If you're lazy like me, here are 3 equivalent pieces of code for you to chew on. `parseNumber`

's type is `Parser LispVal`

(Parser is a monad).
Familiar `liftM`

method:
```
parseNumber -> liftM (Number . read) $ many1 digit
```

Using `do`

notation:

```
parseNumber -> do digits <- many1 digit
return $ (Number . read) digits
```

If you're thinking "Hey a `return`

, I know that one!" then the devious masterminds behind Haskell are certainly laughing evilly right now. `return`

simply wraps up it's argument in a monad of some sort. In this case it's the `Parser`

monad. The `return`

part may seem strange at first. Since `many1 digit`

yields a monad why do we need to wrap anything? The answer is that using `<-`

causes `digits`

to contain a `String`

, stripped out of the monad which resulted from `many1 digit`

. Hence we no longer use `liftM`

to make `(Number . read)`

monads, and instead need to use `return`

to properly wrap it back up in a monad.
In other words `liftM`

eliminates the need to explicitly re-monadize the contents as is necessary using `do`

.

Finally, using `>>=`

(bind) notation:

```
parseNumber -> many1 digit >>= \digits ->
return $ (Number . read) digits
```

At this point I don't think this warrants much of an explanation. The syntactic sugar provided by `do`

should be pretty obvious. Just in case it's not, `>>=`

passes the contents of its left argument (a monad) to the *function*on its right. Once again

`return`

is needed to wrap up the result and send it on its way.
When I first read about Haskell I was overwhelmed by not knowing anything, and not being able to apply my previous knowledge of programming to *anything* in Haskell. One piece of syntax at a time I am slowly able to understand more of the Haskell found in the wild.

I'm currently working on ex. 3.3.4, which is parsing R5RS compliant numbers *(e.g. #o12345670, #xff, #d987)*. I'll probably write something about that once I figure it out, but in the meantime if you have any hints I'm all ears.

*Update #1: I should do more proof-reading if I'm going to try and explain things. I made some changes in wording.*