quick stab at monads

This is very high level, and not generic enough, but let’s say i’m writing a parser for some binary data in Erlang. Assume it doesn’t have any binary/bit syntax. We want to parse something like this c struct:

struct quote {
       double price;
       uint32_t  size;
       enum side side;
       int64_t tv_sec;
       uint64_t tv_nsec;

In a language where we can’t mutate anything, our parser would have to look something like this (a lot of erlang code looks like this):

   Price, Data1 = parse_double(Data0)
   Size, Data2 = parse_uint32_t(Data1)
   Side, Data3 = parse_enum_side(Data2)
   Tv_sec, Data4 = parse_int64_t(Data3)
   Tv_nsec, Data5 = parse_uint64_t(Data4)

Monads allow you to abstract out the Data part. It’s still there and passed around, but it’s abstracted away so you don’t have to explicitly deal with it. So our code can effectively look like:

    Price = parse_double()
    Size = parse_uint32_t()
    Side = parse_enum_side()
    Tv_sec = parse_int64_t()
    Tv_nsec = parse_uint64_t()

And you’ll have to trust me on this, but the abstraction works for normal programming things, so we can conditionally call a parse_blah() function, or we can write our own higher level parse function that work exactly like those shown above, all without ever having to manually touch the Data.

Now all we’ve done is abstracted that “state” (Data); this is still completely pure code that isn’t allowed to do any IO or mutate anything. This example isn’t hypothetical, haskell has a binary parser that works exactly this way (my mongodb driver uses it to parse bson)

The trickery for doing IO with monads is in that we could do something similar, except instead of Data which in the last case probably contained the string and maybe an index of where the parser left off at, the Data represents the state of the world. Now we’re passing the state of the world around between functions, but abstracted so you don’t have to explicitly deal with it. The functions remain perfectly pure, the magic only lies in the state-of-the-world variable that while abstracted, you are passing around.

In a sense it is one step further then Erlang. Since erlang (basically) has no global state, no module state, no function-static state–the only state you have is the variables you pass to a function–you pretty much have to pass around _everything_ you need so you have access to it. For example, if you need a file handle deep down in some function (eg the logger), you’ll have to pass that open file handle down through each function call so you have it to call the logger with. The exception is that you can create file descriptors out of thin air at any time, and don’t have to pass around stdin and stdout and friends.

Haskell has taken it one step further and says you can’t do that (well, there are internal functions that break this for efficiency’s sake, but these are rarely used). Haskell says that if you want stdin or stdout, or want the state of the world so that you can open a brand new file descriptor, you have to pass those variables down to where they will be needed. But then after creating that restriction, they turn around and create abstractions that maintain these properties, but abstract away the messy manual code for you.

Monads in themselves are really simple. I think the hard part isn’t the monads themselves but rather the type notation and haskell’s function and implicit currying. It takes a while to get these at a deep level, and it’s really hard to read the monad implementations until you have a solid basis on these.


~ by pulotka on 2011/02/04.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: