Monads (Handling State in Haskell)
Kevin Atkinson (2005)
Slightly augmented (2008) with material from
Wikipedia
"Yet Another Haskell Tutorial", Daume
"The Marvels of Monads", Dyer
---
Handling State by passing around World:
get :: World -> (Char, World)
put :: Char -> World -> World
For example:
echo :: World -> World
echo w = let (c, w1) = get w
             w2      = put c w1
         in w2
So what is the problem?
Nothing is preventing one from writing:
put2 :: Char -> World -> (World, World)
put2 c w = (w1, w2)
  where
    w1 = put c w
    w2 = put c w
Which makes no sense. How can we have two worlds?
World must be used in a single-threaded manner.
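To make the World-passing style concrete, here is a minimal sketch with a toy World. The record type (unread input plus output written so far) and the NUL result on empty input are assumptions made purely for illustration; the real World is not something a program can inspect.

```haskell
-- Toy model (an assumption): a World is the unread input plus the
-- output written so far.
data World = World { pending :: String, written :: String }
  deriving (Eq, Show)

-- Read one character and hand back the updated world.
get :: World -> (Char, World)
get (World (c:cs) out) = (c, World cs out)
get w@(World [] _)     = ('\NUL', w)  -- assumption: empty input yields NUL

-- Append one character to the output of the world.
put :: Char -> World -> World
put c (World inp out) = World inp (out ++ [c])

echo :: World -> World
echo w = let (c, w1) = get w
             w2      = put c w1
         in w2

-- Type checks without complaint, yet it uses the same world twice.
put2 :: Char -> World -> (World, World)
put2 c w = (w1, w2)
  where
    w1 = put c w
    w2 = put c w
```

In this toy model put2 simply returns two equal copies, which is harmless for a pure record but meaningless for a real terminal, where the character can only be written once.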
Clean has a uniqueness type system to prevent this. But Haskell chose a
different solution:
Don't directly manipulate World. Instead have put/get return a
function which manipulates the world. The trick is not to
let the user have access to this function. So wrap the function in an
abstract data type which looks something like:
data IO a = IO (World -> (a, World))
The data constructor for the IO type is not accessible to the user.
(Furthermore, the real IO type is probably not defined this way.)
Now get and put have the signature
get :: IO Char
put :: Char -> IO ()
So now let's define echo, but how?
We need a way to combine them: use >>=
(>>=) :: IO a -> (a -> IO b) -> IO b
Then echo is simply:
echo :: IO ()
echo = get >>= put
BUT how do we actually RUN it? We need to somehow provide World.
The compiler/run-time-env runs a function called main by providing
"World".
main :: IO ()
main = echo
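The wrapped-function idea can be modelled in ordinary Haskell. The following is only a sketch: ToyIO, bindToy, runToy and the toy World record are hypothetical names invented here, and runToy plays the part of the run-time environment that supplies the World. It is not how the real IO type is implemented.

```haskell
-- Toy World (an assumption): unread input plus output written so far.
data World = World { pending :: String, written :: String }
  deriving (Eq, Show)

-- The wrapped function. In a real module the ToyIO constructor would
-- not be exported, so users could never touch a World directly.
newtype ToyIO a = ToyIO (World -> (a, World))

get :: ToyIO Char
get = ToyIO step
  where
    step (World (c:cs) out) = (c, World cs out)
    step w                  = ('\NUL', w)  -- assumption: empty input yields NUL

put :: Char -> ToyIO ()
put c = ToyIO (\(World inp out) -> ((), World inp (out ++ [c])))

-- Plays the role of (>>=): run the first action, feed its result to f,
-- then run the action f returns on the world left by the first.
bindToy :: ToyIO a -> (a -> ToyIO b) -> ToyIO b
bindToy (ToyIO m) f = ToyIO $ \w ->
  let (a, w1)  = m w
      ToyIO m2 = f a
  in m2 w1

echo :: ToyIO ()
echo = get `bindToy` put

-- Stand-in for the run-time: build the initial World from an input
-- string, run the action, and return its result and the output.
runToy :: ToyIO a -> String -> (a, String)
runToy (ToyIO m) inp =
  let (a, World _ out) = m (World inp "")
  in (a, out)
```

Because bindToy threads each world into exactly one next step, a program built only from get, put and bindToy cannot duplicate a World the way put2 did.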
Now let's define a simple function getToUpper:
getToUpper :: IO Char
getToUpper = get >>= \c ->
             toUpper c
But this isn't right. ">>=" is expecting a function of type (a -> IO
b) as its second parameter but we are giving it (Char -> Char). What
we need is a way to pair a value up with the IO type. The function to
do this is "return".
getToUpper = get >>= \c ->
             return (toUpper c)
Now the second parameter to ">>=" has the type "Char -> IO Char".
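For trying this with a real compiler, the sketch below uses the Prelude's getChar in place of the notes' get. The helper upperOf is a hypothetical name introduced here; it is written against any monad so that the same code can be exercised without a terminal.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: the body of getToUpper, generalised over the
-- monad that supplies the character.
upperOf :: Monad m => m Char -> m Char
upperOf getC = getC >>= \c ->
               return (toUpper c)

-- The IO version from the notes, with getChar standing in for get.
getToUpper :: IO Char
getToUpper = upperOf getChar
```

In GHCi, upperOf (Just 'a') can be evaluated directly, while getToUpper waits for a key press.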
---
">>=", "return", and a datatype together form the concept Monad. In
Haskell these two functions are actually members of the Monad type
class which looks something like this:
class Monad m where
  (>>=)  :: m a -> (a -> m b) -> m b
  return :: a -> m a
and the IO Monad is just an instance of this type class.
instance Monad IO where ..
For a monad implementation to work right, it must obey the "Monad
laws":
* "return" must preserve all information about its argument:
(return x) >>= f = f x
m >>= return = m
* Binding two functions in succession is the same as binding one
function that can be determined from them (essentially: binding is
associative):
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
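The laws above can be spot-checked on the Prelude's Maybe monad. This is a sketch, not a proof: half and dec are hypothetical sample functions chosen for illustration, and checking a handful of values only shows that the laws hold for those values.

```haskell
-- Hypothetical sample functions of type Int -> Maybe Int.
half :: Int -> Maybe Int
half n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

dec :: Int -> Maybe Int
dec n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

-- (return x) >>= f  =  f x
leftIdentity :: Int -> Bool
leftIdentity x = (return x >>= half) == half x

-- m >>= return  =  m
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- (m >>= f) >>= g  =  m >>= (\x -> f x >>= g)
associativity :: Maybe Int -> Bool
associativity m =
  ((m >>= half) >>= dec) == (m >>= (\x -> half x >>= dec))
```

For instance, all leftIdentity [-2 .. 5] can be evaluated in GHCi to run the first law over a small range.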
A monad can also have a "zero" element, which is useful for some
abstractions.
Monads are a very useful abstraction for encapsulating all
sorts of concepts. The power lies in the ">>=" operation, as it can
do anything between function calls.
For example Monads can be used to avoid having to check for error
conditions each time.
One very simple error type in Haskell is the Maybe type:
data Maybe a = Nothing | Just a
Without Monads we have to check for Nothing each time. For example:
f :: a -> Maybe a
f x = case f1 x of
        Nothing -> Nothing
        Just y  -> case f2 y of
                     Nothing -> Nothing
                     Just z  -> f3 z
f1,f2,f3 :: a -> Maybe a
However if we define the Maybe Monad as such
instance Monad Maybe where
  return         = Just
  Nothing  >>= f = Nothing
  (Just x) >>= f = f x
Then we can write f as
f x = f1 x >>= \y ->
      f2 y >>= \z ->
      f3 z
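A self-contained version of this idea follows, as a sketch. Since the Prelude already defines the Monad instance for Maybe, the example uses its own type, Option, a hypothetical name. Current GHC versions also require Functor and Applicative instances before a Monad instance is accepted, so those are included. The concrete f1, f2, f3 and safeDiv are invented here for illustration.

```haskell
data Option a = None | Some a
  deriving (Eq, Show)

instance Functor Option where
  fmap _ None     = None
  fmap g (Some x) = Some (g x)

instance Applicative Option where
  pure = Some
  None   <*> _ = None
  Some g <*> o = fmap g o

-- Same shape as the Maybe instance; return defaults to pure.
instance Monad Option where
  None   >>= _ = None
  Some x >>= g = g x

-- Division that reports failure instead of crashing.
safeDiv :: Int -> Int -> Option Int
safeDiv _ 0 = None
safeDiv n d = Some (n `div` d)

-- Hypothetical steps, each of which may fail.
f1, f2, f3 :: Int -> Option Int
f1 x = safeDiv 100 x
f2 y = if y > 5 then Some (y - 5) else None
f3 z = safeDiv z 2

-- No explicit checks for None: (>>=) stops at the first failure.
f :: Int -> Option Int
f x = f1 x >>= \y ->
      f2 y >>= \z ->
      f3 z
```

Here f 10 succeeds through all three steps, while f 0 fails in f1 and f 50 fails in f2; in both failing cases the later steps are never run.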
Handling errors this way is very similar to exceptions. In fact the
IO Monad has exceptions that can be thrown and caught in a way very
similar to how exceptions are used in an imperative language.
---
Another example, collections:
-- "return" constructs a one-item list.
return x = [x]
-- "bind" concatenates the lists obtained by applying f to each item in list xs.
xs >>= f = concat (map f xs)
-- The zero object is an empty list.
mzero = []
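The Prelude already provides this list instance, so the definitions above can be tried directly. In the sketch below, bindList, pairs and evens are hypothetical names: bindList restates the bind rule as an ordinary function, and the other two use the built-in list monad.

```haskell
import Control.Monad (mzero)

-- The bind rule written as a plain function.
bindList :: [a] -> (a -> [b]) -> [b]
bindList xs f = concat (map f xs)

-- Every combination of an element of xs with an element of ys.
pairs :: [a] -> [b] -> [(a, b)]
pairs xs ys = xs >>= \x ->
              ys >>= \y ->
              return (x, y)

-- Returning mzero for an item drops it from the result.
evens :: [Int] -> [Int]
evens xs = xs >>= \x ->
           if even x then return x else mzero
```

For example, evens [1 .. 6] keeps only 2, 4 and 6, because each odd item contributes the empty list to the concatenation.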
---
Haskell has a special notation for Monads, called do notation.
For example
f x = f1 x >>= \y ->
      f2 y >>= \z ->
      f3 z
BECOMES:
f x = do y <- f1 x
         z <- f2 y
         f3 z
The <- is not an assignment but a binding, similar to
let y = f x
in y
but of course not the same: <- binds the result produced by the
monadic expression, not the expression itself.
The translation from do notation is:
"x <- expr1" ===> "expr1 >>= \x ->"
"expr2" ===> "expr2 >>= \_ ->"
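The two translation rules can be seen side by side in a small sketch using the Prelude's Maybe monad. The names sugared and desugared are hypothetical; the second definition is the first one rewritten by hand with the rules above.

```haskell
-- Written with do notation.
sugared :: Maybe Int
sugared = do
  x <- Just 3
  Just ()            -- a statement whose result is ignored
  y <- Just 4
  return (x + y)

-- The same computation after applying the translation by hand.
desugared :: Maybe Int
desugared =
  Just 3  >>= \x ->
  Just () >>= \_ ->
  Just 4  >>= \y ->
  return (x + y)
```

Both definitions evaluate to Just 7. Replacing any of the Just steps with Nothing makes both evaluate to Nothing.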
As a final example getToUpper with do notation:
getToUpper :: IO Char
getToUpper = do c <- get
                return (toUpper c)
---
C#/LINQ has similar syntactic support. An example, using collection
monads:
var r = from x in new[] { 0, 1, 2 }
from y in new[] { 0, 1, 2 }
select x + y;
foreach (var i in r)
Console.WriteLine(i);
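For comparison, the same query can be written with Haskell's list monad and do notation. This is a sketch of a rough equivalent, not a claim about how LINQ is implemented; sums is a hypothetical name.

```haskell
-- Each "from" clause corresponds to one "<-" binding over a list.
sums :: [Int]
sums = do
  x <- [0, 1, 2]
  y <- [0, 1, 2]
  return (x + y)
```

Evaluating sums gives [0,1,2,1,2,3,2,3,4], one entry for every pairing of x and y, in the same order the C# loop would print them.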
---
For more information on Monads see
http://www.nomaware.com/monads/html/
or
http://www.haskell.org/bookshelf/#monads
For a C#/LINQ perspective, see
http://blogs.msdn.com/wesdyer/ (January 11, 2008 entry)