[E-Lang] pending revision of E in a Walnut

Jonathan A Rees jar8@mumble.net
Wed, 22 Aug 2001 10:29:46 -0400


   Date: Wed, 22 Aug 2001 04:46:59 -0700
   From: "Mark S. Miller" <markm@caplet.com>

   And what's a Monad?  I've already got links to papers that would
   explain this to me, once I learn to read Haskell notation.
   However, something short that didn't assume this background would
   be great.

I'll take a stab at this.

In the context of programming technique (the category theory notion is
more general), a monad is a higher-order abstraction (in the modern
lexicon, a "design pattern") that assists the simulation of semantic
extensions to a language.  The language is usually but not necessarily
a functional language.  For example, various particular monads can
help to simulate side effects, exceptions, backtracking, and reified
continuations in languages that don't have them.

We know that with a little effort extension like these can be
simulated without the use of monads.  For example, a function that
needs to do the equivalent of a side effect can take, as an extra
argument, an initial store (store = data structure holding the values
of all affectable things) and return an updated store in addition to
its usual value.  Monads simply recognize that many different
extensions are all instances of a common pattern.

A monad (as formulated in Haskell - the mathematicians' "Kleisli
triple") consists of two functions, "unit" (I may have the name wrong)
and "bind".  For simplicity let's say that expressions in the
unextended language denote "values" and expressions that talk about
the desired extension denote "extensions".  "unit" simply coerces a
value to an extension in the most direct manner possible; e.g. we can
convert 3 to a side-effecting extension by doing no side effect and
returning 3.  The key is "bind", which is a way of sewing together two
code fragments that internally use only values but which terminate by
specifying an extension.  That is:

   if fragment1 is a function from values to extensions
  and fragment1 is a function from values to extensions
 then bind(fragment1, fragment2)
        is a function from values to extensions

For those of you who like types, we might say:
   if fragment1: A -> extensions(with value part: B)
  and fragment2: B -> extensions(with value part: C)
 then bind(fragment1, fragment2): A -> extensions(with value part: C)

Often these "extensions" are higher-order.  For example, take the case
of side effects.  Suppose that as above all effects are managed in a
"store" object.  Then an extension might be a function from stores to
(value, store) pairs.  unit(val) takes any store S to the pair
 (val, S).  Programmers might write get(location) and
 put(location, newvalue) to denote the primitive extensions that
access and update stores.  Example:

  extractvalue(
    bind( lambda(ignoreme).put(13, "hello"),
	  bind( lambda(ignoremetoo).put(17, "goodbye"),
		bind( lambda(ignoreme3).get(13),
		      lambda(thirteen). unit(length(thirteen)) ))) )

might yield 5 (the length of the string "hello").  ("extractvalue"
would be something that takes an extension and extracts a value from
it.)

This is a bad example, since put isn't delivering an interesting
value.

I think Haskell uses an infix operator for the "bind" function.
Remember that "bind" and "unit" are specific to a particular monad.
If we write this operator as ;; the example becomes:

  extractvalue(
    lambda(ignoreme).   put(13, "hello") ;;
    lambda(ignoremetoo).put(17, "goodbye") ;;
    lambda(ignoreme3).  get(13) ;;
    lambda(thirteen).   unit(length(thirteen)) )

(Yes, ;; is associative, for properly-formed monads.)

As with when...catch and CPS-converted code, the program reads as
sequential code - in fact, a bit like assembly code.  You've had to
give names to intermediate values that perhaps you didn't want to
bother to name, such as "thirteen", but so it goes.

Note that each fragment may contain arbitrary value-oriented code
(conditionals, etc.) but has to end with either an extension-oriented
operator (get, put) or unit.  Since extensions are first-class
objects, one can define abstractions over them and so on.

Benefits of using monads to do "impure" language extensions instead of
ad hoc techniques:
  - Readability: at least by those familiar with monads
  - Abstraction: easy replacement of one monad with a more general one
  - Performance: if a compiler or run-time "knows" about a monad, 
      you may get good performance (e.g. fast array code in functional
      language)

Benefits over simply extending the language:
  - Cf. arguments put forth by E re event loops and by FP folks re FP

I'll cut this off here since you asked for short.

DISCLAIMER: My terminology and notation is sometimes nonstandard.

Jonathan