About that Reader trick
Reader can be used to track local bindings, but tracking all bindings can be more difficult.
Published:
There is an interesting post on Solomon's Blog about how the Reader monad in Haskell can be used to model variable bindings in an interpreter, which he calls "that one cool reader trick."
For those unfamiliar with Reader
, it represents a computation which can reference some environment and execute sub-computations in a modified environment. Concretely, this is often used in Haskell apps to reference some kind of initial configuration or context that's needed throughout. It would often be a pain to manually pass this environment to every function that requires it, so instead, we wrap the computation in a Reader
monad and then retrieve the environment when necessary with the ask
function.
As Solomon writes, and as shown in the example usage of Reader
in the documentation, we may also use Reader
to manage variable bindings in an interpreter.
However, it's important to note that this is only useful if you want to interpret a language without global variables or mutation. This makes Reader
a bad abstraction for modeling variable binding in, say, the Scheme programming language, which crucially has both of these (define
for global variables and set!
for mutation). Unfortunately, Reader
is used to model environments in the Write You a Scheme Version 2.0 tutorial (WYAS 2), which I think is basically incorrect.
In this post, I will show how Reader
can be used to model local bindings, and why it's insufficient to represent mutable or global bindings.
A tiny interpreter
Let's say we want to interpret a language with integer variables and Lisp syntax:
(let (x 10)
x)
All our tiny language has are literal expressions (like 10
), variable expressions like the x
on the second line, and let expressions, like the whole block. When we evaluate the let expression, we traverse the abstract syntax tree (AST), and eventually get the result, which is just the value 10
.
In our interpreter, we represent expressions with a data type called Expr
1:
data Expr
= Literal Integer
| Var String
| Let String Expr Expr -- name, value, body
An expression is either a literal integer, a variable which holds an string name, or a let expression, which has a name, a value, and a body.
We can hold our variable bindings in a map from String
to Integer
:
type Bindings = Map String Integer
Now we can write a simple evaluation function for our language, eval
. Our evaluator takes an expression and returns a computation representing the variable bindings and the eventual result, an integer. We can implement eval
like this:
eval :: Expr -> Reader Bindings Integer
eval (Literal x) = pure x
eval (Var name) = do
env <- ask
case Map.lookup name env of
Just value -> pure value
Nothing -> error "Undefined Variable"
eval (Let name valueExpr bodyExpr) = do
env <- ask
value <- eval valueExpr
let modified = Map.insert name value env
local (const modified) (eval bodyExpr)
Let's step through each part:
- To evaluate a literal expression, just
pure
the literal value. - To evaluate a variable expression, we get the current environment with
ask
, then lookup the name in the environment, throwing an error if it's not there. - To evaluate a let expression, get the current environment; evaluate the value expression; create a new environment with the new binding; and finally, evaluate the body expression in the modified environment. That last part is exactly what
local
does, except it expects a modifying function as its first argument, so we just pass our updated environment as a constant function.
As you can hopefully see, using Reader
to model local scope is simple to understand and easy to implement. Additionally, the ReaderT
monad transformer can be used in a transformer stack to evaluate with better error handling and other effects.
Extending the technique
However, as I alluded to at the beginning of the post, as simple as this model is to understand, it's not really helpful in modeling variable bindings outside of a toy example like this, unless your language only has local scope and no mutation.
To see this, say we wanted to extend our tiny language with a define
expression that created a globally scoped binding. The syntax for this would be something like this, in Scheme style:
(define x 10)
x
which would evaluate to 10
. Say we add a Define
constructor to our Expr
type:
data Expr
= ...
| Define String Expr
where a Define
expression holds a name and a value expression. The problem is, when we go to interpret this expression, we don't have a body to evaluate locally like we did with a Let
expression. The body to evaluate is just implicitly the rest of the program, rather than in a Let
which makes its scoping explicit.
We might be able to deal with this problem by peeking ahead in the AST to find a Define
node, and when we do, evaluating the rest of the tree with the new environment, but this turns out to be pretty inelegant, and it can lead to even more difficulty when handling recursive or nested definitions. This is basically how definitions are evaluated in WYAS 2, which makes what should be a simple feature to implement in my view quite difficult to understand, (see the section "begin, define & evalBody").
While global definitions might be able to fit into our Reader
evaluation model, mutation is even more difficult, and as a result WYAS 2 skips adding mutation to the language. When interpreting mutation, as with definitions, we need to deal with the fact that mutation is a side effect, and the mutated state is implicitly in play for the rest of the program. For example, take the following Scheme code:
(define x 3)
(let ()
(set! x 4))
x
This should evaluate to 4
, but a naive Reader
implementation could give 3
since the set!
mutation occurs in a different scope than the reference to the variable x
.
What to do instead
If you want to interpret a language with global definitions and mutation, then, it's best not to use Reader
. Instead, we can manage a stack of environments, either explicitly or with the State
monad. With an environment stack:
- To create a local binding, we push a new environment onto the stack, then evaluate the body in this top environment.
- To lookup a binding, we traverse the stack searching for the first appearance of the name.
- To create a definition, we just modify the top environment on the stack, and continue with evaluation of the rest of the AST.
- To mutate a variable, we will likely need to change our
Bindings
definition to wrap values in our environment with a mutable reference likeIORef
orSTRef
. Otherwise, mutation works like a lookup where we change the value once it's found.
Concretely, one way to do this would be to define our Environment
type as a record containing the bindings, and possibly, a parent environment:
data Environment = Environment {
bindings :: Bindings,
parent :: Maybe Environment
}
With this model, by keeping track of the current environment, we can push onto the stack by creating a new environment where the current environment is the parent, and we pop the stack by setting the current environment to the parent. With this, we can add definitions to our language without dealing with the difficulty of peeking ahead in the tree. Also note that this approach is similar to the implementation of environments in Nystrom's Crafting Interpreters.
Conclusion
The usage of Reader
to implement Scheme environments in WYAS 2 is one of the main reasons why I have recently considered rewriting the tutorial. I think that Haskell is a great choice for implementing an interpreter, but its powerful abstractions must be chosen wisely, and in the case of Scheme evaluation, I think Reader
is the wrong choice.
That's not to say that the Reader
trick is entirely useless, but I think it's worth qualifying and understanding its implications. This also all just comes from my experience, and if you are an expert Haskeller who knows how to deal with these issues with using Reader
for variable bindings, I would love to hear from you.
Notes
This isn't necessarily how I would structure Expr
for a real Scheme interpreter, but just an example.