Chapter 4. Implementing let

let allows the extension of the environment, temporarily, to include new bindings of symbols to data. let was introduced in Section 2.6 but as a quick reminder, here it is in action:

> (let ((x 10)
>       (y 20))
>      (* x y))
200
> x
Error: no binding for x in PScm::Env
[download]

Of course the Environment that has been described so far is not extensible, so the first thing to do is to look at how we might change the environment package to allow extension.

4.1. The Environment

Remember the original environment implementation from Section 3.2, where we just created an object wrapper around a Perl hash? We can build on this idea, but we need to think a bit harder about what environment extension actually means. It would be a good idea to keep the environment extensions separate from what is already in the environment, so that they can be easily undone when the time comes. It's a really Bad Idea to just poke more key-value pairs into that hash; the cost of working out how to undo those changes could be prohibitive. Therefore, each extension should have its own object hash.

A useful distinction to make at this point is between any individual hash, and the environment as a whole. When the text refers to the environment as a whole It'll just say “the environment”, but when It's talking about a particular object hash component, It'll say “environment frame”, or just “frame”.

4.1.1. A Stack-based Environment

A simple extension then, and one which a number of programming languages do in fact implement, is a stack of environment frames. A new frame containing the new bindings is pushed on top of the old, and the LookUp() method starts at the top of the stack and works its way down until it either finds a binding for the argument symbol, or hits the bottom of the stack and signals an error. To restore the previous environment, the top frame is simply popped off the stack again.

Perl lists have push and pop operations, so we could easily use those with a simple array representing the stack. Alternatively we could keep a current “top of stack” index, and increment that to push, or decrement it to pop, something like:

sub push {
    my ($self, $frame) = @_;
    $self->{stack}[$self->{index}] = $frame;
    ++$self->{index};
}

sub pop {
    my ($self) = @_;
    --$self->{index};
    die "stack underflow" if $self->{index} < 0;
    return $self->{stack}[$self->{index}];
}

This has the minor advantage of not immediately loosing what was previously on the top of the stack after a pop().

The major drawback of a stack is that it is a linear structure, and extending the stack again necessarily obliterates what was previously there, see Figure 4.1. If we plan at a later stage to support closure, where functions hang on to their environments after control has left them, then a stack is obviously inadequate unless some potentially complex additional code protects and copies those vunerable environment frames.

Figure 4.1. Stacks Destroy Old Environment Frames
figure

4.1.2. A Linked List Environment

Enter the linked list. A linked list is just a collection of objects, hashes or whatever, where each one contains a reference to the previous one on the list. If an environment, rather than being a stack of frames, was a linked list of frames, then just as with a stack, PScm::Env::LookUp() need only walk through the chain until it finds the first (most local) occurrence of the symbol and return that. The advantages of a linked list are that many environment frames can share the same parent, and therefore creating a new extension frame does not destroy the previous extension, see Figure 4.2.

Figure 4.2. Linked Lists Don't Destroy Old Environment Frames
figure

As long as something continues to hold a reference to the old Frame2 in this figure, then it will not be garbage collected and remains as valid as any other environment.

Here's PScm::Env::LookUp() modified to use a linked list.

 25 sub LookUp {
 26     my ($self, $symbol) = @_;
 27 
 28     if (exists($self->{bindings}{ $symbol->value })) {
 29         return $self->{bindings}{ $symbol->value };
 30     } elsif ($self->{parent}) {
 31         return $self->{parent}->LookUp($symbol);
 32     } else {
 33         die "no binding for @{[$symbol->value]} ",
 34           "in @{[ref($self)]}\n";
 35     }
 36 }

The only change is on Lines 30–31 (in bold) where if LookUp() can't find the symbol in the current environment frame it looks in its parent frame, if it has one.

The PScm::Env::new() method is little changed, it additionally checks the argument class in case new() is being called as an object method (which it will be), and adds a parent field to the object, with an initial zero value meaning “no parent”.

  7 sub new {
  8     my ($class, %bindings) = @_;
  9 
 10     $class = ref($class) || $class;
 11     bless { bindings => {%bindings}, parent => 0 }, $class;
 12 }

Finally we need an Extend() method of PScm::Env that will create a new environment from an existing one by creating a new frame with the new bindings, and setting the new frame's parent to be the original environment.

 14 sub Extend {
 15     my ($self, $ra_symbols, $ra_values) = @_;
 16 
 17     my %bindings = ();
 18     my @names    = map { $_->value } @$ra_symbols;
 19     @bindings{@names} = map { $_->Eval($self) } @$ra_values;
 20     my $new = $self->new(%bindings);
 21     $new->{parent} = $self;
 22     return $new;
 23 }

Because the Extend() method will be used by let and other constructs later, it takes a reference to an array of symbols and a reference to an array of values, rather than the simple %initial hash that new() takes. On Line 18 It maps the symbols to a list of strings, then on Line 19 it uses those strings as keys in a hash mapping them to their equivalent values. On Line 20, creates a new environment with that hash. Finally on Line 21 it sets that new environment's parent to be the original environment $self and returns the new environment.

4.2. Global Environments have a Problem

Before proceeding to implement let, we need to address an incipient problem with the global environment.

To understand this, assume our interpreter already has let installed as PScm::SpecialForm::Let, and is about to evaluate the (+ a b) part of the expression

(let ((a 10)
      (b 20))
     (+ a b))

It will have already extended the environment with the bindings for a and b so the global environment at that point will look like Figure 4.3.

Figure 4.3. Environment during evaluation of example “let
figure

Now, consider what let might have had to do to extend the environment and might have to do to restore it again afterwards:

  1. Save the current value of $PScm::GlobalEnv;
  2. Call Extend() on $PScm::GlobalEnv to get a new one with a and b appropriately bound;
  3. Assign that new environment to $PScm::GlobalEnv;
  4. Call Eval() on the expression (+ a b) and save the result;
  5. Restore the previous value of $PScm::GlobalEnv;
  6. Return the result of evaluating the body.

There's something not quite right there, something ugly. We've made it the responsibility of let to restore that previous environment, and if we go down that road, all of the other operations that extend environments will similarly be required to restore the environment for their callers. There's another bit of ugliness too, the simple existence of a global variable. It's the only one in our application. Does it have to be there? What could replace it?

4.3. Environment Passing

You are asked to take a leap of faith here when it is suggested that a better mechanism is to pass the environment around between Eval() and Apply() within the interpreter. Just suppose that Eval() was given the current environment as argument along with the expression to evaluate. Furthermore suppose that it passed the environment to Apply(). Now we've already seen that Special Forms (of which let is one,) each have their own Apply() method, so how would let's Apply() look if the environment were passed in? It would:

  1. Call Extend() on the argument environment to get a new one with a and b appropriately bound;
  2. Return the result of calling Eval() on the expression (+ a b) with the new environment as argument.

Isn't that better!13

Because environments would be local (my) variables, and because PScm::Env::Extend() does not alter the original environment in any way, we can rely Perl's own garbage collection take care of old unwanted environments for us.

The changes to our interpreter to make this happen are in fact quite limited. Fist of all, because Eval() now expects an environment as argument, the top-level PScm::ReadEvalPrint() must create one and pass it in. Note that it is the same as the $PScm::GlobalEnv from our previous interpreter, with the addition of a binding for let.

 31 sub ReadEvalPrint {
 32     my ($infh, $outfh) = @_;
 33 
 34     $outfh ||= new FileHandle(">-");
 35     my $reader = new PScm::Read($infh);
 36     while (defined(my $expr = $reader->Read)) {
 37         my $result = $expr->Eval(
 38             new PScm::Env(
 39                 let => new PScm::SpecialForm::Let(),
 40                 '*' => new PScm::Primitive::Multiply(),
 41                 '-' => new PScm::Primitive::Subtract(),
 42                 if  => new PScm::SpecialForm::If(),
 43             )
 44         );
 45         $result->Print($outfh);
 46     }
 47 }

Now you should remember from Section 3.5 that there are three implementations of Eval(), one in PScm::Expr, one in PScm::Expr::Symbol and one in PScm::Expr::List. Each of these must deal with the extra environment argument they are now being passed.

The default Eval() method for literal atoms (strings, numbers and others) is functionally unchanged. It ignores any argument environment because literals evaluate to themselves.

 12 sub Eval {
 13     my ($self, $env) = @_;
 14     return $self;
 15 }

The Eval() method for symbols now uses the argument environment rather than a global one in which to lookup its value:

 69 package PScm::Expr::Symbol;
 70 use base qw(PScm::Expr::Atom);
 71 
 72 sub Eval {
 73     my ($self, $env) = @_;
 74     return $env->LookUp($self);
 75 }

The Eval() method for lists (expressions) is little changed either, it evaluates the operation in the current (argument) environment then calls the operation's Apply() method, passing the current environment as an additional argument.

 62 sub Eval {
 63     my ($self, $env) = @_;
 64     my $op = $self->first()->Eval($env);
 65     return $op->Apply($self->rest, $env);
 66 }

So Apply() must change too. As shown earlier, There is one Apply() method for all PScm::Primitive classes, which evaluates all of the arguments to the primitive operation then calls the operation's private _apply() method with its pre-evaluated arguments. That needs to change only to evaluate those arguments in the argument environment:

  7 sub Apply {
  8     my ($self, $form, $env) = @_;
  9 
 10     my @unevaluated_args = $form->value;
 11     my @evaluated_args = map { $_->Eval($env) } @unevaluated_args;
 12     return $self->_apply(@evaluated_args);
 13 }

Note particularly that there is no need to pass the environment to the private _apply() method: since all its arguments are already evaluated it has no need of an environment to evaluate anything in. Therefore the primitive multiply and subtract operations are unchanged from the previous version of the interpreter.

The Apply() method for special forms is separately implemented by each special form. In our previous interpreter there was only one special form: if, so let's take a look at how that has changed.

 28 package PScm::SpecialForm::If;
 29 
 30 use base qw(PScm::SpecialForm);
 31 
 32 sub Apply {
 33     my ($self, $form, $env) = @_;
 34 
 35     my ($condition, $true_branch, $false_branch) = $form->value;
 36 
 37     if ($condition->Eval($env)->isTrue) {
 38         return $true_branch->Eval($env);
 39     } else {
 40         return $false_branch->Eval($env);
 41     }
 42 }
 43 
 44 1;

Pretty simple, The only change is that PScm::SpecialForm::If::Apply() passes its additional argument $env to each call to Eval().

4.4. let Itself

Now we're done, we can look at that implementation of PScm::SpecialForm::Let::Apply() in our environment passing interpreter.

Remember that let has the general form:

(let (‹binding› ...) ‹expression›)

where ‹binding› is:

(‹symbol› ‹expression›)

So, with that in mind, here's the Apply() method.

  8 package PScm::SpecialForm::Let;
  9 
 10 use base qw(PScm::SpecialForm);
 11 
 12 sub Apply {
 13     my ($self, $form, $env) = @_;
 14 
 15     my ($bindings, $body) = $form->value;
 16     my (@symbols, @values);
 17 
 18     foreach my $binding ($bindings->value) {
 19         my ($symbol, $value) = $binding->value;
 20         push @symbols, $symbol;
 21         push @values,  $value;
 22     }
 23 
 24     return $body->Eval($env->Extend(\@symbols, \@values));
 25 }

It starts off at Line 15 extracting the bindings and body from the argument form. Then it sets up two empty lists to collect the symbols and the values (Line 16) separately. Then in the loop on Lines 18–22 it iterates over each binding collecting the unevaluated symbol in one list and the evaluated argument in the other. Finally on Line 24 it calls the body's Eval() method with an extended environment where those symbols are bound to those values.

Line 24 encapsulates our new simple definition for let quite concisely. The environment created by Extend() is passed directly to Eval() and the result of calling Eval() is returned directly.

4.5. Summary

In order to get let working easily we had to make two changes to the original implementation. Firstly adding an Extend method to the PScm::Env class to create new environments, and secondly altering the various Eval() and Apply() methods to pass the environment as argument rather than using a global environment. Having done that, the actual implementation of let was trivial.

4.6. Tests

Rather than adding more tests to t/PScm.t, There's a new t/PScm_Let.t which you can see in Listing 11. It adds two tests, the first just tests that a value bound by a let expression is available in the body of the let, and the second proves that the body of the let can be an arbitrary expression.

Listing 11. t/PScm_Let.t
  1 use strict;
  2 use warnings;
  3 use Test::More;
  4 use lib './t/lib';
  5 use PScm::Test tests => 3;
  6 
  7 BEGIN { use_ok('PScm') }
  8 
  9 eval_ok('(let ((x 2)) x)', '2', 'simple let');
 10 
 11 eval_ok(<<EOF, '20', 'conditional evaluation');
 12 (let ((a 00)
 13       (b 10)
 14       (c 20))
 15      (if a b c))
 16 EOF
 17 
 18 # vim: ft=perl
Full source code for this version of the interpreter is available at
http://billhails.net/Book/releases/PScm-0.0.1.tgz
Last updated Sun Mar 14 10:43:08 2010 UST
13

If the improvement to the design of let does not warrant such an apparently drastic change, it should be noted that closure is much easier to implement with this model and could be egregiously difficult to implement otherwise.