The question arises as to why the implementation of the
define
special form,
described back in
Section 2.4
has beed deferred for so long,
when it would have
made so much of the previous discussion easier, particularily the
Scheme examples. Well there are good reasons. Consider
what the language so far has got.
So far, the language is fairly complete. It is however a purely functional language, in the sense that its expressions are like mathematical functions: the value of a variable is fixed for the duration of the expression. There is no sequential execution (though there is recursion) and most importantly there are no side effects. That means that one part of a program cannot affect the execution of another part that it does not contain, except by returning values that get passed in as arguments to that other part.
While this might seem to be a limitation of the language, it does in fact have some very important advantages. Essentially it makes large-scale expressions in the language very simple to analyse: since one part of an expression cannot affect another independant part, different components of a program (including the current environment) could be fed to separate processes, perhaps even separate machines on a network.
For example suppose we had a networked version of the
language. There would be a pool of processes available to do work, and
each process could delegate sub-parts of their work to other processes. A
simple rule might be that for parallel binding, e.g. let
bindings
and the arguments to closures and primitive functions, the evaluation
of the arguments could be performed in parallel by “outsourcing”
the evaluation of
each subexpression to a different process. For example consider the
following fragment:
... (let ((a (func1 x)) (b (func2 y)) (c (func3 z))) (func4 (func5 a) (func6 b) (func7 c))) ...
The let
could elect to send the subexpression (func1 x)
(plus the environment where func1
and x
have
a value) to one process, and the subexpression (func2 y)
to another. While those two expressions are being evaluated the let
could get on with evaluating (func3 z)
itself. Then when it
had finished that evaluation it would collect the result of the other two
evaluations and proceed to evaluate its body. Similarily the evaluation
of the body (the call to func4
) could outsource
the evaluation of the arguments (func5 a)
and (func6
b)
and get on with evaluating (func7 c)
, collecting
the other results when it finished, and then proceeding to evaluate
the func4
call with those evaluated arguments. It probably
shouldn't outsource something a simple as variable lookup.
The implementation of this networked programming language is left to you. Some sort of central ticketing server would be needed, with a queue where requestors could post their requests in exchange for a ticket, and a pool where clients could post their results so that they need not wait for the requestor to get back to them... quite an interesting project.
Beyond this point that kind of application becomes nearly impossible because we are about to introduce side effects, in particular variable assignment and, to make that useful, sequences of expressions. That's just the proper name for something we're all very familiar with—one statement following another.
In fact, the primary difference between a statement and an expression is that a statement is executed purely for its side effects. There are only a couple of different side effects that we need to consider. The first is variable assignment, that is to say the changing of the existing value of a variable binding. The second is definition, the creation of new entries in the current environment.
This version of the interpreter will only add variable assignment
(and sequences,) definition is left for later. Variable assignment
should be a special form, so that the variable being assigned to does
not get evaluated. In Scheme, and PScheme, the symbol bound to the
variable assignment special form is set!
, the exclaimation
point signifying that this operation has a side effect that might
upset an otherwise purely functional program25.
The syntax is simple, but there is a gotcha:
(set! a 10) Error: no binding for a in PScm::Env
This may sound unnecessarily picky, but for variable assignment
to work, there must already be a binding in place that the assignment
can affect. The reasoning is that the variable being assigned to
need not be in the current environment frame: it must be searched
for, and if we allow set!
to create new variables then
they would probably be installed in the current top frame. So the
scope of a variable that was set!
would depend on
whether the variable already existed or not, which is inconsistent
at best.
Anyway, this works:
> (let* ((a 5) > (dummy (set! a 10)) > a)) 10
This is a bit contrived, we use a dummy variable to allow the
set!
expression to be evaluated before the body of the
let*
returns the new value of a
.
However it should be clear from the example that the set!
did
take effect.
So how do we go about implementing set!
? Well as luck
would have it, we already have a method for changing existing bindings in
an environment, we have the Assign()
method that was developed
for the letrec
special form in a previous incarnation. It has
precisely the right semantics too (what a co-incidence!) It searches
through the chain of environment frames until it finds an appropriate
binding, assigning the value if it does and die()
-ing if it
doesn't. Here it is again:
59 sub Assign { 60 my ($self, $symbol, $value) = @_; 61 62 if (defined(my $ref = $self->_lookup_ref($symbol))) { 63 $$ref = $value; 64 } else { 65 die "no binding for @{[$symbol->value]}", 66 " in @{[ref($self)]}\n"; 67 } 68 }
So all we have to do is wire it up. The process of creating new
special forms should be familiar by now. Firstly we subclass
PScm::SpecialForm and give our new class an Apply()
method. The
new class in this case is PScm::SpecialForm::Set. Its
Apply()
method as usual takes a form and the current environment
as arguments. In this case the form should contain a symbol and a value
expression. This Apply()
will evaluate the expression and call the
environment's Assign()
method with the symbol and resulting
value as arguments:
138 package PScm::SpecialForm::Set; 139 140 use base qw(PScm::SpecialForm); 141 142 sub Apply { 143 my ($self, $form, $env) = @_; 144 my ($symbol, $expr) = $form->value; 145 $env->Assign($symbol, $expr->Eval($env)); 146 }
And that's all there is to it, apart from adding a
PScm::SpecialForm::Set object to the initial
environment, bound to the symbol set!
.
Sequences are another fairly trivial addition to the language. Rather than a single expression, a sequence contains one or more expressions. Each expression is evaluated in order, and the value of the last expression is the value of the sequence. This is such a common thing in many other languages (such as Perl) that it goes without notice, but thinking about it, sequences only become relevant and useful in the presence of side effects. Since only the value of the last expression is returned, preceding expressions can only affect the computation if they have side effects.
The keyword introducing a sequence in PScm is begin.
begin
takes a list of one or more expressions, evaluates
each one, and returns the value of the last expression. It functions
something like a block in Perl, except that a Perl block also encloses
a variable scope like let
does, but begin
does not imply
any scope.
(begin <expression1> <expression2> ...)
With begin
, we can write that rather awkward
set!
example from the previous section a lot more
clearly:
> (let ((a 5)) > (begin > (set! a 10) > a)) 10
begin
could, in fact, be implemented as a function,
provided that functions are guaranteed to evaluate their arguments in
left-to-right order, as this implementation does. However it is
safer not to make that assumption (remember the networked language
where evaluation was envisaged in parallel,) so begin
is better implemented as a special form which can guarantee that
left to right behaviour.
As might be imagined the code is quite trivial: evaluate each
component of the form and return the last value. We pick as an
initial value the empty list, so that if the body of the begin
is empty, that is the result. As usual we subclass PScm::SpecialForm,
in this case to PScm::SpecialForm::Begin, and give the new class an
Apply()
method:
149 package PScm::SpecialForm::Begin; 150 151 use base qw(PScm::SpecialForm); 152 153 sub Apply { 154 my ($self, $form, $env) = @_; 155 my (@expressions) = $form->value; 156 157 my $ret = new PScm::Expr::List(); 158 159 while (@expressions) { 160 $ret = shift(@expressions)->Eval($env); 161 } 162 163 return $ret; 164 } 165 166 1;
On
line 155 it extracts the
expressions from the argument $form
.
Then on
line 157
it initialises the return
value $ret
to an initial value. On
lines 159–161
it loops over each expression, evaluating it in the
current environment, and assigning the result to $ret
,
replacing the previous value. Lastly on
line 163
it returns the final value.
In this section we've seen variable assignment added to the language, but also taken some time to consider some drawbacks of that feature. We've also looked at how sequences become useful in the presence of variable assignment. In fact sequences serve no purpose without side-effects and side-effects are difficult to do without sequences.
As usual we need to add our new forms to the interpreter by
binding them in the initial environment. Here's ReadEvalPrint()
with the additional bindings.
31 sub ReadEvalPrint { 32 my ($infh, $outfh) = @_; 33 34 $outfh ||= new FileHandle(">-"); 35 my $reader = new PScm::Read($infh); 36 while (defined(my $expr = $reader->Read)) { 37 my $result = $expr->Eval( 38 new PScm::Env( 39 let => new PScm::SpecialForm::Let(), 40 '*' => new PScm::Primitive::Multiply(), 41 '-' => new PScm::Primitive::Subtract(), 42 if => new PScm::SpecialForm::If(), 43 lambda => new PScm::SpecialForm::Lambda(), 44 list => new PScm::Primitive::List(), 45 car => new PScm::Primitive::Car(), 46 cdr => new PScm::Primitive::Cdr(), 47 cons => new PScm::Primitive::Cons(), 48 letrec => new PScm::SpecialForm::LetRec(), 49 'let*' => new PScm::SpecialForm::LetStar(), 50 eval => new PScm::SpecialForm::Eval(), 51 macro => new PScm::SpecialForm::Macro(), 52 quote => new PScm::SpecialForm::Quote(), 53 'set!' => new PScm::SpecialForm::Set(), 54 begin => new PScm::SpecialForm::Begin(), 55 ) 56 ); 57 $result->Print($outfh); 58 } 59 }
It's worth pointing out a major difference between PScheme and standard
scheme implementations here. In standard scheme, function
bodies and the bodies of let
expressions and their variants
are all implicit sequences.
So in fact in a real scheme implementation
you could just write:
> (let ((a 5)) > (set! a 10) > a) 10
In PScheme function and let
bodies are single statements and require an explicit begin
to create a sequence,
because I wanted to keep the
distinction between single expressions and sequences clear to the
reader.
t/PScm_SideEffects.t
1 use strict; 2 use warnings; 3 use Test::More; 4 use lib 't/lib'; 5 use PScm::Test tests => 2; 6 7 BEGIN { use_ok('PScm') } 8 9 eval_ok(<<EOF, '2', 'begin and assignment'); 10 (let ((a 1)) 11 (begin (set! a 2) 12 a)) 13 EOF 14 15 # vim: ft=perl
A test of set!
and begin
can be seen in
Listing 20. The test binds a
to 1
in a let
, then in the body of the let
it performs a begin
which set!
s
a
to 2
then returns the new value of
a
.
Full source code for this version of the interpreter is available athttp://billhails.net/Book/releases/PScm-0.0.7.tgz
define
, for some reason.