Chapter 1. Introduction

Why would anyone want to write a Scheme interpreter in Perl?

- Felix Lee

Madness.

- Larry Wall

By the end of this book you should have a thorough understanding of the inner workings of a programming language interpreter. The source code is presented in full, and several iterations add more features until it could be considered pretty complete. The interpreter is written to be as easy to understand as possible; it has no clever optimizations that might obscure the basic ideas, and the code and the ideas will be described to the best of my ability without any unexplained technical jargon. It is however assumed that you have a good working knowledge of Perl (Perl5), including its object-oriented features.

The final implementation will demonstrate:

Having said that, time and space is not wasted fleshing the interpreter out with numerous cut'n'paste system interfaces, i/o or even much basic arithmetic (the final implementation has only multiplication, addition and subtraction—enough for the tests and examples to work,) but by then it should be a trivial matter for anyone to add those themselves if they feel so inclined. Another point worth mentioning up front is that no claims are made that this is in any way a production-quality, or even an efficient implementation. It is just meant to be easy to understand.

Putting it another way, if you've come here looking for an off-the shelf scheme-like interpreter that you can use, you've come to the wrong place: there are many and better freely available implementations on the net. On the other hand if you're more interested in how such interpreters might work, I'd like to think that you might find what you're looking for here.

1.1. Why Perl?

My motivation for writing this book is that I have always been fascinated by the fact that programming languages are just programs, but I found it very difficult in the past to work out what was actually going on in the handful of “public domain”1 implementations of programming languages available at the time. The temptation always seemed to be there for the authors to add all sorts of bells and whistles to their pet project until the core ideas became obscured and obfuscated. The fact that they were invariably implemented in the low-level system language C didn't help matters. It was only when I found out that the easiest implementations of Scheme to understand were written in Scheme itself that I made any real progress with that particular language. However implementing an interpreter in terms of itself (so-called “meta-circular evaluation”) easily leads to confusion, and it struck me that Perl, with its very high-level constructs and high signal to noise ratio is the perfect vehicle to demonstrate the programming language concepts that I've so painfully gleaned through time, without any incestuous meta-circular issues to deal with.

Another reason for my wanting to write about programming languages is the wonderful gestalt inherant in their construction: how such an apparently small amount of code could achieve so much. This is of course due to the deeply recursive nature of their design, and this small implementation in Perl attempts to be as concise and recursive as possible.

1.2. Why Scheme?

Scheme is one of the younger members of the Lisp family of programming languages. LISP stands for “LISt Processing”2, and this is an appropriate name since the fundamental data type in these languages is the list.

The main reason for choosing Scheme to demonstrate the internals of an interpreter is that Scheme is a very simple language, and at the same time an astonishingly powerful one. An analogy might be that if C is the “chess” of programming languages, then Scheme is more like “go”. The official standard for Scheme, the “Revised(6) Report on the Algorithmic Language Scheme” or R6RS [r6rs] as it is known, has this to say:

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.

Whether or not one agrees with that, and it's hard to argue, it strongly suggests that such a consistent language might be pretty straightforward to implement.

Another interesting feature of Scheme and the Lisp family of languages is that the list data that they work with is also used to construct the internal representation of the programs themselves. Therefore Scheme programs can directly manipulate their own syntax trees. This makes the definition of macros (syntactic extensions) particularly easy from within the language itself, without recourse to any separate preprocessing stage. Finally, another good reason for choosing Scheme is that it is extremely easy to parse, as we shall see.

1.3. References

I provide and refer to a select bibliography. Almost all of the concepts in this book are well known in academic circles and it would be disingenuous of me to try to pass them off as my own. The bibliography should provide you with a small collection of useful jumping off points should you wish to investigate any of these topics further.

1.4. Typography

All of the source code listings and extracts from the source are shown in fixed-width type with line numbers, and are pulled directly from the code of the working interpreter. Furthermore, when displaying a newer version of an individual method or package, the differences bethween that version and the previous one are calculated automatically and displayed in bold. Package names are displayed Like::This and methods like this(). Scheme code looks (like this).

Other in-line code, such as Scheme and occasionally Perl examples are unfortunately not so rigorously constrained. The possibility exists that even though I have manually tested all of those examples there could be an error or two in there, for which I can only apologise.

1.5. A Note on the Interpreter Versions

You won't find this interpreter on CPAN. The reason for this is that each version of the interpreter, while building on the features of previous versions, is a thing of itself and exists to demonstrate specific pedagogical points. I couldn't publish only the final version to CPAN since it is fit for no purpose other than this book and too complex for casual perousal without having first digested the earlier versions.

However each version is available online for you to download and play with, they are linked to from the text at the end of each chapter.

Last updated Sun Mar 14 10:43:08 2010 UST
1

yes, I'm older than your grandfather

2

Or “Lots of Irritating Single Parentheses”.