Code Generation

Code generation is a lot like compilation, except that the source isn't a real programming language.

A Slightly Non-Standard Application

Just to get the ball rolling, here's an application of code generation that doesn't fit the standard Enterprise Application mold: here I'm using code generation to build a class heirarchy for abstract syntax trees.

The idea is pretty simple. An abstract syntax tree is a tree containing the results of parsing a programming language. Nodes of the tree could be function calls, operator application or control flow constructs, and the leaves of the tree would be comstants and variables. Of course these trees cannot be constructed arbitrarily, it doesn't often make sense to add an integer to a record for example, so the constructors for the nodes need to declare (or, in an untyped language, type-check) their arguments. Each node will furthermore need accessors and probably an accept() method for the Visitor pattern, so that is a substantial amount of boiler-plate code for something that is easily describeable at a very high level.

Here's a snapshot of the AST definition from F:

abstract      Expr()
Expr          Char(scalar value) # [id]
Expr          Number(scalar value)

Expr          WildCard()

abstract Expr Symbol() # [id]
Symbol        DollarName(scalar name)
Symbol        Name(scalar name)
Symbol        TypeName(scalar name)

abstract Expr List()
List          Cons(Expr car, List cdr)
List          Nil()

abstract Expr Sequence()
Sequence      Statement(Expr car, Sequence cdr)
Sequence      EmptySequence()

abstract Expr Constructors()
Constructors  Constructor(Expr car, Constructors cdr)
Constructors  EmptyConstructor()

Expr          Typedef(Expr type, Constructors constructors)
Expr          Type(Expr type)

Expr          Apply(Expr op, List args)
Expr          End(Expr expr)
Expr          Define(Symbol name, Expr value)
Expr          Declare(Name name)
Expr          Lambda(List args, Sequence body)
Expr          LambdaSwitch(List lambdas)
Expr          Try(Apply attempt, LambdaSwitch catches)
Expr          Throw(List exprs)
Expr          String(scalar value)

Expr          LambdaLet(List formals, List actuals, Sequence Body)
Expr          LambdaSwitchLet(List actuals, List lambdas)

# these next are rewritten by the RewritingVisitor
Expr          And(Expr a, Expr b)
Expr          Or(Expr a, Expr b)
Expr          Not(Expr)
Expr          If(Expr test, Sequence consequent, Sequence alternative)
Expr          Then(Expr first, Expr second)
Expr          Pairs(List list)
Expr          Switch(List exprs, LambdaSwitch lambdas)

This is a high level but complete description of the required classes, all of the details of implementation are merely repetitive, which makes it a perfect candidate for code generation.

So, given this description, and a couple of command-line options to cover class prefixes etc., my treebuilder (tbd) will generate all of the classes required. For example, here's the generated class definition for Expr If(Expr test, Sequence consequent, Sequence alternative):

# WARNING: Automatically generated from ast.tb on Sat May 29 18:51:18 2010 GMT. DO NOT EDIT! YOUR EDITS WILL BE LOST!

package Lang::FN::AST::If;
use strict;
use warnings;
use base qw(Lang::FN::AST::Expr);



=head1 NAME

Lang::FN::AST::If - a tree node, a type of Lang::FN::AST::Expr.

=head1 SYNOPSIS

  use Lang::FN::AST::API;
  my $if = If($test, $consequent, $alternative);

  my $test = $if->getTest();
  my $consequent = $if->getConsequent();
  my $alternative = $if->getAlternative();
  $if->accept(new Lang::FN::AST::Visitor());

=head1 METHODS

Individual Lang::FN::AST::If methods are described below.

=head2 new

  my $If = new Lang::FN::AST::If($test, $consequent, $alternative);

or better:

  use Lang::FN::AST::API;
  my $If = If($test, $consequent, $alternative);

The C<If()> subroutine is provided by
C<Lang::FN::AST::API> to make
it easier to create instances of C<Lang::FN::AST::If>.
In either case
C<$test>
must be a
C<Lang::FN::AST::Expr>, C<$consequent>
must be a
C<Lang::FN::AST::Sequence> and C<$alternative>
must be a
C<Lang::FN::AST::Sequence>.


=cut


sub new {
    die "wrong number of arguments to Lang::FN::AST::If::new(\$test, \$consequent, \$alternative)\n"
        unless @_ == 4;

    my ($class, $test, $consequent, $alternative) = @_;

    die "argument test to If is not a Lang::FN::AST::Expr, it is a ",
        (ref($test) || 'scalar'), "\n"
        unless defined($test)
               && $test->isa('Lang::FN::AST::Expr');
    die "argument consequent to If is not a Lang::FN::AST::Sequence, it is a ",
        (ref($consequent) || 'scalar'), "\n"
        unless defined($consequent)
               && $consequent->isa('Lang::FN::AST::Sequence');
    die "argument alternative to If is not a Lang::FN::AST::Sequence, it is a ",
        (ref($alternative) || 'scalar'), "\n"
        unless defined($alternative)
               && $alternative->isa('Lang::FN::AST::Sequence');

    return bless {
        _test => $test,
        _consequent => $consequent,
        _alternative => $alternative,
    }, $class;
}



=head2 getTest

  my $test = $If->getTest();

gets the value of the C<test> field,
which is a C<Lang::FN::AST::Expr>.

=cut

sub getTest {
    my ($self) = @_;
    return $self->{_test};
}

=head2 getConsequent

  my $consequent = $If->getConsequent();

gets the value of the C<consequent> field,
which is a C<Lang::FN::AST::Sequence>.

=cut

sub getConsequent {
    my ($self) = @_;
    return $self->{_consequent};
}

=head2 getAlternative

  my $alternative = $If->getAlternative();

gets the value of the C<alternative> field,
which is a C<Lang::FN::AST::Sequence>.

=cut

sub getAlternative {
    my ($self) = @_;
    return $self->{_alternative};
}

=head2 accept

  $If->accept(new Lang::FN::AST::Visitor());

Accepts a C<Lang::FN::AST::Visitor> and calls its C<visitIf>
method, passing it C<$self> (i.e. a C<Lang::FN::AST::If>) as argument.

=cut

sub accept {
    my ($self, $visitor) = @_;
    return $visitor->visitIf($self);
}

=head2 eq

   if ($If->eq($other)) {
       ...
   }

Returns true if the argument C<$other> is recursively equal to C<$self>.

=cut

sub eq {
    my ($self, $other) = @_;
    if (ref($other) eq 'Lang::FN::AST::If') {
        return 1
            && $self->getTest->eq($other->getTest)
            && $self->getConsequent->eq($other->getConsequent)
            && $self->getAlternative->eq($other->getAlternative)
    } else {
        return 0;
    }
}



=head2 isIf

Returns true.

=cut

sub isIf { 1 }



=head1 AUTHOR

Automatically generated from ast.tb on Sat May 29 18:51:18 2010 GMT.

=head1 SEE ALSO

The other generated node classes:
L<Lang::FN::AST::And>,
L<Lang::FN::AST::Apply>,
L<Lang::FN::AST::Char>,
L<Lang::FN::AST::Cons>,
L<Lang::FN::AST::Constructor>,
L<Lang::FN::AST::Constructors>,
L<Lang::FN::AST::Declare>,
L<Lang::FN::AST::Define>,
L<Lang::FN::AST::DollarName>,
L<Lang::FN::AST::EmptyConstructor>,
L<Lang::FN::AST::EmptySequence>,
L<Lang::FN::AST::End>,
L<Lang::FN::AST::Expr>,
L<Lang::FN::AST::Lambda>,
L<Lang::FN::AST::LambdaLet>,
L<Lang::FN::AST::LambdaSwitch>,
L<Lang::FN::AST::LambdaSwitchLet>,
L<Lang::FN::AST::List>,
L<Lang::FN::AST::Name>,
L<Lang::FN::AST::Nil>,
L<Lang::FN::AST::Not>,
L<Lang::FN::AST::Number>,
L<Lang::FN::AST::Or>,
L<Lang::FN::AST::Pairs>,
L<Lang::FN::AST::Sequence>,
L<Lang::FN::AST::Statement>,
L<Lang::FN::AST::String>,
L<Lang::FN::AST::Switch>,
L<Lang::FN::AST::Symbol>,
L<Lang::FN::AST::Then>,
L<Lang::FN::AST::Throw>,
L<Lang::FN::AST::Try>,
L<Lang::FN::AST::Type>,
L<Lang::FN::AST::TypeName>,
L<Lang::FN::AST::Typedef>,
L<Lang::FN::AST::WildCard>,
an abstract visitor that can be subclassed to walk them:
L<Lang::FN::AST::Visitor>,
and the API that makes them available: L<Lang::FN::AST::API>.

=cut

1;

The point of all this is to be able to use these constructors directly in a yapp or similar grammar file. Here's a snippet of these generated classes in action:

...
expr      :   int
          |   char
          |   string
...
          |   END expr %prec L_5 { End($_[2]) }
          |   expr '=' expr      { Apply(Name('='), Cons($_[1], Cons($_[3], Nil))) }
          |   expr THEN expr     { Then($_[1], $_[3]) }
          |   expr '&&' expr     { And($_[1], $_[3]) }
          |   expr '||' expr     { Or($_[1], $_[3]) }
          |   '!' expr           { Not($_[2]) }
          |   '(' expr ')'       { $_[2] }
...
          ;

ifexpr    :   IF '(' expr ')' block ELSE alternative
                            { If($_[3], $_[5], $_[7]) }
          ;

alternative : block
            | ifexpr
            ;

One of the visitors over the resulting tree is a rewriting visitor that converts some abstract syntax to simpler forms so that there are fewer constructs that need to be dealt with later on. Here's the visitor routine that converts If abstract syntax to a function call:

sub visitIf {
    my ($self, $if) = @_;
    my $test = $if->getTest->accept($self);
    my $consequent = $if->getConsequent->accept($self);
    my $alternative = $if->getAlternative->accept($self);
    Apply(
        LambdaSwitch(
            Cons(
                Lambda(
                    Cons(TypeName('True'), Nil),
                    Statement(
                        End($consequent->getCar),
                        $consequent->getCdr
                    )
                ),
                Cons(
                    Lambda(
                        Cons(TypeName('False'), Nil),
                        $alternative
                    ),
                    Nil
                )
            )
        ),
        Cons($if->getTest->accept($self), Nil)
    )
}