% Copyright (C) 1989, Digital Equipment Corporation
% All rights reserved.
% See the file COPYRIGHT for a full description.
%
% Last modified on Mon Dec 14 14:00:26 PST 1992 by kalsow
%      modified on Tue Feb 18 13:23:15 PST 1992 by muller


\chapter{Internals}

This section contains a brief introduction to the internal structure
of the compiler and runtime system.  This introduction is neither
comprehensive nor tutorial; it is merely intended as a stepping stone
for the courageous.

\section{A tour of the compiler}
%===============================

                    \index{intermediate code}
                    \index{C intermediate code@{\tt C} intermediate code}
The compiler has undergone much evolution.  It started as a project to
build a simple and easy to maintain compiler.  Somewhere along the way
we decided to compile \modula. Much later we decided to generate C.
In hindsight, \modula was a good choice, C was at best mediocre.

                    \index{compiler!organization}
The initial observation was that most compilers' data structures were
visible and complex.  This situation makes it necessary to understand
a compiler in its entirety before attempting non-trivial enhancements
or bug fixes.  By keeping most of the compiler's primary data
structures hidden behind opaque interfaces, we hoped to avoid this
pitfall.  So far, bugs have been easy to find.  During early
development, it was relatively easy to track the weekly language
changes.

                    \index{objects, within compiler}
The compiler is decomposed by language feature rather than the more
traditional compiler passes.  We attempted to confine each language
feature to a single module.  For example, the parsing, name binding,
type checking and code production for each statement is in its own
module.  This separation means that only the \verb|CaseStmt| module
needs to know what data structures exist to implement
\verb|CASE| statements.
Other parts of the compiler need only know that the \verb|CASE|
statement is a statement.  This fact is captured by the object subtype
hierarchy.  A \verb|CaseStmt.T| is a subtype of a \verb|Stmt.T|.

The main object types within the compiler are: values, statements,
expressions, and types.  ``Values'' is a misnomer; ``bindings'' would
be better.  This object class include anything that can be named:
modules, procedures, variables, exceptions, constants, types,
enumeration elements, record fields, methods, and procedure formals.
Statements include all of the \modula statements.  Expressions include
all the \modula expression forms that have a special syntax.  And
finally, types include the \modula types.

The compiler retains the traditional separation
of input streams, scanner, symbol table, and output stream.

The compilation process retains the usual phases.  Symbols are scanned
as needed by the parser.  A recursive descent parser reads the entire
source and builds the internal syntax tree.  All remaining passes
simply add decorations to this tree.  The next phase binds all
identifiers to values in scopes.
\modula allows arbitrary forward references so it is necessary
to accumulate all names within a scope before binding any identifiers
to values.  The next phase divides the types into structurally
equivalent classes.  This phase actually occurs in two steps.  First,
the types are divided into classes such that each class will have a
unique C representation.  Then, those classes are refined into what
\modula defines as structurally equivalent types.  After the types
have been partitioned, the entire tree is checked for type errors.
Finally, the C code is emitted.
C's requirement that declarations precede uses means
that the code is generated in several passes.
First, the types are generated during type checking.
Then, the procedure headers are produced.
And finally, the procedure bodies are generated.

The compiler implementation is in the \verb|compiler| directory.
Within that directory the following directories exist:
\begin{quote}
\begin{tabular}{ll}
      builtinOps   & \verb|ABS|, \verb|ADR|, \verb|BITSIZE|, ...  \\
      builtinTypes & \verb|INTEGER|, \verb|CHAR|, \verb|REFANY|, ...  \\
      builtinWord  & \verb|Word.And|, \verb|Word.Or|, ... \\
      exprs        & \verb|+|, \verb|-|, \verb|[]|, \verb|^|,
                        \verb|AND|, \verb|OR|, ...  \\
      misc         & main program, scanner, symbol tables, ...  \\
      stmts        & \verb|:=|, \verb|IF|, \verb|TRY|, \verb|WHILE|, ... \\
      types        & \verb|ARRAY|, \verb|BITS FOR|, \verb|RECORD|, ...  \\
      values       & \verb|MODULE|, \verb|PROCEDURE|, \verb|VAR|, ...  \\
\end{tabular}
\end{quote}

\section{A tour of the runtime}
%==============================

The runtime itself implements the garbage collector,
\modula startup code and a few miscellaneous functions.  The runtime
exists in the \verb|libm3/runtime| directory.

                     \index{M3Runtime.h@{\tt M3Runtime.h}}
                     \index{M3Machine.h@{\tt M3Machine.h}}
The interface between the compiler and runtime system is embodied (and
very sparsely documented) in \verb|M3Runtime.h|, \verb|M3Machine.h|
(an architecture-dependent file) and \verb|M3RuntimeDecls.h|.  Every C
file generated by the compiler includes these files.

                      \index{garbage collection!copying}
                      \index{garbage collection}
The allocator and garbage collector are based on Joel Bartlett's
``mostly copying collector''.  The best description of his collector
is in \cite{m3-Bar88}.  Since that paper, we've made a few
modifications to support a growing heap and to use extra information
that the \modula compiler generates.

                 \index{exceptions!setjmp, longjmp@{\tt setjmp}, {\tt longjmp}}
                 \index{setjmp@{\tt setjmp}}
                 \index{longjmp@{\tt longjmp}}
Exceptions are implemented with \verb|setjmp| and \verb|longjmp|.  The
jump buffers and scope descriptors are chained together to form a
stack.  The head of the chain is kept in \verb|ThreadSupport.handlers|.
There is a distinct chain for each thread.  When an exception is
raised, the chain is searched.  If a handler for the exception is
found, the exception is allowed to unwind the stack, otherwise a
runtime error is signaled.  To unwind the stack, a \verb|longjmp| is
done to the first handler on the stack.  It does whatever cleanup is
necessary and passes control on up the stack to the next handler until
the exception is actually handled.

                      \index{typecells}
Reference types are represented at runtime by a ``typecell''.  Due to
separate compilation, opaque types and revelations, it is not possible
to fully initialize typecells at compile time.  Typecell
initialization is finished at link time.  A typecell contains a type's
typecode, a pointer to its parent typecell, the size of the types
referent and method list if any, the type's brand, the number of open
array dimensions, the type's fingerprint, and procedures to initialize
the typecell, initialize new instances of the type, print instances of
the type and trace the type for garbage collection.

                       \index{linking}
                       \index{typecodes}
                       \index{type brands}
Link time type elaboration occurs in several steps.  First, all types
are registered.  That is, a global array that points to all typecells
is built.  Next, the runtime verifies that all opaque types have been
given concrete representations.  Then, the initialization of typecells
is finished.  Then, all types with the same brand and fingerprint are
identified with the same typecode.  Finally, a check is made to ensure
that distinct types have distinct brands.

                       \index{initialization}
                       \index{module initialization}
At the beginning of the execution of the program, all global variables
are initialized, and the main bodies of the modules are invoked.  The
skeletal code that ensures that every module is initialized is
generated by the linker part of the driver.

Other parts of the runtime, such as threads, are actually implemented
in the base library.

                        \index{threads}
                        \index{threads!preemption}
                        \index{atomicity}
Thread switching is implemented with \verb|setjmp| and \verb|longjmp|.
A timer interrupt (signal) is used to cause preemptive thread
switching.  The global variable \verb|ThreadSupport.self| points to the
currently running thread.  The integer \verb|ThreadSupport.inCritical| is
used by the runtime to prevent thread switching during garbage
collection and other ``atomic'' runtime operations.

\section{Porting to another machine}
%===================================

                       \index{porting}
                       \index{machine description}
Anyone who is interested in porting this compiler is encouraged!  We
would like to know how it goes.  The primary concerns when doing a
port will be the size and alignment constraints of the target machine
and the runtime.  We tried to avoid suspicious C constructs, but we
doubt that we were completely successful.

The directions in this section are somewhat sparse. We tried to make
the installation of \srcmodula smooth, but it is another story to make
the development of ports smooth. Please bear with us and tell us what
we can do to improve this section.

If you want to a port to an unsupported system you should:
\begin{itemize}
\item get the {\tt compiler} and {\tt driver} source archives (in
  addition to {\tt m3make} that came with the boot archive and {\tt
  libm3} which you had to install anyway).
\item decide on the name of the new architecture; in the rest, we
  assume that it is {\it new}
\item describe the target machine for the compiler
\item implement the machine-specific part of the base libraries for
  the new machine
\item build a cross-compiler on a supported machine
\item cross-compile (to C) the driver and the compiler
\item finish the compilation of the driver and the compiler on the
  target machine
\end{itemize}

In the following, all the paths are relative to the directory in which
you unpacked the archives (also known as the top-level directory).

\paragraph{Describing the target machine}

                       \index{Target.i3@{\tt Target.i3}}
The compiler has a small number of parameters that are used to
describe the target machine.  These parameters are expressed in the
interface {\tt compiler/src/{\it new}/Target.i3}.  Create the
directory {\tt compiler/src/{\it new}} and build the file {\tt
Target.i3}, using the descriptions for the other machines as models.
In {\tt compiler/src/m3makefile}, add the lines:
\begin{verse} \tt
    \#if defined (TARGET\_{\it new}) \\
    source\_dir (../src/{\it new}) \\
    \#endif \\
\end{verse}

\paragraph{Porting the runtime and base libraries}

Some of the Modula-3 code (as well as very little pieces of C) are
machine-dependent.  Of course, it may be that some code we
thought to be machine-independent will turn to have to be changed for
your {\it new} architecture, so we cannot guarantee that the list
below is exhaustive.  In general, look at what is done for the other
machines, and find the most similar as a starting point.


In {\tt libm3/Csupport/src}, add a directory {\it new} and put in it
the files:
\begin{itemize}
\item {\tt m3makefile} to describe the contents of the directory
\item {\tt M3Machine.h} which is included in every C file generated by the
  \srcmodula compiler
\item {\tt dtoa.c} to configure {\tt ../generic/dtoa.h}; look at
   that file for the things to configure.
 \item {\tt float.h} if your system does not have one. You can build
   it using a program called {\tt enquire}, which can be found 
   on the net.
   %  in the directory {\tt compiler/src/enquire}
\end{itemize}

In {\tt libm3/C/src}, add a directory {\it new} and put in it the
files:
\begin{itemize}
\item {\tt m3makefile} to describe the contents of the directory
\item {\tt Csetjmp.i3} to describe the interface to {\tt setjmp},
  {\tt longjmp}, {\tt \_setjmp} and {\tt \_longjmp}.  Be careful to
  get the size of the \verb|jmp_buf| right.
\item {\tt Cstdio.i3}, essentially a \modula translation of {\tt stdio.h}
\end{itemize}

In {\tt libm3/thread/src}, add a directory {\it new} and put in it the
files: 
\begin{itemize}
\item {\tt m3makefile} to describe the contents of the directory
\item {\tt WildJmp.i3}: this interface provides functions that are
  similar to {\tt \_setjmp} and {\tt \_longjmp}, but that do not
  impose any restriction on what is possible.  The {\tt DS3100}
  version is the simplest, because on that architecture, {\tt \_setjmp}
  and {\tt \_longjmp} are just fine.  The {\tt VAX} version is the most
  complex, we had to rewrite our own versions because {\tt longjmp}
  requires that the stack be popped (remember the {\tt longjmp botch}
  message ?).
\end{itemize}

                     \index{stack scanning}
                     \index{conservative collection}
                     \index{threads!stacks}
In {\tt libm3/runtime/src}, you can either reuse one of the {\tt
StackInc-{\it n}} component, or you may have to create a new one
(i.e., you need a different value of {\it n}).  The dependency
described there is for the benefit of the garbage collector.  At the
beginning of a collection, the collector must find all the roots, that
is, all the heap objects that are referenced from outside the heap
itself.  The stacks contain such pointers, and the collector scans
them to find roots.  However, the collector does not know the full
structure of the stacks (frames, argument lists and so on); rather, it
just looks at the values that are there and make conservative
decisions by interpreting these values as possible pointers.  For each
stack, the collector initializes a pointer $P$ to the bottom of the
stack; then it repeatedly tries to interpret the bits pointed by $P$ as
a pointer in the heap, marks the root if the interpretation is
successfull and advances $P$.  The question is by how much $P$ should be
advanced; if all entries in the stack are aligned at $n$-bytes
boundaries, it is sufficient to increment $P$ by $n$ bytes; a smaller value
would be an overkill.  We have found that some machines require $n$ to
be 2, and that 4 is enough for others.

                       \index{floating-point}
                       \index{IEEE floating-point}
In {\tt libm3/float/src}, add a directory {\it new} and copy the files
that are in {\tt MODEL} in that directory.  The routines in these
modules provide access the floating point control (to set exceptions
and so on).  The version in {\tt MODEL} is a template, and most of the
routines will fail (because of an \verb|<*ASSERT FALSE*>|) if
executed.  The {\tt DS3100} and {\tt SPARC} directories are examples
of implementations for IEEE machines, the {\tt VAX} directory is an
example for non-IEEE machines.  It is not essential that you implement
the proper procedures right now: the versions in {\tt MODEL} are good
enough for the compiler, the driver and simple test programs.  But you
will have to take care of that at some point.

                       \index{random numbers}
In {\tt libm3/random/src}, you can either reuse one of the directories
{\tt VAX}, {\tt IEEE-le} or {\tt IEEE-be}, or create your own on those
models.  The goal is to describe enough of the floating point
representation for the random number generator.  There is probably
some overlap with the {\tt libm3/float} stuff, we will take care
of that at some point.

                      \index{Unix}
                      \index{OS interfaces}
                      \index{system interfaces}
In {\tt libm3/unix/src}, you will find a bunch of interfaces to the
procedures of sections 2 and 3 of U**X.  Not everything is there, but
we sometime dream to have a complete set; in other words, it's quite a
bit of work to make sure that you have the proper descriptions, and of
course, there is nothing from which these interfaces could be
mechanically derived.  Fortunately, the driver and the compiler rely
on very few of these procedures, and any version is probably good
enough for your machine.  We suggest that you do the work only when
you find some problems (at least, wait until you get a basic port running).

The last thing is to reflect all these changes in {\tt
libm3/src/m3makefile}.  Add a bunch of lines:
\begin{verse} \tt
    \#if defined (TARGET\_{\it new}) \\
    ... \\
    \#endif \\
\end{verse}
similar to those that are already there.  You need to make a similar
modification to {\tt compiler/src/m3makefile} and {\tt
driver/src/m3makefile} (sorry for the duplication, but it is difficult
to avoid).

\paragraph{Creating a cross-compiler} 

At the top level, type to the shell:
\begin{verse} \tt
    \$ m3make cross NEW={\it new} \\
\end{verse}

After a while, you should get an executable {\tt compiler/cross-{\it
new}/m3compiler}.

\paragraph{Cross-compiling the driver and the compiler}

At the top level, type to the shell:
\begin{verse} \tt
    \$ m3make bootstrap\_driver NEW={\it new} \\
    \$ m3make bootstrap\_compiler NEW={\it new} \\
\end{verse}

At this point, you should in the same state as if we had built a boot
archive for {\it new} and you had grabbed it.  If you cannot mount the
file system that contains all the files on the {new} machine, create a
boot archive:

\begin{verse} \tt
    \$ m3make pack NEW={\it new} \\
\end{verse}

This creates a file {\tt boot\_files/boot.{\it new}-{\it
version.tar.Z}}, which you need to unpack on the {\it new} machine.

You can then proceed as for the first installation of \srcmodula
 (see the top level {\tt README}).

Good Luck!