Paul McJones, editor
paul@mcjones.org
https://mcjones.org/dustydecks/
Last modified 12 October 2025
This project attempts to discover, preserve, and present primary and secondary source materials (including specifications, source code, manuals, and papers discussing theory, design and implementation) from the history of SETL. Jacob T. "Jack" Schwartz started the SETL project around 1970 at the Courant Institute of Mathematical Sciences of New York University. The project developed a very high level programming language, SETL, based on the idea of finite sets as the fundamental data structure.
Comments, suggestions, and donations of materials (which will be housed at the Computer History Museum) are greatly appreciated.
"While visiting IBM, Jack worked with the IBM researchers John Cocke and Frances ("Fran") Allen in connection with the work on program optimization that they had pioneered. ... Jack’s own interest in programming languages was triggered by working with Allen and Cocke. He collaborated with John Cocke on an encyclopedic treatment of concepts and techniques for compiler construction that, although published only as a Courant Institute report, was quite influential. It contained the first systematic survey of parsing techniques, as well as code-generation algorithms for imperative and functional languages, and more recondite pattern-matching techniques. Far more important were optimization techniques that were truly seminal to the field of compiler development and its history. In addition to new techniques, a lasting framework for the subject was created. Compiler optimization as a subject of study began with this report, and as a result the Courant Institute became the place where the most important optimization techniques, and students, were produced. This work has been directly present in virtually every compiler ever written since, usually in much the way first laid out in the Cocke-Schwartz report.
The algorithms developed at IBM for global data-flow analysis and program decomposition (interval analysis) have a natural set-theoretic expression, but these algorithms proved to be hard to implement in the programming language of choice at the time, namely Fortran. This led Jack to embark on a large effort to design and implement his programming language SETL based on set-theory, and to prove its usefulness as a specification language by recasting numerous algorithms in various areas of computer science into this language. SETL underwent several implementations, and design improvements with substantial contributions from Robert Dewar and others [Schwartz et al. 1986]. The central feature of the language is the use of sets and mappings over arbitrary domains, as well as the use of universally and existentially quantified expressions to describe predicates and iterations over composite structures. this set-theoretic core is embedded in a conventional imperative language with familiar control structures, subprograms, recursion, and global state in order to make the language widely accessible. Conservative for its time, it did not include higher-order functions. The final version of the language incorporated a backtracking mechanism (with success and fail primitives) as well as database operations. The popular scripting and general purpose programming language Python is understood to be a descendent of SETL, and its lineage is apparent in Python’s popularization of the use of mappings over arbitrary domains." [Davis and Schonberg 2011]
- - -
"Guy L. Steele: Could you talk about Jack Schwartz?
Fran E. Allen: Jack spent a summer at ACS and had a huge influence. He wrote a number of wonderful papers on optimizing transformations, one being "Strength reduction, or Babbage's differencing engine in modern dress." Jack had a list of applications for strength reduction, which we in compilers never took advantage of. He and John wrote a big book, never published but widely circulated, on a lot of this work [Cocke and Schwartz 1970]. I spent a year in the Courant Institute -- I taught graduate compilers. And Jack and I were married for a number of years. So it was a good relationship all around.
GLS: What did you think about SETL [a programming language developed by Schwartz]?
FEA: It wasn't the right thing for that time, but it may be an interesting language to go back and look at now that we're mired in over-specifying.
GLS: Gregory Chaitin's classic PLDI paper on "Register Allocation and Spilling via Graph Coloring" [Chaitin 1982] contains a substantial chunk of SETL code, four and a half pages, that implements the algorithm.
FEA: I liked SETL and was amazed that they got some good compiling applications out of it. In the context of multicores and all the new challenges that we've got, I like it a lotit's one instance of specifying the problem at such a high level that there's a good possibility of being able to target multiple machines and to get high performance from programs that are easy to write." [Steele 2011]
- - -
"In the present paper we will outline a new programming language, designated as SETL, whose essential features are taken from the mathematical theory of sets. SETL will have a precisely defined formal syntax as well as a semantic interpretation to be described in detail; thus it will permit one to write programs for compilation and execution. It may be remarked in favor of SETL that the mathematical experience of the past half-century, and especially that gathered by mathematical logicians pursuing foundational studies, reveals the theory of sets to incorporate a very powerful language in terms of which the whole structure of mathematics can rapidly be built up from elementary foundations. By applying SETL to the specification of a number of fairly complex algorithms taken from various parts of compiler theory, we shall see that it inherits these same advantages from the general set theory upon which it is modeled. It may also be noted that, perhaps partly because of its classical familiarity, the mathematical set-notion provides a comfortable framework, that is, requiring the imposition of relatively few artificial constructions upon the basic skeleton of an analysis. We shall see that SETL inherits this advantage also, so that it will allow us to describe algorithms precisely but with relatively few of those superimposed conventions which make programs artificial, lengthy, and hard to read." [From Schwartz, September 1970, written when he was working at IBM in the summer of 1969, according to https://wiki.c2.com/?SetlLanguage]
- - -
"I recall well the day I ran into Jack Schwartz at the corner of 90th and Broadway or near about in late summer 1970 (or perhaps 1971). He said he had been working on a project at IBM Research, but since IBM had decided not to pursue it any more, it was up to him. He then said he was working on creating a programming language based on the theory of finite sets. I thought it then, and have had the same view ever since, that this was the single greatest idea for a programming language that I have ever encountered." [Dave Shields, personal communication, 9 April 2020]
- - -
"SETL has had a significant (but well-hidden) intellectual impact on programming language research and development; in some cases it was so far ahead of its time, e.g. for classical data flow analysis (which did made it into mainstream compiler research), value-oriented programming with updatable variables aliasing and hashing to make this work efficiently, flow/dynamic type analysis for structured data (rediscovered in the 90s without knowledge of SETL work in the 70s and 80s), set and map data types (via ABC, a [the?] predecessor for Python’s dictionaries), comprehension notation (popularized in 80s and 90s as list comprehensions and later via their analysis as monads, typically without knowledge of SETL and its predecessors such as SNOBOL and ICON)." [Fritz Henglein, personal communication, 17 May 2020]
- - -
"Having coded a few algorithms in SETL, I had experienced its power firsthand—a power that stemmed entirely from its high-level inbuilt data types. Particularly powerful were sets and maps, also known as “associative arrays,” containing data that can be indexed not only by consecutive integers but by arbitrary values." [Meertens 2022]
Newsletter 101 says "This is the basic reference work on SETL. It includes a discussion of programming principles, details concerning the design of the language, and a variety of algorithms written up in SETL."
Installment 1 was published in January 1973 and Installment 2 was published in October 1973. Installment 3, never published, was to be titled "Extended Facilities of the SETL Language". The June 1975 edition includes the first two installments, renamed parts I and II.
"Unfortunately, however, the only 'language manuals' were A SETLB Primer, a well-intended document that hid the beauty of the language (actually, a subset thereof) behind the ugly face of the IBM 029 keypunch and its extraordinary mapping to the character set of a 63-character CDC line printer, and Schwartz’s own 675-page On Programming, a compilation of philosophical musings, language definitions, algorithms, and implementation details which, though impressive in the best sense of the word, presented a formidable front to the novice programmer." [from a review of Dewar 1979 by Scott Hurd]
Page 22: "Since April intensive debugging of the run-time library has been in progress. For this purpose, a comprehensive set of test programs has been designed·and is currently being implemented. We intend to create an industrial grade library of standard test programs which will satisfy the following requirements: first, the library has to be modular so that additional tests can easily be added to it, and furthermore it must allow full or partial execution of all modules; second, each test has to check its results automatically, as much as possible, since the projected size of the test library makes it prohobitive to check the results manually.
The overall design of the test library has been completed, and much of it has bee implemented with some changes still to be made to include existing stand-alone tests into the library.
Some preliminary effort aimed at installing SETL on the IBM 370 has been undertaken. A first attempt to run part of the compiler has been made; it pointed out some necessary changes both to the IBM LITTLE implementation and to the SETL source code. at this time. Most of these changes have been made on the SETL source."
S. M. Freudenberger
Combines two SETL-based Ph.D. theses:
- Ssu-cheng Liu. Automatic data structure choice in SETL.
- Robert Paige. Expression continuity and the formal differentiation of algorithms.
"... a three-pager that is basically the core result of my Ph.D. thesis. It's condensed from a paper I authored at IBM Research in Hawthorne as an intern in the summer of 1988 in Ken Perry's group ... ." [Henglein, personal communication, 8 Nov 2021]
"A miscellaneous collection of papers mostly concerned with implementation problems, but some have an independent theoretical importance. Variously authored." [Bonic 1973]
- - -
"NYU SETL Project. SETL Newsletters. Numbers 1–234, Courant Institute of Mathematical Sciences, New York University, 1970–1981 and 1985–1989." [Bacon 2000]
SETL Newsletter. Scans of Numbers 1-217 (constituting some 2700 pages) were provided by Stefan M. Freudenberger:
"It is suggested that a preliminary implementation of SETL be done in BALM. Though inefficient, such an implementation would be simple, and would permit experimentation with the form of the language, and perhaps be useful as a bootstrap for later versions.
Given below are routines to implement most of the basic SETL operations. A set is implemented as a BALM list whose first item is the value of the variable SETMARK, which will print as ***. An ordered set is represented by a BALM list. All other data-objects in BALM can be members of sets or ordered sets."
Permutation generator, generator of unordered trees, Huffman table, Huffman encode and decode, Fisher-Galler equivalence declaration analysis algorithm, polynomial addition and multiplication in form described by Knuth.
Anonymous review recommending against an [unnamed] paper on SETL being published.
A later version of this appears as Item 2, A second general reflection on programming, On Programming (1975 version), pages 12-20.
"This note will contain remarks concerning: (i) SETL implementation; (ii) the possibility (raised in Newsletter 30, pp 31-32, and 31, pp. 5-7) of securing increased efficiency by adding a system of optional elaborations to the language. These elaborations, if present, should increase running speed considerably and lead also to significant data compressions."
"This newsletter brings Newsletter 15 of Feb. 19, 1971 up to date.
Status Report: A first version of the BALMSETL parsing system is now close to working, though very inefficiently (approximately 30,000 times slower than FORTRAN)."
"This specification serves two purposes: 1. It is a starting point for a BALM implementation of a SETL-like language in which sets are accessed by means of a hashing scheme. 2. It is a detailed specification of the meaning of most current SETL operators, and a few other SETL constructions such as functional application."
Part 2 is essentially a specification of SRTL (the SETL Run Time Library) in SETL.
"We will give a more detailed specification of the parser of SETL outlined in Newsletter 47."
Proposals for symmetric use of relations; tuples, sequences, and strings; conditions on sets. Cites Codd's paper on the relational model and Earley's own work on the VERS language.
"This newsletter discusses the possibility of compiling SETL programs into LITTLE code, without using the "SETL machine" approach. The motivation for this is run-time efficiency on conventional (non-microprogrammable) machines, while maintaining portability.
A primary goal is to produce, by early 1972, the first version of a SETL compiler capable of compiling a preliminary SETL language, but probably without optimizations. The implementation in early 1972 should be able to handle most SETL features, such as recursive procedures, arbitrarily large integers, the compile statement, SETL name-scoping rules, and SETL rules for matching arguments to formal parameters."
"The BALM-SETL 4 routines follow Newsletter 49 fairly closely. There are a few points of divergence and many things which are not included in the specs, and it is these that will be elaborated upon here."
Starting on page 33 is the implementation of the BALMSETL procedures. This newsletter is marked "obsolete" in Newsletter 69.
"To compile and execute a SETLB program actually requires running two programs. The first is the SETLB preprocessor. It reads the SETLB program and translates it into BALMSETL. The second program is BALM. BALM reads the BALMSETL and produces 6600 code which is then executed."
It describes how BALM, BALMSETL, SETLB and SETL/LITTLE were being implemented, including bootstrapping.
"The SETL Run Time Library is a collection of subroutines and macros, written in the LITTLE language, that are capable of evaluating SETL expressions. Subroutines are provided to build sets, test SETL objets for equality, concatenate strings, etc. The library includes a storage allocation mechanism that is based on a compacting garbage collector."
See also SETL in the Soviet Union.
"In the present newsletter, a preliminary attempt at systematic semantic definition of SETL will be made. This will be done by describing, in a deliberately abstract way, a hypothetical 'back end' for the SETL compiler. The 'back end' will consist of a set of routines concerned with name scoping, compilation of abstract recursively structured syntactic trees into interpretable serial structures, digestion of labels, and finally with the interpretation of an ultimate code form. The total package will include almost all the routines necessary to go from a simplified 'host language' form of SETL to interpretable text."
This and other newsletters mention "SETL Notes" (with page number references as high as 251); Newsletter 101 suggestions this was [Schwartz 1971].
Includes listing of SETL Run Time Library routines.
"This newsletter gives a rewrite of the generalized nodal span parse routine described in Newsletter 46. A SETLB program was written in accordance with the SETL algorithm therein described, debugged, and retranslated into true SETL."
Chapter 2 describes PSETL (Parallel SETL).
"A version of BALMSETL with paging is available. The implementation is loosely based upon the description in SETL Newsletter 86. It proved to be only slightly more difficult to implement than anticipated. The SETLB user should not expect the paging version to solve all of his space problems as this version represents, at best, an interim solution. The user will find himself once again confronted by the dilemma of trading core size against running time.
"This note describes the present manner of making debugging runs involving the SETL run time library. The setup of a run is complicated because of the existence of global macros and variables, and because the present LITTLE compiler has no provisions designed to avoid recompiling a complete program to test any change to the program, no matter how small the change may be. The purpose of most of the complications is to circumvent the latter deficiency."
Section 8 says: "A powerful aspect of SETL is that it allows procedure names to be elements of sets or components of tuples. This use will be illustrated below."
[Was this true in 1973?]
"The debugging of the LITTLE-written, SRTL-compatible BALM interpreter which will initially support the new SETLB system is now far advanced. However, in order to avoid substantial losses in BALM compile efficiency, and possibly significant losses in SETLB execution efficiency as well, we will undoubtedly require a SRTL-compatible BALM translator as a replacement for the interpreter. This is, of course, an SRTL-compatible version of the present (R. Paige) BALM translator. This newsletter will outline a plan for such a translator."
"The second generation of SETLB, SETLB.2, based upon a version of the BALM interpreter written in LITTLE and the SETL Run Time Library, will become operational in the near future. It will offer a limited capability for variation of the semantics of subroutine and function invocation by the SETLB programmer."
"This newsletter is a sequel to SETL Newsletter #102 ('Reduction in Strength Using Hashed Temporaries')."
"Initial runs of the LITTLE written MBALM [the BALM virtual machine] simulator indicate that it runs almost a factor of two times more slowly than the current FORTRAN based MBALM simulator. Allowing for expected improvements in the new LITTLE assembler the LITTLE based simulator can probably be made to perform as well as the FORTRAN version. Since the LITTLE based simulator will use the SETL run time library procedures in executing SETLB programs it will actually be considerably faster than the FORTRAN based version. However, it seems unlikely that the LITTLE based simulator will match the overall speed of our current MBALM to COMPASS translator system. Consequently a SRTL-compatible BALM translator will probably be necessary to make the LITTLE based system a viable alternative."
"As a byproduct of Malcolm Harrison's Artificial Intelligence Algorithms Project, there's now available on the 6600 a working program to turn SETLB source decks into publication SETL manuscripts.
The output of the program should be listed using Malcolm Harrison's deluxe teletype to take advantage of its lower case capabilities, backspacing, underlining, and extended character set."
Proposes adding a feature based on the "meta composition" of the family of "reduction languages" defined in [Backus 1973] (which Schwartz intended to cite but omitted).
- John Backus. Programming language semantics and closed applicative languages. In Proceedings of the 1st annual ACM SIGACT-SIGPLAN symposium on Principles of programming languages (POPL '73). Association for Computing Machinery, pages 71–86. ACM Digital Library
"... continues D. Shields' Newsletter 45 on the same subject."
"We examine here the feasibility of a direct SETL to LITTLE translation as the next phase of development of the SETL system. The interest of such a direct translation is two-fold:
a) by eliminating the BALMSETL link from the present system, many of the annoying semantic restrictions imposed by BALM on SETLB can be removed. In particular, the implementation of the proposed SETL namescoping scheme becomes possible. Current restrictions on the use of labels and recursive calls within iteration loops can also be removed.
b) Bypassing BALM as an intermediate language streamlines the SETL system and simplifies its maintenance. Incrementality of the language can actually be assured more easily than with a SETL-to BALM-to LITTLE system, where tables of variables, procedures and linkages have to be maintained in two places simultaneously (the BALM and SRTL environment)."
"This newsletter
i. takes up the discussion initiated in newsletter 90;
ii. tries to put the discussion on a more reasoned semantic basis;
iii. arrives at a proposal differing substantially from that of NL. 90...The discussion which follows is much influenced by the proposals made by George Weinberger in his thesis-in-progress.
"This newsletter takes up the 'high level optimisation' theme of NL.71, NL.118, NL.121, and of Aaron Tenenbaum's thesis (hereinafter referenced as TT). The generally good performance of Tenenbaum's 'type-finder' program suggests that it may be feasible to deduce deeper properties of the objects appearing in SETL programs. In the present newsletter, we will outline techniques which, building on the approach and result of the typefinder, allow useful relationships, among them relationships of inclusion and membership,to be established."
The Pursue block.
"SETL1, the first fully compiled implementation of SETL, is nearing completion. This system uses a BALM-to-LITTLE translator, written in BALM, to produce LITTLE source code, which is then compiled as a LITTLE program and executed in the environment of the run-time library. The translator is itself interpreted by the BALM interpreter, written in LITTLE, which also utilizes the SRTL. The full system consists of the following modules:
a) SETLA: A lexical scanners and syntactic analyzer. SETLA is written in LITTLE. Its output consists of various token tables and a sequence of calls to generators.
b) TREEWLK: A series of treewalking routines, written in BALM, which build BALM parse-trees using the output from a).
c) TRANS: The translator proper, which replaces the code generators of the BALMSETL system, so that LITTLE code is emitted (instead of extended MBALM code).
d) BALMINT: A BALM interpreter, which uses a portion of SRTL, and executes b) and c).
e) The LITTLE compiler.
f) SRTL."
[Was "SETLA" promoted from the name of the lexer/parser to the name of this whole system?]
"This newsletter returns to the theme of NL 39 (More Detailed Suggestions Concerning 'Data Strategy' Elaborations for SETL), namely to the idea of a declarative, programmer-assisted approach to the problem of data structure choice. Since such an approach is an alternative to and perhaps also a preparation for fully automatic structure choice, it deserves investigation. This newsletter will simplify and flesh out the suggestions made in NL 39."
See [Deak et al. 1977].
Timings are provided for three versions of each of five algorithms: SETL running on SETLA (which ran on the MBALM virtual machine), SETL compiled by TSETL to LITTLE code, and hand-translated LITTLE versions. The results showed that TSETL was 2 to 5 times faster than SETLA. The hand-translated LITTLE versions were 20-200 times faster than SETLA, and 7-50 times faster than TSETL.
"The fully compiled version of the SETL system (SETLC) is now functioning adequately (though problems still exist; see below). Our main effort from now on will be directed at the global optimization of SETL. Toward this end the following has already been done:
(a) A large variety of peephole optimizations have been installed. Our experience with these is disappointing; speed gain is small. This makes it clear that purely local optimization of SETL cannot improve its speed much.
(b) Extensive, but still quite unpolished, SETL code for the various forms of global analysis described in 'Optimization of Very High Level Languages' (Jour. Prog. Langs., v. 1, # 2,3) has been developed.
(c) Dave Shields' work on global optimization of LITTLE is going forward. When complete, this should improve the speed of our whole system by about 20 %.
(d) A semi-automatic data structure choice scheme has been outlined in NL 151."
"This Newsletter extends the remarks of SETL Newsletter no. 57 (Hank Warren: copy minimization in SETL) and describes a somewhat restricted implementation of reference counts (RF's) in the SETLA interpretive system."
"This Newsletter describes the data structures which will be used for the run time systems of both the interpreted and compiled versions of the new SETL system. The introduction of basings has provided an opportunity for redesign of the SETL runtime system since it will be necessary to modify much of the runtime library in any case to support the basings. Consequently, the data structures have been completely redesigned and it is expected that programs will run faster with the redesigned structures even if the basing declarations are not used."
"This Newsletter describes the data structures which are used for the run-time systems of both the interpreted and compiled versions of the SETL 2.1 system. It replaces SETL Newsletter 189 on the SETL Data Structures by Robert B. K. Dewar, Art Grand, Edmond Schonberg, and Jacob T. Schwartz.
Most of the changes to the previous newsletter are in the details, rather than in the general design of the data structure. In the course of the implementation, however, a sufficient number of differences developed to make this revision necessary."
PUMA was a microprogrammable computer sharing the CDC 6600 instruction set architecture that was designed and built at NYU.
- See [Grishman 1978].
- See [CIMS 1978, page 20].
- Ralph Grishman. The PUMA project: Computer design automation in the university. In Proceedings of the ACM 1980 annual conference (ACM '80). Association for Computing Machinery, New York, NY, USA, 490–497. ACM Digital Library
"We are currently in the process of debugging the new SETL system. At this point we have successfully executed a 200 line program which tests the I/O, all the control structures, arithmetic, and most of the primitives for sets, maps, and tuples. It also tests the garbage collector, including the feature which turns off share bits of objects which are not actually shared. The test program does not contain any declarations since they cannot currently be processed by the compiler.
The run time library is close to completion. ...
At this stage we must put a considerable amount of work into the compiler. ..."
"This newsletter contains a suggestion for introduction of string primitives into SETL.
The suggestions derive from SNOBOL-4 [sic], but lack (deliberately) the complexity of this language's pattern matching facility. In particular, they do not contain any imbedded backtracking. In those rare cases where backtracking matches are desired, the suggested functions can be used in conjuction with the existing backtracking prinitives to provide the required effect."
"The new SETL system is now beginning to approach operational status. As the present library of data representations, and the routines which support them become operational, it may become feasible to add various significant new representations, with corresponding code, to the library. Some of these representations may be easily implementable by short routines which invoke existing facilities. This note will comment on two potentially significant representations not currently provided: list and B-trees."
"This note follows SETL Newsletter No. 203, where a new approach to automatic data-structure selection has been described. The approach suggested there, although it has a rather simple structure, suffers from several deficiencies which have led us to look for an alternative approach. The new and probably improved approach, to be described in this newsletter, is closer to Ed Schonberg's algorithm (to be described in a coming newsletter) and seems to be faster than the first approach. In spite of the differences between these two methods, they still have a similar overall logic which is much simpler than that of previously suggested algorithms. Among these simplifications are: using the BFROM and FFROM maps instead of value-flow maps and dispensing altogether with a phase which inserts 'locate' instructions into the code. ..."
"This newsletter outlines the design of the automatic data-structuring module currently incorporated in the optimizer of the SETL compiler. Although several aspects of this design have been considerably revised and modified, it is presented in unretouched form, first to serve as a short introduction to the use of bases for automatic data-structuring, and second, as a background for M. Sharir's recent newsletter on the subject (SETLNL.203), which details several major improvements to the scheme presented here."
See also Newsletter 207, an improvement of Newsletter 203.
"The work on the SETL optimizer is now approaching the end of another phase. A first implementation, written in SETL, is about to be completed. It contains most of the major optimizations that we have contemplated, including common subexpression elimination and code motion, type analysis, automatic data-structure selection and copy optimization. It has been tested on a few small to medium size SETL programs with satisfactory results.
Of course, in its current state, the optimizer is still far from the state at which it could be used routinely as part of the SETL compiler. Currently it does not contain based data-structure declarations, and because of its length it is compiled and executed as nine successive phases. On the average the optimizer currently processes one SETL source line per minute. However, we can estimate the speed-up that could be achieved by the following improvements:
(a} Introduction of based representations (especially in the sections which perform bitvectoring data-flow analysis, where currently the analysis bitvectors are still represented as unbased sets). This should give a speed-up of at least 2.
(b} Improvement of the SETL system, e.g. by generation of hard code and elimination of the interpretive overhead, should give a speed-up of approximately 4.
(c) Elimination of the binary I/O currently needed for communication between subsequent phases of the optimizer, and minimization of the dumps and other printouts currently produced by the optimizer, should give a speed-up of approximately 4. All these improvements together could bring the optimizer to a level at which it could process roughly 30 lines/minute.
(d) Optimization of the optimizer by self-application. This might double the optimizer speed once more. Note however that the optimizer is roughly 10,000 lines long, so self- application will initially require a run of approximately five hours. If these estimates are correct, then the optimizer will reach only a minimal production level after these improvements.
It therefore appears likely that we will want to recode all, or significant parts of the optimizer in a lower-level language (probably LITTLE)."
See also:
- Micha Sharir. Some observations concerning formal differentiation of set theoretic expressions. ACM Trans. Program. Lang. Syst. Volume 4, Number 2 (April 1982), pages 196–225. ACM Digital Library
SETL Newsletter. Citations and some PDF files for Numbers 220-233 were provided by Fritz Henglein:
The symposium, held at NYU on 22 May 1972, included lectures by James B. Morris, Edsger Dijkstra, Aiden Falkoff, Gerald Sussman, John Backus, and Andrew Yershov as well as lectures by NYU authors:
- Jack Schwartz: Abstract and Concrete Algorithms.
- Rudolph Krutar: An Algebra of Assignment.
- David Shields: Optimization Algorithms in a Set Theoretic Language.
- E. Milgrom: The Use of Sets in the Extensible Programming Language AEPL.
- Kurt Maly: The Front End of the SETL Compiler.
- Henry Warren: SETL Internals.
- K. Kennedy: Some Optimization Strategies for High Level Languages.
"The MIDL language currently being developed at NYU incorporates features of two other NYU languages (SETL and LITTLE), and welds these rather different languages together."
Around the time Schwartz began thinking about SETL, his NYU colleague Malcolm Harrison designed a language called BALM, which became a tool for early prototypes of SETL:
"In fact there was an earlier implementation of SETL at NYU (one archeological layer below) done in BALM (which stood for branch-and-link mechanism) a language designed at NYU by Malcom Harrison, which had roughly the semantics of LISP and the syntax of ALGOL 60." [Ed Schonberg, personal communication, 8 April 2020]
- - -
"I wrote the first implementation of SETL, called BALMSETL, as it was written in BALM, a block-like extension of LISP. ..." [Dave Shields, personal communication, 9 April 2020]
- - -
"There have been a number of implementations of SETL and SETL-like languages over the years. The first was called SETLB [Mullish and Goldstein 1973] and was implemented using an extension of Harrison’s extensible LISP-like BALM language, BALMSETL [Harrison 1970, Shields 1971]. SETLB was succeeded by SETLA [Schwartz et al. 1972], was implemented in BALMSETL, and was an almost strict subset of SETL." [Bacon 2000]
- - -
"SETLA is an implemented version of a subset of SETL, a set-theoretic language developed at NYU by Jack Schwartz. ... SETLA is for the most part a compatible subset of SETL (to a much greater extend than its predecessor, SETLB). However, BALM still plays an important role in the structure of SETLA, and this is reflected in some of its features which do not properly belong to standard SETL. Note in particular that name scoping and the special conventions which apply to begin blocks and statement labels are the same in SETLA, BALMSETL, and, indeed, in BALM. The SETLA precedence rules and procedure linkage mechanisms are however those of SETL, rather than those of BALM or BALMSETL. 'BALMSETL' above refers to a series of BALM procedures which implement SETL primitives as BALM extensions. The structure, and indeed the existence of BALMSETL out to be invisible to the SETLA user, and will be for the most part. See however section 3, 5, and 8 for unavoidable instances of BALMSETL visibility." [Schwartz et al. 1972]
"Jack had created a low-level language called LITTLE. It had the bitfield as the basic data type, and was based on a very clever macro language to extend LITTLE's expressive power. The original version was written in FORTRAN. I took over the project and bootstrapped it to a version coded in LITTLE. This was done by the end of 1973. The next implementation of SETL was written in LITTLE by Robert Dewar and Art Grand. During that time we were porting LITTLE from the CDC 6600 to other systems, notably System/360, DEC-10, VAX 32v and VAX/VMS. (LITTLE was later ported to the PDP-11, though I played no role in that.)" [Dave Shields, personal communication, 9 April 2020]
- - -
"Hank [Henry S. Warren] worked on SETL definition and SRTL, the run time library for SETL. It was written in LITTLE." [Dave Shields, personal communication, 27 April 2020]
- - -
"The full SETL language itself was implemented in LITTLE [Warren 1971], syntactically a Fortran-like language supplemented with a notation for bit-field extraction. LITTLE had only two built-in data types, fixed-length bit strings and floating-point numbers, but was used to implement both the compiler and run-time system of the version of SETL maintained and distributed by the Courant Institute of Mathematical Sciences (CIMS) at New York University from the mid-1970s until the late 1980s." [Bacon 2000]
"The user manual describes the NYU LITTLE implementation of SETL as defined by The SETL Programming Language by Robert B. K. Dewar, March 12, 1980."
SETL was not placed in the public domain (as was LITTLE); it was only distributed in binary form, under license. However it is now being made available for study and non-commercial use. This copy was provided by Stefan M. Freudenberger.
"I am reasonably sure that I also have the sun-m68k-bsd4.2 setl/little version, including source and binaries, and including the ltlasm that I wrote for m68k and that is modelled after the ibm s370 ltlasm that Richard Kenner wrote. These code generators generate target code directly, not via the high-level format that was initially used for the dec 10 port (known as t10), and later for the vax port (known as t32). I would have to check whether I have vax780-vms version also ... I have a box with tapes, one of which I had given to a commercial service to read as I certainly don’t have hardware to read these. I also have directory listings of most of these tapes so I’ll have to do some research to see what I have." [Stefan Freudenberger, personal communication, 19 March 2021]
"The SETL project produced, among other things, the SETL optimizer [Schwartz 1975, Freudenberger, Schwartz, and Sharir 1988; Dubinsky, Freudenberger, Schonberg, and Schwartz 1983], a 24,000-line prototype written in SETL. Unfortunately, on the machines of the day, it was too large to apply to itself. [Bacon 2000]
The SETL optimizer is "pass two and a half", running between the semantic pass and the code generator of the regular SETL compiler. It was written by Stefan M. Freudenberger, Art Grand, Jacob T. Schwartz, Micha Sharir, and Leonard Vanek. See [Newsletter 213].
Nigel Williams was given this tape by Jean-Pierre Rosen in Paris who, as a post-doc at NYU contributing to the Ada/Ed project, brought the tape home afterward. Nigel arranged to have it sent to Dennis Boone at Michigan State University, who was able to recover its contents.
"PSETL is a version of SETL which has been enlarged to allow the description of algorithms involving interrupts, parallelism, and to some extent, machine dependent features. Using PSETL, several operating systems are presented in detail. The first, a simple uniprogrammed batch system illustrates basic control mechanisms and scheduling. The second, a multiprogrammed batch system, shows additional complications which arise due to contention for resources and conflicting objectives. Our third system is interactive and includes data sharing capabilities."
"The SETL language allowed the experimental version of SYLLOG to be developed with far less time spent on programming and debugging than would have been needed in other languages. The particular features of SETL which seem to have helped the most are:
- the primitive, nestable data types tuple, and (of course!) set,
- the non-procedural such that notation for defining tuples and sets, and
- the provision of the omega value undefined, which was very useful in debugging."
The first appendix begins "The following program written in SETL (see [Dewar, Schonberg, and Schwartz 1981]) outlines in executable form the main ideas and algorithms presented in this paper."
Schwartz and other SETL project members visited A. P. Ershov's group in Novosibirsk several times during the 1970s. Ershov's group implemented a version of SETL on the BESM/6 computer using the EPSILON programming language.
"Ershov was a pioneer in the field of computing in the U.S.S.R. In the early 1960’s he did fundamental, ground-breaking work in the area of programming optimization while constructing an Algol compiler. He did this work on his own, in isolation, and independently came up with many of the key ideas that were then being discovered by Western scientists in the early 60’s.
His work became known in the West because of the writings of my colleague and thesis advisor at New York University, Jacob T. Schwartz.
My wife and I made our first trip to Russia in November, 1973, just weeks after the “Yom Kippur War.” We went as ordinary tourists to visit her many relatives. Both her parents were born in Russia, and her relatives there were then living in the Ukraine, in the cities of Kiev and Lvov. Many of them later emigrated to the U.S. We also went to Novosibirsk, Siberia, during our stay to visit Ershov.
Ershov was a member of the Soviet Academy of Sciences, a position in those days that gave its owner about the power and prestige of a U.S. Senator. He was one of the founders of Akademgorodok, a community based on education and research that was established in Siberia to provide a foothold for such work in Siberia, in the hopes it would speed Siberia’s development. [1] I believe he was a native of Siberia. It is similar in many ways to Los Alamos, New Mexico, in that respect.
[1] “Gorodok,” which means “small town,” was one of the first words I learned in Russian during the two years I studied Russian during my high-school years. A famous story by Lermontov begins with the words, “Taman yeset malenky gorodok … ” — “Taman is a small town …” [Dave Shields, 2006]
Setl-s was designed by Nigel P. Chapman at the University of Leeds, with guidance from Robert B. K. Dewar and contributions by Anthony McCann. A later port to MS-DOS was done by Julius J. VandeKopple at New York University, again with guidance from Dewar.
"The CIMS SETL system was quite slow and cumbersome, and LITTLE was not widely ported, so in the late 1970s Nigel Chapman, who was then a graduate student at the University of Leeds, designed and implemented a system called Setl-s [Chapman 1980]. It covers a substantial subset of SETL, leaving out only such ephemera as the macros, backtracking, the data representation sublanguage, support for separate compilation, and a few minor syntactic luxuries. The '-s' in the name can also stand for 'small', because Setl-s was a very compact system based on Dewar’s celebrated indirect threaded code [Dewar 1975] technique, and was written in MINIMAL, the portable assembly language in which the run-time system of MACRO SPITBOL [Dewar and McCann 1977] was implemented. Jay VandeKopple dropped the hyphen from Setl-s and worked on what was then called SETLS in the 1990s [Julius J. VandeKopple, private communication to David Bacon, 1999] while at NYU on sabbatical from Marymount College. " [Bacon 2000]
"I didn’t hear about Setl until I went up to Leeds for an interview in 1977, after I’d applied to do a PhD there. ...
Tony McCann at Leeds had worked with Robert Dewar on macro Spitbol. Dewar was part of the Setl team at NYU, and he had the idea of using indirect threaded code (ITC) to do for Setl what macro Spitbol had done for Snobol. Tony got involved and proposed to supervise a PhD student to work on the project. I thought it sounded interesting and in line with my interests, so I took it on.
It's ironic that an interest in high level languages led me to write thousands of lines of assembler. Being straight out of Cambridge, I was all for building the thing in BCPL, but Robert and Tony believed that you needed to be closer to the machine to get the benefit of ITC. The whole emphasis was to be on improved performance, so it had to be Minimal [the portable assembly language of Macro Spitbol].
...
The system was really Robert Dewar's baby. You've probably heard that Robert was something of a character, full of confidence and energy, also very bright. He breezed in and sketched out the implementation in a couple of afternoons then left us to it. I like to think that I made significant contributions later on, specially to the front end, but Setl-s is based on macro Spitbol.
Tony adapted the macro Spitbol garbage collector. He was adamant that the garbage collector had to come first. He also wrote the lexical analyzer to kick-start the development. I did all the rest.
The plan was to implement a subset of Setl, so naturally we started referring to it as subSetl. Dewar didn't like that, though – I guess he thought it made it sound inferior, or maybe he hoped to do the whole language one day. People were sticking letters on the end of language names to identify implementations, like Algol68C, so Tony came up with Setl-s. I never liked it, but it was just a name.
...
There was a guy called Dave Shields, who was our main contact at NYU. He worked on their Setl implementation as well as macro Spitbol. ..." [Nigel Chapman, personal communication, 21 June 2020]
Chapman completed version 1.9 in 1980. His Ph.D. thesis [Chapman 1980] uses SETL-S as an example of the techniques it describes. Later, Julius J. VandeKopple, working at New York University, adapted it for MS-DOS, and distributed versions through 2.2 in 1991.
Includes the main source file for SETL-S, which is written in MINIMAL, and also includes the files distributed for version 2.2 on October 6, 1991 running on MS-DOS.
To run the MS-DOS version of SETLS from the browser, visit this page:
https://virtualconsoles.com/online-emulators/dos/and drag DISTR-ZIP.zip to the blue area on the web page, then press the blue Start button at the bottom. You’ll soon have a C:\> prompt; type “cd distr” to enter the directory, then type “setls fact” to run the factorial example, followed by two carriage returns to accept the default filenames for listing and executable. There’s DVED from Robert Dewar in this same directory if you want to write a new program.
"SETL existed pretty much on machines which were falling out of favor. It was used on DEC VAX’s for most classes. There was an implementation for Motorola-based Sun Workstations and several on much larger machines. I don’t think there was ever a version for RISC machines. At any rate there was a project at NYC to develop a successor to SETL, funded by the Office of Naval Research. Jack was not around at the time as this was during his stint at DARPA. I got one of the RA positions on the project shortly after I started at NYU and was working on a SETL2 compiler under Matthew Smosna. We were writing a compiler front end in Ada. It didn’t go all that rapidly for a number of reasons, I think. I was pretty junior, there were still regular language design meetings so it was very much in flux, and I suspect Jack’s absence slowed things down. Frankly I mostly sat and listened in those meetings. Ed Schonberg would probably have more insightful comments. Fritz Henglein as well. Unfortunately those are the only two people that I remember being involved that haven’t passed away.
Anyway, the funding dried up. I was having fun with it so I started all over in C. At the time I didn’t have access to an Ada PC compiler and I preferred working on my own PC. With the design meetings over I didn’t have to worry about language changes any more, it was really just my own fun. When I had something feebly running I showed it to Matthew and he talked to some other people about it. Jack returned to NYU and became my advisor. There never was any funding for SETL2 but after I started working with Jack money never seemed to be a problem any more so I could spend time on it until it became pretty stable.
So I think you can kind of think of there being two SETL2’s, the official sponsored one which never got to the point where something ran, and mine that started out as skunkworks but became more accepted in the end."
[Kirk Snyder, personal communication, 14 May 2020]
CONTENTS
"This minicourse will use SETL2 as a vehicle for showing how types and transformations can be used to integrate algorithm design and analysis, program development, and high level compilation of set theoretic programming languages. There are four lectures, each lasting two hours. Several References may be found here."
A derivation of the program is presented in:
- B. Bloom and R. Paige. Transformational Design and Implementation Of A New Efficient Solution To The Ready Simulation Problem. Science of Computer Programming, Volume 24, Number 3, 1995, pages 189-220. cs.nyu.edu
The ISETL language was designed and implemented by Gary Levin, based on Edward Dubinsky's experience using SETL to teach mathematics. Dubinsky and his colleagues used ISETL for a series of math textbooks.
"The main problematic features of SETL in 1985 were:
- SETL is very large and is not useable on the small personal computers just beginning to become popuular in academia.
- SETL is slow.
- SETL is a compiled language and it was necessary to operate in batch mode.
- SETL has many complex features that are not essential for educational purposes.
- Functions are not first class objects in SETl.
In summer 1985, I began conversations with Gary Levin about the possibility of an interactive version of SETL that would eliminate these problems and be more appropriate for educational uses. Discussions conntinued through the Fall and by January, 1986 we had agreed on specifications and Gary began to work on what would become (with the agreement of Jack Schwartz) ISETL. ... By Fall of 1988, ISETL version 1.0 was sufficiently stable that I could use it in courses on discrete mathematics and abstract algebra." [Dubinsky 1995]
- - -
"ISETL is an interactive version of SETL. It was inspired by Ed's desire to teach mathematics using manipulable objects. I stripped SETL down to the subset he needed and then added extra features (like first class functions) that I thought would be useful. The code owes nothing to the SETL implementation and is written in very basic C." [Gary Levin, personal communication, 12 May 2020]
The Proteus Project was "a collaborative effort by researchers at UNC Chapel Hill, Kestrel Institute, and Duke University to develop a system for the specification, prototyping, analysis and generation of parallel software." [Home page at cs.unc.edu]
The project included the Proteus language.
"I modified the ISETL interpreter to execute Proteus programs, but they were mostly ISETL with a few additional operations for task parallel and data parallel execution. I wrote some relatively large programs manipulating gravitational fields using Proteus.
ISETL was really a remarkable tool. It enabled so many concepts so easily. Two of my favorites were functions as expressions and pointwise function definitions.
I ran ISETL on a Sparc4 sun workstation. I had to get help from Gary as the compiler laid out structs differently and ISETL crashed. He fixed it quickly and we were off. I would run week long programs. 20 years later, I rebuilt ISETL on my Dell laptop. It was so much faster." [Lars Nyland, private communication, September 2021]
ISETL-LINDA was part of the LINDA project at The University of York.
"ISETL-LINDA is an imperative interpreted language, built on top of Gary Levin's ISETL language. ISETL (Interactive SET Language) is novel, because it has as its major data structures sets and tuples.
We have embedded the LINDA primitives into ISETL. To do this, we introduced a type of bags, which represent tuple spaces, and templates, which are the counter-part of tuples." [ISETL-LINDA page at cs.york.ac.uk]
"This paper describes a new language, ISETL-LINDA, and its implementation over a network of transputers. The language itself is similar to another parallel language, ProSet [Hasselbring 1993]. Both are embeddings of Linda [Gelernter 1985] into SETL."
The Esprit-funded SED research project was led by Jean-Pierre Keller.
"SETL’s success as a prototyping tool spawned the Esprit SED (SETL Experimentation and Demonstration) project [Keller 1989] of the late 1980s, which was a sweeping effort to create a SETL-based prototyping environment complete with highly sophisticated language manipulation tools at the syntactic and semantic levels [Donzeau-Gouge et al. 1987]. This included a SETL-to-Ada translator [Doberkat and Gutenbeil 1987, Doberkat et al. 1989], an editor, a debugger, and a performance-profiling monitor [Bouzas et al. 1989]. The latter was rendered particularly accommodating and non-invasive by the use of coöperating processes sharing messages over TCP sockets. Abstract interpretation was the operative model in both the ambitious Tenenbaum-based [Tenenbaum 1974] type inferencer and Paige’s more general RAPTS [Paige 1986] transformational system (the predecessor to APTS), which was used to prototype “meta-SETL” [Aliffi et al. 1988], an AST-traversing interpreter. .... Interoperability was addressed in the SETL-to-Ada translator and in the performance monitor by means of ISLE (the Interface Specification Language and Environment), which was important for the SED project’s demonstration of rapid prototyping in SETL in a cartography application containing a package of computational geometry algorithms [Bistiolas et al. 1989].” [Bacon 2000]
"Encouraged by SED’s contributions to the art of programming in the large and by the project’s inchoate plans for persistence in SETL, but disappointed by SED’s failure to arrive at a coherent product [Doberkat et al. 1992a], Ernst-Erich Doberkat proposed integrating persistent backing stores called P-files and their requisite namespace support into SETL/E [Doberkat 1990, Doberkat et al. 1990], a revision of SETL that was extended with a process creation operator and renamed ProSet [Doberkat et al. 1992b] to signify its role in prototyping. Doberkat’s interest in SETL during the 1980s [Doberkat et al. 1983, Doberkat 1985, Hyun and Doberkat 1985, Doberkat and Gutenbeil 1987, Doberkat and Fox 1989, Doberkat et al. 1989] grew in the 1990s into a more general interest in software engineering with set-oriented languages having intrinsic persistence features [Doberkat and Gutenbeil 1991, Doberkat and Sobottka 1991, Doberkat 1992, Doberkat et al. 1992a, Doberkat et al. 1992b, Doberkat et al. 1996] that sought to spare the programmer the trouble of coding data movement operations explicitly. Willi Hasselbring also showed how to translate [Hasselbring 1991a] a subset of SETL/E into Snyder’s SETL2. ProSet tuples are natural candidates for inter-process communication via Linda tuple spaces [Gelernter 1985], so Hasselbring has worked extensively throughout the 1990s on and with a methodology for prototyping concurrent applications using a hybrid system called ProSet-Linda [Hasselbring 1991b, Hasselbring 1992, Hasselbring 1993, Hasselbring 1994, Hasselbring and Fisher 1994, Hasselbring and Fisher 1995, Hasselbring et al. 1997, Hasselbring 1997b], and compared this approach to several others [Hasselbring 1997a, Hasselbring and Kröber 1998]." [Bacon 2000]
"Jean-Pierre Keller, the leader of the SED project, went on to define a SETL descendant which he called Cantor [Keller 1991, Keller 1994]." [Bacon 2000]
"In its current version, Cantor has been derived from iSETL (by Gary Levin). Cantor has the usual collection of statements common to procedural languages, but a richer and simpler set of expressions" [Keller 1994, Volume 2, "Cantor short history"]
Keller used Cantor to write the NAP first-order logic proof checker [Keller and Paige 1994]. NAP was itself intended for use in proving properties of Cantor programs [Keller and Paige 1995].
David Bacon designed and implemented another version of SETL at NYU in the 1990s.
The section "A Brief History of SETL" and the Bibliography have been invaluable in preparing this web site.
A debugger based on maintaining a complete execution history, using persistent data structures, was built. It used the translator from Bacon's GNU SETL, together with a runtime system written in Scheme.
SETL4 is an implementation of SETL written by Dave Shields in Macro SPITBOL [Dewar and McCann 1977]. The name stands for the fact that it followed several implementations of SETL by the original project team.
SPITBOL is an implementation of SNOBOL4. The original version was designed by Robert B. K. Dewar and Ken Belcher in 1969, for the IBM System 360. Later Dewar and Anthony P. McCann of Leeds University created Minimal, a portable assembly language and then rewrote SPITBOL in Minimal, resulting in Macro SPITBOL.
See in particular the file setl4, which is the Spitbol source program.
"setlX is an interpreter for the high level programming language SetlX (set language extended).
The most distinguishing feature of this language is the support it offers for sets and lists. As set theory is the language of mathematics, many mathematical algorithms that are formulated in terms of set theory have very straightforward implementations in SetlX.
Designed mostly by Karl Stroetmann, the SetlX language is an evolution of Setl by Jack Schwartz. It was specifically conceived to make the unique features of Setl more accessible to today's computer science students." [setlx.randoom.org]
Like SETL, SetlX is dynamically typed. It has a terse syntax based on C and Java. The language combines SETL's support for sets and tuples (called lists), and adds Prolog-like terms, support for functional programming, regular expressions, limited backtracking, object-oriented programming, vectors and matrices, several probability distributions, a plotting library, and a small set of graphical primitives. It is implemented in Java and is thus multiplatform; the implementation is called setlX.
Also serves as complete reference manual.
A follow-on rather than a dialect, Griffin was heavily influenced by the experience of using SETL.
"The Griffin [Dewar et al. 1992] language which was designed at New York University in the early 1990s was intended to be a general-purpose successor to SETL. Its goals were very lofty, and included what is surely the most comprehensive type and type inference system ever proposed. Griffin was supposed to give the programmer complete freedom of choice as to whether to code in a functional or an imperative style. It had language constructs for database-style transactions, namespaces, and persistence. Real-time facilities, Ada interoperability, exceptions, and broad support for concurrency were also built in at the language level. Much of Griffin has been implemented in a compiler, but a major obstacle to the completion of that enormous task has been the difficulty of fixing on a fully self-consistent language definition." [Bacon 2000]
"Griffin was a short-lived effort at language design, which was unusual in its attempt to combine ideas from various trends in language design, coming from SETL, LISP (via BALM, the language designed by Malcom Harrision), ML and Haskell and type inference (thanks to the participation of the late Paul Hudak who had been Ben Goldberg's thesis advisor at Yale). These different trends did not lead to a viable new language. but produced a number of extremely stimulating discussions among the participants!" [Ed Schonberg, private conversation, 5 October 2021]
This project would not have been possible without help from many people. Special thanks go to Annie Liu for coaxing me to begin this project in the spring of 2020, and to Nigel Williams, who had presciently started the search for artifacts back in 2013.
"In the mid-1960s Jack began to devote his creative energies to the emerging field of computer science. In 1969 he founded the Computer Science Department at New York University under the umbrella of the Courant Institute of Mathematical sciences. He served as chair of the department until 1977. Jack’s directorship of the Information Science and Technology Office at the Defense Advanced Research Projects agency during his two-year leave (1987-1989) from NYU enabled him to seriously influence the direction of research in computer science in the United States." [Davis and Schonberg 2011]
"I wasn't present at the time, but a mutual colleague said roughly: Jack went home one weekend with a 7090 manual and came back on Monday having written NU SPEAK [an early list processing language]." [Peter Capek, personal commumication, 21 May 2020]
"I met John [Cocke] at Yorktown. I had written the paper -- the paper appeared in 1967 so this must have been around 1966 -- on parallel computing which they had somehow noticed and contacted me and I met him at that time. I believe he and Jack Bertram visited at New York University. After the visit I came up to Yorktown; I spent a summer at Yorktown learning something about how machine architecture was actually done, getting familiar with the basic issues and then continued as a consultant in the California days of the [ACS] project." [Van Deusen 1990]
A management shuffle at the ACS project led to an exodus: "Among those leaving, Cocke took a sabbatical at the Courant Institute of Mathematical Sciences at New York University to work with Jacob Schwartz ..." [Smotherman et al. 2016]
See also: M. C. Harrison. Implementations of the SHARER2 time-sharing system. Commun. ACM, Volume 11, Number 12 (Dec. 1968), page 845. ACM Digital Library
"John certainly had a great influence on me. He had in his head ... a whole host of compiler optimization techniques that were quite novel and became quite influential. The intellectual history there really is that I learned them from him and wrote them down. They were written down first in a series of internal IBM technical reports connected with the ACS project that must have been, oh, 1972 or something like that. But those never were publicly distributed you know some people still have them as as old ACS workbook files and that sort of thing. Then I wrote them down in a set of notes on compilers -- these are the old Cocke and Schwartz notes on compilers, and I think from there they got into the more standard textbook literature - you know, Aho and Ullman [1977] and that sort of place. But I think the first publications of them were in that set of notes, the Cocke and Schwarz notes." [Van Deusen 1990]
"I used this material to produce the optimizer for the compilers we wrote for IBM’s RISC machines. The optimizer and register allocator (invented by Greg Chaitin) were so effective that we made back-ends for the compiler to also generate code for the IBM/370, and the Motorola 68000, which IBM used extensively as an embedded microprocessor in many devices. For a while, I think IBM had more 68000 code than anyone else, and the high quality of the code harks back to the optimizations gotten from Cocke and Schwartz. I still have a copy, but the spine is completely worn off!" [Peter Markstein, personal communication, 27 March 2021]
"The name BALM is actually an acronym (Block And List Manipulator) but is also intended to imply that its use should produce a soothing effect on the worried programmer. The system has the following features:
1. An Algol-like input language, which is translated into an intermediate language prior to execution.
2. Data-objects of type list, vector and string, with a simple external representation for reading and printing and with appropriate operations.
3. The provision for changing or extending the language by the addition of new prefix or infix operators, together with macros for specifying their translation into the intermediate language.
4. Availability of a batch version and a conversational version with basic file editing facilities.
The intermediate language is actually a form of LISP 1.5 which has been extended by the incorporation of new data-types corresponding to vector, string and entry-point. The interpreter is a somewhat smoother and more general version of the LISP 1.5 interpreter, using value-cells rather than an association-list for looking up bindings, and no distinction between functional and other bindings. The system is implemented in a mixture of Fortran (!) and MLISP, a machine-independent macro-language similar to LISP which is translated by a standard macro-assembler. New routines written in Fortran or MLISP can be added by the user, though if Fortran is used a certain amount of implementation knowledge is necessary." [Harrison 1970]
"The MBALM machine is designed to provide as much freedom as possible in the choice of representation in the object machine. No assumptions are made about the representation of code, for example, so the instruction which creates a procedure from a vector of integers representing the MBALM code is free to leave it as is, or to translate it into more efficient code. We have used this property to implement a faster version of the MBALM emulator which translates the MBALM code into machine code, which is then executed directly [Paige 1972]. The MBALM (and BALM4) programmer is unaware of this change, except that his program executes faster, and takes more memory."
SETL Newsletter Number 5 ("Miscellaneous algorithms written in SETL", 18 November 1970), begins "Here are various algorithms from Harrison's 'Data structures and programming' and from other sources written out in SETL ..."
Chapter 19 describes the BALM language, and Appendix B is a listing of a BALM translator written in BALM.
These were similar to the SETL Newsletters, but focussed on the BALM language and its implementation. Some 20 were issued between 1971 and 1972; they are listed in Part 5 of [Abes 1973].
Chapter 8 of Programming language and their compilers, "The self-compiling compiler", sketches how to port a self-compiling compiler to a new target machine via a bootstrapping process. To make the description more concrete, it presents a particular programming language as an appropriate vehicle for writing such a portable, self-compiling compiler. That language is called LITTLE.
"In defining this language, it is our intent to incorporate only that relatively minima1 set of essential features which are truly necessary in the convenient expression of compiler function." [Cocke and Schwartz 1970, page 649]
The language is similar to FORTRAN, but with one datatype, the bitstring. The report goes on to give a full syntax for LITTLE, as well as an overview of how its compiler would be organized.
A few years later Schwartz initiated the SETL project, and by 1971 it became clear that a more efficient implementation language than BALM was needed; work began on a serious implementation of LITTLE. Over the next decade, implementations of LITTLE were created for the CDC 6000, IBM System/370, Digital Equipment DECsystem-10, and Digital Equipment VAX-11/780. [Shields 1982]
LITTLE vs. C:
"... LITTLE was conceived by Jack before C was available. I remember Jack being favorably impressed by the design of C later on. Much of the low-level implementation work at NYU was influenced by the fact that the one available large machine was a CDC6000 with an awkward 60-bit word and no byte addressability, so that Jack saw machine-level programming as bit-vector manipulations (and 64-bit arithmetic). The SETL library had a large collection of operations on bit-vectors, but the 6600 (Cray's design) was really designed for numerical computations, and LITTLE was an awkward fit." [Ed Schonberg, personal communication, 13 October 2022]
These were similar to the SETL Newsletters, but focussed on the LITTLE language and its implementation. Lists of the authors, titles, and dates are included in [Abes 1973] and [Abes 1980].
LITTLE Newsletter. Scans of Numbers 2-42A (constituting some 465 pages, with some gaps) were provided by David Bacon.
Describes the syntax and semantics of the language. Includes examples.
Describes the implementation of LITTLE, including lexical scan, parse and semantic analysis, code generation, and library (used at compile time and also at runtime of generated code).
Robert Abes, Edith Deak, Richard Kenner, David Shields, and Aaron Stein are the principal authors of the LITTLE compiler (LEX, GEN, LIB, and TST). LITTLE is in the public domain except [Shields 1982], which is Copyright © 1982 David Shields and is included here by permission from Shields. For a complete snapshot circa 1984, see the little directory of setl-vax-unix.tar.xz in the LITTLE-based SETL section.
In 1978 there was a collaboration with Anthony McCann and later Nigel Chapman (University of Leeds) on a plan to port LITTLE to the DEC PDP-10. It appears that Richard Kenner and David Shields ended up doing the bulk of the work but used a set of macros by Chapman to translate from the artificial targer t10 to valid MACRO-10 code -- see little/mk/ltlasm.opl for details. Chapman ended up developing Setl-s, using the MINIMAL language from MACRO SPITBOL by Dewar and McCann.
In 1980, Rob Pike at Bell Labs ported SETL from VAX/VMS to VAX UNIX. See comments in the source files little/mk/clib.c.bky, little/mk/clib.c.uts, little/mk/ltlt32.c, little/mk/LOG, little/clarkson/t32.c, and setl/bin/README.
"I do have a sometimes hilarious Bell Labs memo written by Rob Pike, as one of his very first jobs at BTL was to port the VAX implementation to Unix 32v, 32v was BTL's version of a paging Unix from VAX, though Berkeley of course won that battle." [Dave Shields, personal communication, 9 April 2020]
"I did the port because Doug McIlroy asked me to, and it was my job. ... I enjoyed the work and enjoyed my time meeting with people at NYU ... I did very much enjoy meeting Jack and Robert ... I had to port LITTLE first, to do SETL. " [Rob Pike, personal communication, 13 and 14 April 2020]
In addition to the .opl files below, there are .upd files with changes corresponding to some of them; see the .tar.xz file.
"This program translates an annotated BNF-like description of a programming language grammar into tables and text fragments which form the central part of a parser for the language, which consists of an interpreter for an abstract parsing machine."
Nigel Williams and Paul McJones (paul@mcjones.org), editors.
"Robert Dewar joined the NYU faculty in the mid-1970’s. He was a member of the Algol-68 committee, and among other things had created SPITBOL, a remarkably efficient implementation of SNOBOL, the pattern-matching language designed by Ralph Griswold. He immediately improved the perfomance of SETL and displayed an extraordinary talent for software design and language design. At the time Jack Schwartz was interested in developing techniques for Software Prototyping using very high-level languages, and the Department of Defense was in the process of looking for a common programming language for all the needs of the DoD. ... In any case, the design of Ada took several years, and the production of several requirement documents with the names Strawman, Stoneman, Ironman, and Steelman. This last one was the basis for an international competition for a new programming language, which became Ada83. To showcase the capabilities of SETL as a prototyping language, Jack and Robert got a grant from the US Army to create a prototype implementation of the new language. Robert wrote what was in essence an executable denotational definition of the language, including all of the concurrency model of the language, with tasks, entries, and rendez-vous as the basic synchronization constructs. I wrote a conventional semantic analyzer to perform static correctness checks (visibility, typing, overload resolution). The fact that this was an executable program meant that it could also be used early on to check the test suite being written at the sase time (by Softech, under the leadership of John Goodenough). Ada/Ed (which later became NYU/ADA) was extremely slow (we used to say that it was for the real-time simulation of paper-and-pencil calculations) but it was sufficient to execute the test suite (ACVC, or Ada Compiler Validation Capability) whose purpose was to ensure that all compilers were conformant to the language definition, ensuring the portability of Ada code. NYU/Ada ended up being the first officially validated implementation of the language (which was politically convenient for the DoD because it was not a commercial product).
NYU/Ada was used as a teaching tool for a few years, and served as a design model for a more efficient interpreter (written in C) and years later as the basis for the GNAT compiler. This was another demonstration project involving the next revision of the language, which became Ada95. This was a large addition and redesign of the language, and GNAT was intended to show the implementability of this new language, whose lead designer was Tucker Taft (then at Intermetrics). Like other languages with an ISO presence, there is an official maintenance committee (known as the ARG , or Ada Rapporteur Group) in charge of revising the standard periodically, and releasing new versions of the language. There have been such in 2005 and 2012, and a new one will be released early next year. Tucker Taft (now at Adacore) remains the chief designer and guide of the evolution of the language.
GNAT stands for “GNU-NYU Ada translator” indicating our close association with the Free Software community, and the use of the GNU code generator as the back-end of the compiler. (Internally the name was also understood to mean an irritant to developers of other commercial compilers for Ada)." [Ed Schonberg, personal communication, 1 April 2020]
See Version 1.4 below.
Abstract: "The NYU-Ada project is engaged in the design and implementation of a translator-interpreter for the Ada language. The objectives of this project are twofold: a) to provide an executable semantic model for the full Ada language, that can be used for teaching, and serve as a starting point for the design of an efficient Ada compiler; b) to serve as a testing ground for the software methodology that has emerged from our experience with the very-high level language SETL. In accordance with these objectives, the NYU-Ada system is written in a particularly high-level, abstract SETL style that emphasizes clarity of design and user interface over speed and efficiency. A number of unusual design features of the translator and interpreter follow from this emphasis. Some of these features are described below. We also discuss the question of semantic specification of programming languages, and the general methodology of 'Software Prototyping' of which the NYU-Ada system is a sizeable example."
The New York University Ada translator (NYU Ada/Ed) version 19.7 (March 21, 1983), was tested with version 1.1 (March 4, 1983) of the ACVC validation tests. 1.1 of the test suite contained 1,633 tests, of which 1,325 were applicable to Version NYU Ada/Ed. Of the applicable tests, 14 were withdrawn, due to errors in the tests. NYU Ada/Ed passed all of the remaining 1,311 applicable correct tests.
The source code of this version was published as a two-part NYU technical report. Staff at the Courant Institute scanned these reports for Nigel Williams in 2018.
This version was validated on a Sun-3/60 in 1989:
Version 1.10 was distributed through the National Technical Information Service for VAX computers running VMS Version 5.2 and UNIX (version unknown). Here are reports describing the availability of distribution tapes:
Ada/Ed-C was a rewrite from SETL to C, for Unix-based machines. This paper describes the design process:
Version 11.11.0a could be built for various Unix systems; see #ifdefs in config.h.
Version 1.11.2, as released by the NYUADA project, could be built for the following targets:
"The sources for NYU Ada/ED Version 1.11.2, obtained from http://www2.informatik.uni-stuttgart.de/iste/ps/ada-software/html/dos_ada.html on 9 July 2012.
NYU Ada/Ed was the first Ada compiler to pass the Ada Compiler Validation Suite. Ada/ED for the IBM PC was the first compiler to pass the suite on an IBM PC.
I initiated the project in late 1983, after finishing my PhD disseration. I worked on it until mid 1987, until I left NYU's Courant Institute of Mathematical Sciences (CIMS) to join IBM Research."
In addition, the GW-Ada/Ed project at The George Washington University, led by Professor Michael B. Feldman, produced the GW-Ada/Ed Program Development Environment for Macintosh and MS-DOS that extenderd the Ada/Ed interpreter with an editor and a "shell" to control compiling, binding and execution of an Ada program.
The nyu directory contains the original nyu/Adaed-1.11.2/) and copies (nyu/src/) apparently only differing by some renaming to produce shorter filenames. The gwu directory contains subdirectories for DOS (gwu/dos/) and Macintosh (gwu/mac/); within the Macintosh subdirectory there are several copies of most source files for unknown reasons.
In the late 1980s, W. Kirk Snyder designed and implemented a successor to SETL called SETL2. A version of the original Ada/Ed was modified for SETL2. Mentioned in [Bacon 2000], page 21.
THE END