[SCC_Active_Members] "The realities of software preservation"

Thu Apr 12 14:35:58 PDT 2007

H.M. Gladney wrote:

> I would like to draw the participants' attention to one such apparent 
> unspoken assumption: that the pace of collecting "digital stuff" should 
> be coupled to the pace of accession.

http://community.computerhistory.org/scc/workshop/CHM_Attic_Parlor_Workshop_Proceedings_May-05-063.pdf

The workshop proceedings above were about the 'attic' and the 'parlor'.

Consider official museum accession, or curation and analysis, as the 'parlor'.
There has always been the assumption that as things were found there had to
be an 'attic', and the two have been assumed to coexist. If someone finds
an artifact that fits somewhere in the matrix that is the history of computing,
CHM should be one of the places where it can find a home.

So, what are the 'realities' of software preservation?
  (this is part of the document I am generating, so it is all partially baked)

It is my belief that you have to discuss software within the context of a system;
how the combination of technology that was available in some period of time was
used to solve a problem. The fact that there have been several generations now
of implementations of solutions to some of these problems provides historical
continuity.

What are some specific needs for the preservation of these systems?

- historical fact gathering
   (oral histories, personal/corporate internal technical papers)

- preservation of documentation (paper/machine readable)

- preservation of the software itself (binaries/sources)

The first is a mix of technical and non-technical information to try to put what
follows in historical context. Who, What, When, Where, Why.

The second describes the system's intended use from the people who implemented it.

The third is the system itself.

If this were a perfect world, companies would have saved all of this. The reality is that
there was no business reason for them to do so. As a result, the historical record is a
patchwork that has to be assembled as information can be found. Concrete examples of this
process are how early FORTRAN was traced by Paul McJones, and how NLS is being bundled for
historical preservation. You can see the parts fill in over time.

So, the first part of the puzzle is a mechanism for tracking many parallel historical
topics, which may take years to fill in. This was one of the reasons that I created
bitsavers.org. You may have noticed that there are many, many manufacturers that are
fleshed out as information on them can be found.

The third, preservation of the bits, is the least developed. The process for accepting
physical and digital donations and putting them into a structure similar to, but more
complicated than, that of the documents on bitsavers is something that I have been working
on since becoming Software Curator at CHM.

For those of you who have seen my office at CHM, there has been a whiteboard with the
layout of this process on it for a while now. I'll cover it in detail in the 'realities'
document that I'm working on.

There are three main parts to the process, and they are pretty similar to those of other
content management systems:

- data capture
- preservation of that data in a verifiable stable store
- analysis/annotation
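
As a rough illustration of what 'verifiable' could mean for the second part, here is
a minimal sketch in Python. It is not the process used at CHM or on bitsavers; the
function names (sha256_of, build_manifest, verify) and the choice of SHA-256 are
assumptions made for the example. The idea is just to record a checksum for every
captured file, then recompute the checksums later to detect loss or corruption.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest at capture time."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose contents changed."""
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems
```

The manifest would be built once at capture time and stored alongside (or apart from)
the data; running verify() against it later returns an empty list if nothing has
changed.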

There are a LOT of opportunities for work by volunteers in data capture and analysis
as well as evaluation and annotation of the existing collection.

What are the known audiences for this material?

- museum exhibits
- restoration
- simulations

There is clearly a much wider audience, but those three are the ones for which I have
provided the majority of archive research time over the past year.

This is all context. This does not attempt to define any detail on how such an archival
process would work.

--- more to follow.