<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>Content Management pilot for CHM Software Preservation Group</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2 FACE="Arial">We have had many discussions about digital content management to support SPG work. At our June 26 meeting I will introduce you to a Greenstone Digital Library (GSDL) pilot installation. It is a first step toward infrastructure that can serve us well for 3 to 5 years for collecting, organizing, and showing off whatever software collections SPG can assemble. </FONT></P>
<P><FONT SIZE=2 FACE="Arial">A draft of presentation slides I plan to use is available at </FONT><A HREF="http://www.hgladney.net/CHMpres.pdf"><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/CHMpres.pdf</FONT></U></A><FONT SIZE=2 FACE="Arial">. These diagrams sketch what is in place today, and what tailoring is required to make this GSDL instance an attractive tool for SPG volunteers. </FONT></P>
<P><FONT SIZE=2 FACE="Arial">GSDL aspects that attracted me to this open-source package, but that are not obvious from its documentation, include the following:</FONT></P>
<P> <FONT SIZE=2 FACE="Arial">-- a GSDL server can scale to about 100 collections, each holding up to about 100,000 files within 10 GB;</FONT>
<BR> <FONT SIZE=2 FACE="Arial">-- SPG volunteers will be able to create and tailor GSDL collections from anywhere in the world;</FONT>
<BR> <FONT SIZE=2 FACE="Arial">-- GSDL creates search indices automatically for files of any format for which an information extraction plugin is provided;</FONT>
<BR> <FONT SIZE=2 FACE="Arial">-- Web presentations for museum visitors can exploit the content of a GSDL (as they can with any other CM offering);</FONT>
<BR> <FONT SIZE=2 FACE="Arial">-- GSDL supports arbitrary metadata schemas (Dublin Core and METS are available today); and</FONT>
<BR> <FONT SIZE=2 FACE="Arial">-- GSDL can export its collections in a standard format that other CM offerings can import.</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">I have ingested the Snobol collection from Arizona, and Paul McJones' Fortran, Lisp, and C++ collections. GSDL visitors can poke at those. No curatorial work has been done on these collections yet. Moreover, you will quickly discover that searching finds text documents and images, but not programs. As far as we know, no one in the world has created subroutines to extract useful search indices from program collections. Doing this well might be a nontrivial challenge. I hope to find volunteers to help tackle it, starting with obvious heuristics. (If we can invent useful algorithms, programming them as GSDL plugins will be neither difficult nor laborious.)</FONT></P>
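<P><FONT SIZE=2 FACE="Arial">To make "obvious heuristics" concrete, here is a minimal sketch of one possible starting point: treat every identifier and comment word in a source file as a candidate search term, split compound names into words, and discard language keywords. It is written in Python purely for illustration and does not use any GSDL plugin interface. The names <TT>index_terms</TT>, <TT>split_identifier</TT>, and <TT>STOP_WORDS</TT>, and the stop list itself, are assumptions made for this example, not part of GSDL or of any existing SPG tool.</FONT></P>
<PRE>
```python
import re
from collections import Counter

# Hypothetical stop list: a few keywords from Fortran, Lisp, and C++ plus
# common English words. A real list would need tuning per language.
STOP_WORDS = {
    "if", "else", "do", "end", "for", "while", "return", "int", "void",
    "call", "goto", "subroutine", "function", "defun", "setq", "lambda",
    "the", "and", "of", "to",
}

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
CAMEL_BOUNDARY = re.compile(r"([a-z0-9])([A-Z])")


def split_identifier(name):
    """Split snake_case and camelCase names into lowercase words."""
    spaced = CAMEL_BOUNDARY.sub(r"\1 \2", name.replace("_", " "))
    return [word.lower() for word in spaced.split()]


def index_terms(source_text, min_length=3):
    """Count candidate search terms found in one program source file.

    Every identifier-like token (including words inside comments) is split
    into words; short words and stop words are dropped.
    """
    counts = Counter()
    for name in IDENTIFIER.findall(source_text):
        for word in split_identifier(name):
            if len(word) >= min_length and word not in STOP_WORDS:
                counts[word] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "C     COMPUTE THE DETERMINANT\n"
        "      SUBROUTINE MatInvert(A, N)\n"
        "      CALL lu_decompose(A, N)\n"
    )
    for term, count in sorted(index_terms(sample).items()):
        print(term, count)
```
</PRE>
<P><FONT SIZE=2 FACE="Arial">A heuristic this simple would treat all languages alike and would not distinguish a routine's name from a passing mention in a comment; it is meant only as a baseline against which better ideas could be compared.</FONT></P>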
<P><FONT SIZE=2 FACE="Arial">The current GSDL instance is a pilot, and will remain a pilot at least until year-end. I intend to grant library update privileges only to a few SPG volunteers until we are confident that enough GSDL functionality is working reliably, is attractive to its users, and makes them productive without much technical help. The tailoring required is outlined at </FONT><A HREF="http://www.hgladney.net/SPGprojects.htm"><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/SPGprojects.htm</FONT></U></A><FONT SIZE=2 FACE="Arial">.</FONT></P>
<P><FONT SIZE=2 FACE="Arial">The current server machine for the SPG GSDL (called HMG3) is a Linux PC running in my home. If the pilot proves as useful as I expect, we'll probably need to move the service to a more robust, better controlled, and faster platform before calling it a production service. Learning the practicalities of such a step is part of the purpose of the pilot. A few SPG members have been discussing this quietly, and will continue to think about it. This is not an urgent matter, because tailoring the pilot for CHM-SPG work will not reach sufficient refinement before 2008. <B> During 2007, comments, questions, and criticism of the pilot will be invaluable. Please don't hesitate.</B></FONT></P>
<P><FONT SIZE=2 FACE="Arial">Cheerio, Henry</FONT>
<BR><FONT SIZE=2 FACE="Arial">H.M. Gladney, Ph.D. <A HREF="http://home.pacbell.net/hgladney">http://home.pacbell.net/hgladney</A></FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">Web pages from which you can examine this activity are:</FONT>
</P>
<P><B><FONT SIZE=2 FACE="Arial">Pilot CHM-SPG DL: </FONT></B><A HREF="http://www.hgladney.net/gsdl/cgi-bin/library"><B><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/gsdl/cgi-bin/library</FONT></U></B></A>
<BR><FONT SIZE=2 FACE="Arial">Help for SPG librarians: </FONT><A HREF="http://www.hgladney.net/GSDLibrarian.htm"><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/GSDLibrarian.htm</FONT></U></A>
<BR><FONT SIZE=2 FACE="Arial">GSDL bibliography: </FONT><A HREF="http://www.hgladney.net/GSDLbib.htm"><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/GSDLbib.htm</FONT></U></A>
<BR><FONT SIZE=2 FACE="Arial">GSDL examples: </FONT><A HREF="http://www.hgladney.net/GSDLsites.htm"><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/GSDLsites.htm</FONT></U></A>
<BR><FONT SIZE=2 FACE="Arial">Projects to refine the CHM-SPG DL: </FONT><A HREF="http://www.hgladney.net/SPGprojects.htm"><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/SPGprojects.htm</FONT></U></A>
<BR><B><FONT SIZE=2 FACE="Arial">An overview with links to the resources above and more is available at </FONT></B><A HREF="http://www.hgladney.net/"><B><U><FONT COLOR="#0000FF" SIZE=2 FACE="Arial">http://www.hgladney.net/</FONT></U></B></A>
</P>
</BODY>
</HTML>