[SCC_Active_Members] Capturing information from the WWW
Al Kossow
aek at bitsavers.org
Mon Jan 22 09:27:14 PST 2007
H.M. Gladney wrote:
> Relative to the SPG emphasis on capturing stuff, doing so with a view to
> classifying, accessioning, obtaining authorization later, etc., it will
> from time to time be of interest to capture a large set of files in the
> directory tree hanging from some interesting Web page.
>
> There probably are several available tools to accomplish this.
I normally use some variant of the Unix 'wget' command.
Whatever you use, make sure it preserves the dates of the original
files.
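A typical invocation, as a sketch (the URL and the wait interval are
placeholders, and exact option names can vary between wget versions):

  # Mirror the tree below the starting URL, keeping original file dates.
  # --mirror enables recursion plus timestamping; GNU wget also stamps
  # each saved file with the server's Last-Modified date by default,
  # which is what preserves the original dates.
  # --no-parent keeps the crawl from climbing above the start directory,
  # --page-requisites grabs the images/CSS a page needs to render, and
  # --convert-links rewrites links so the copy browses locally.
  wget --mirror --no-parent --page-requisites --convert-links \
       --wait=1 http://www.example.org/interesting/tree/

Spot-checking a few of the saved files with 'ls -l' against the live
site is a quick way to confirm the dates came through.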
I now have about 100 site snapshots on the RAID at CHM (the Computer
History Museum), containing about 2 million files.