[SCC_Active_Members]
RE: [Dspace-general] Google Scholar inclusion of DSpace repositories
Lee Courtney
lcourtney at mvista.com
Tue Nov 30 14:34:48 PST 2004
Dear fellow Software Collection Committee members:
Very interesting tidbit from the DSpace general mailing list:
> I wanted to mention that the new Google Scholar search
> (http://scholar.google.com) is including items from
> DSpace repositories in the results, as long as they're open for
> harvesting the full-text.
This has *very* interesting implications for the Historic Software Archive.
If we base our repository infrastructure on a tool that facilitates this
external indexing and search inclusion, then it appears content will get
included in searches by Google et al.
At a minimum, this should be a requirement for whatever software we end up
using to house artifacts in the Software Collection.
Thoughts?
Cheers,
Lee Courtney
MontaVista Software
1237 East Arques Avenue
Sunnyvale, California 94085
(408) 328-9238 voice
(408) 328-9204 fax
> -----Original Message-----
> From: dspace-general-bounces at mit.edu
> [mailto:dspace-general-bounces at mit.edu] On Behalf Of MacKenzie Smith
> Sent: Sunday, November 28, 2004 12:42 PM
> To: dspace-general at mit.edu
> Subject: [Dspace-general] Google Scholar inclusion of DSpace
> repositories
>
>
> Hi all,
>
> I wanted to mention that the new Google Scholar search
> (http://scholar.google.com) is including items from DSpace repositories
> in the results, as long as they're open for harvesting the full-text. I
> did notice that some institutions running DSpace that should be there
> aren't yet, so I've asked Google why they're missing.
>
> It can be a little tricky to figure out if your institution is getting
> included or not -- search some known items from your repository and plow
> through all the results, and be sure to check all the versions since your
> copy might not be one of the first listed. If you're there, great, and if
> you're not (and want to be) then first make sure your repository's web
> server isn't blocking crawlers, and then write to me or them directly
> (scholar-support at google.com) to make sure they crawl your site.
>
> They also wanted me to mention that if you have limited access material
> that you would like to get indexed by Google but not cached by them for
> display, they're very interested in working with you. For example, at MIT
> we have some book titles from the MIT Press in our DSpace repository which
> are only available for free to the MIT community. Google proposes to index
> them, but not cache them, so that when a searcher finds one of them in a
> result set in google.com they're returned to DSpace to view the item and
> can get to the Press's online ordering system from there. More traffic for
> the book, more money for the Press. Let me know if you're interested in
> this and I'll put you in touch with the Google folks. Remember: if your
> DSpace content is freely available to the public then Google and the other
> web search engines should *already* be harvesting it so you don't need to
> do anything...
>
> MacKenzie
>
>
> MacKenzie Smith
> Associate Director for Technology
> MIT Libraries
> Building E25-131d
> 77 Massachusetts Avenue
> Cambridge, MA 02139
> (617)253-8184
> kenzie at mit.edu
>
> _______________________________________________
> Dspace-general mailing list
> Dspace-general at mit.edu
> http://mailman.mit.edu/mailman/listinfo/dspace-general
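
The quoted message suggests first making sure the repository's web server
isn't blocking crawlers. One common place such a block lives is the site's
robots.txt file, so a minimal sketch of that check, using Python's standard
urllib.robotparser, might look like the following. The host name, the item
URL, the sample robots.txt contents, and the "Googlebot" user-agent string
used here are all illustrative assumptions, not details from the thread, and
robots.txt is only one of several ways a server can turn crawlers away.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # Parse the rules from text so the check needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that shuts out every crawler.
BLOCKING_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that only hides an admin area.
OPEN_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""

# Hypothetical item URL on a hypothetical repository host.
ITEM_URL = "http://repository.example.org/handle/123/456"

if __name__ == "__main__":
    for label, rules in (("blocking", BLOCKING_ROBOTS), ("open", OPEN_ROBOTS)):
        allowed = crawler_allowed(rules, "Googlebot", ITEM_URL)
        print(f"{label}: crawler allowed = {allowed}")
```

Against a live site, the same parser can load the real file with set_url()
and read() instead of parse(); the text-based form above just keeps the
sketch self-contained.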