Integrating Data Mining and Data Management Technologies for Scholarly Inquiry

This project will integrate large-scale collections, including JSTOR and the book collections of the Internet Archive, stored and managed in a distributed preservation environment. It will also incorporate text mining and Natural Language Processing software capable of generating dynamic links to related resources that discuss the same persons, places, and events. This 17-month project goes beyond basic analysis by developing a prototype system that gives scholars expert-system support in their work.

Principal Investigators

Ray R. Larson, University of California, Berkeley, US, IMLS
Richard Marciano, University of North Carolina at Chapel Hill, US, IMLS
Paul B. Watry, University of Liverpool, UK, AHRC/ESRC/JISC
Additional participating institutions: Internet Archive, JSTOR