Abstract
Commercial media companies have embraced computational analytics to study discussions of media content across social media data streams. Data mining companies identify actors and TV shows that are “trending” in global popularity and produce more granular analyses of regional tastes, social networks, and discourse. We propose to apply a similar methodology to the study of film and media history. Project Arclight (http://projectarclight.org) will create a web-based tool that enables the study of 20th-century American media through comparisons across time and space. The Arclight tool will be built using several popular open-source technologies, including Ruby on Rails, JavaScript, and Solr. The tool will analyze roughly two million pages of public domain publications drawn from two repositories: the Media History Digital Library (which uses the Internet Archive’s scanning, hosting, and preservation services) and the Library of Congress’s Chronicling America collection.