Chronicling America Library of Congress National Digital Newspaper Program

About: 

As of March 2011, Chronicling America provides free, searchable access to more than 3.3 million pages of historic newspapers published between 1860 and 1922. These newspapers are selected and digitized by NEH awardees through the National Digital Newspaper Program (http://www.neh.gov/projects/ndnp.html), following Library of Congress technical guidelines (see http://www.loc.gov/ndnp/guidelines/). Page-level data presented through Chronicling America include JPEG2000 and PDF images and searchable page text. To date, twenty-two state awardees and the Library of Congress have contributed content to the site from newspapers published in Arizona, California, the District of Columbia, Florida, Hawaii, Illinois, Kansas, Kentucky, Louisiana, Minnesota, Missouri, Montana, Nebraska, New York, Ohio, Oklahoma, Oregon, Pennsylvania, South Carolina, Texas, Utah, Virginia and Washington. Three additional states (New Mexico, Tennessee and Vermont) will be adding content in Spring 2011. The site will continue to expand over time (potentially over twenty years) to eventually include newspapers published between 1836 and 1922 from all 54 states and territories.

Contact: 

Tech support contact: David Brunton (dbrun@loc.gov) and Nathan Yarasavage (nyarasavage@loc.gov), National Digital Newspaper Program, Library of Congress.

Links to APIs and documentation:

The Library makes available the digitized text (created through Optical Character Recognition) of more than three million newspaper pages in the METS/ALTO XML format (see http://www.loc.gov/standards/alto/). For each page of OCR text, the Library includes a permanent link to an image of the page, from which additional metadata can be derived.
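As a rough illustration of working with the OCR text, the sketch below pulls the words out of an ALTO document line by line. It assumes only the general ALTO layout, in which TextLine elements contain String elements whose CONTENT attribute holds one recognized word. The sample document, its placeholder namespace, and the function names are invented for this example and are not taken from the Library's files; tags are matched by local name because the namespace URI differs between ALTO versions.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def alto_text_lines(alto_xml):
    """Return the OCR text of an ALTO document as a list of lines."""
    root = ET.fromstring(alto_xml)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # Each String child carries one recognized word in CONTENT;
        # other children (such as SP spacing elements) are skipped.
        words = [
            child.get("CONTENT", "")
            for child in element
            if local_name(child.tag) == "String"
        ]
        lines.append(" ".join(word for word in words if word))
    return lines


# Hypothetical, heavily trimmed ALTO fragment used only for illustration.
SAMPLE = """<alto xmlns="http://example.org/alto-placeholder">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="THE" HPOS="10" VPOS="20" WIDTH="30" HEIGHT="12"/>
      <SP/>
      <String CONTENT="DAILY" HPOS="45" VPOS="20" WIDTH="50" HEIGHT="12"/>
      <SP/>
      <String CONTENT="NEWS" HPOS="100" VPOS="20" WIDTH="45" HEIGHT="12"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Vol." HPOS="10" VPOS="40" WIDTH="30" HEIGHT="12"/>
      <SP/>
      <String CONTENT="1" HPOS="45" VPOS="40" WIDTH="8" HEIGHT="12"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""

page_lines = alto_text_lines(SAMPLE)
```

The HPOS, VPOS, WIDTH and HEIGHT attributes on each String give the word's position on the page, so the same traversal could be extended to relate words to regions of the page image.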

The Library provides an OpenSearch API [1], with results returned in HTML, JSON, or Atom, at the researcher's discretion. For each search result, the Library provides pointers to additional information based on a URI Template [2].

[1] http://www.opensearch.org/Home

[2] http://bitworking.org/projects/URI-Templates/spec/draft-gregorio-uritemplate-03.txt
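To show how a URI Template is turned into a concrete request address, here is a minimal sketch of simple variable expansion, where each {name} placeholder is replaced by a percent-encoded value. The template string, the example.org host, and the variable names are placeholders chosen for illustration, not the Library's actual search endpoint or parameters. Only plain {name} variables are handled; the operator syntax defined in the template draft is not.

```python
import re
from urllib.parse import quote

# Matches simple placeholders such as {searchTerms}.
_VARIABLE = re.compile(r"\{(\w+)\}")


def expand_template(template, values):
    """Replace each {name} in the template with its percent-encoded value."""

    def substitute(match):
        name = match.group(1)
        # An undefined variable expands to the empty string.
        return quote(str(values.get(name, "")), safe="")

    return _VARIABLE.sub(substitute, template)


# Hypothetical template; the real one would come from the service itself.
TEMPLATE = "http://example.org/search?q={searchTerms}&format={format}"

url = expand_template(TEMPLATE, {"searchTerms": "civil war", "format": "json"})
```

Passing safe="" to quote ensures that characters such as "/" and "&" inside a value are encoded and cannot alter the structure of the resulting address.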

Terms of Service: 

Data provided by the Library is for the sole use of the awardee in support of research as described in the Digging into Data proposal and should not be re-used or re-distributed for any other purpose without permission.

The Library reserves the right to block IP addresses that fail to honor the Library's robots.txt files or that submit requests at a rate that negatively impacts service delivery to all Library patrons. Current guidelines recommend that software programs submit a total of no more than 10 requests per minute to Library applications, regardless of the number of machines used to submit requests. The Library also reserves the right to terminate programs that require more than 24 hours to complete. (See http://www.loc.gov/homepage/legal.html for more information.)
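One way a client program might stay within the recommended rate is a sliding-window throttle that is called before every request. The sketch below is an illustration only: the class name and its parameters are invented here, and the defaults simply mirror the 10 requests per minute figure above. It limits a single process, so a job spread over several machines would need to divide the budget among them or coordinate through a shared service.

```python
import time
from collections import deque


class RequestThrottle:
    """Allow at most max_requests calls to wait() in any period seconds."""

    def __init__(self, max_requests=10, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.period = period
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._sent = deque()     # timestamps of recent requests, oldest first

    def _discard_expired(self, now):
        # Forget requests that have left the sliding window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()

    def wait(self):
        """Block until another request may be sent, then record it."""
        now = self._clock()
        self._discard_expired(now)
        if len(self._sent) >= self.max_requests:
            # Sleep until the oldest recorded request leaves the window.
            self._sleep(self.period - (now - self._sent[0]))
            self._sent.popleft()
            now = self._clock()
            self._discard_expired(now)
        self._sent.append(now)


# Demonstration with a simulated clock, so nothing actually sleeps.
simulated_time = [0.0]
sleeps = []


def fake_clock():
    return simulated_time[0]


def fake_sleep(seconds):
    sleeps.append(seconds)
    simulated_time[0] += seconds


throttle = RequestThrottle(clock=fake_clock, sleep=fake_sleep)
for _ in range(11):
    throttle.wait()  # in real use, send one request after each call
```

In real use the throttle would be constructed with its default clock and sleep functions, and throttle.wait() would be called immediately before each request is sent.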