Round 2

Original Announcement of the Round Two (2011) Competition

During the first round, in 2009, nearly 90 international research teams competed in the challenge. Ultimately, eight remarkable projects were awarded grants.

In 2011, the Digging into Data Challenge has returned for a second round, this time much larger, with sponsorship from eight international research funders, representing Canada, the Netherlands, the United Kingdom, and the United States.

What is the "challenge" we speak of? The idea behind the Digging into Data Challenge is to address how "big data" changes the research landscape for the humanities and social sciences. Now that we have massive databases of materials used by scholars in the humanities and social sciences -- ranging from digitized books, newspapers, and music to transactional data like web searches, sensor data, or cell phone records -- what new, computationally based research methods might we apply? As the world becomes increasingly digital, new techniques will be needed to search, analyze, and understand these everyday materials. Digging into Data challenges the research community to help create the new research infrastructure for 21st-century scholarship.

Applicants will form international teams from at least two of the participating countries.  Winning teams will receive grants from two or more of the funding agencies and, two years later, will be invited to show off their work at a special conference sponsored by the eight funders.

Let's get digging.

Original Press Releases

Press Releases About the Launch of Round Two (March 2011)

AHRC, ESRC, IMLS, JISC, NEH, NSF, NWO, SSHRC

Press Releases about the Winners of Round Two (January 2012)

AHRC, ESRC, IMLS, JISC, NEH, NSF, NWO, SSHRC

Round Two Conference

At the end of each round of funding, the grantees gather to present their work. The second round conference was held on October 12, 2013, in Montréal, Quebec. To access the papers given at the conference and read profiles of the speakers, please see the Digging Round Two Conference page.

2011 Award Recipients:

This project seeks to combine the power of data mining techniques with the interpretive analytics of the humanities and social sciences to understand how newspapers shaped public opinion and represented authoritative knowledge during the deadly 1918 influenza pandemic. The project draws on more than 100 newspaper titles for 1918 available from Chronicling America at the United States Library of Congress and the Peel’s Prairie Provinces collection at the University of Alberta Library.

This project will examine topic lifecycles across heterogeneous corpora, including not only scholarly and scientific literature, but also social networks, blogs, and other materials. While the growth of large-scale datasets has enabled examination within scientific datasets, there is little research that looks across datasets.

This project will develop new ways of exploring the full text content of digital historical records. The project will demonstrate its approach using medieval charters which survive in abundance from the 12th to the 16th centuries and are one of the richest sources for studying the lives of people in the past. 

A project to develop and implement a multi-scale workbench, called "InterDebates", with the goal of digging into data provided by hundreds of thousands, eventually millions, of digitized books, bibliographic databases of journal articles, and comprehensive reference works written by experts.

This project will analyze a vast set of Open Access research publications using Natural Language Processing and social network analysis methods to identify patterns in the behavior of research communities, to recognize trends in research disciplines, to gain new insights into the citation behaviors of researchers, and to discover features that distinguish papers with high impact. This will enable the development of better methods for exploratory search and browsing in digital collections, as well as new ways of evaluating research or a researcher’s impact.

This project will develop an automated reader for large text archives of human rights abuses that will reconstruct stories from fragments scattered across a collection, and an interface for navigating those stories.  By improving on anaphora resolution techniques in Natural Language Processing for the connection of pronouns to specific nouns, this system will help researchers and courts reveal witnesses and patterns contained in their own collections.

The project will automatically generate new forms of metadata tags from existing metadata records and associated resources that will support discovery across multiple repositories.  The project will utilize four repositories that vary in size, domain, metadata creation method and workflow, and quality.  PERTAINS, a tool developed by one of the partner schools, will be used to analyze the metadata records in each repository and then to generate Dewey Decimal Classification-based tags.

A project to study changes in Western musical style from 1300 to 1900, using the digitized collections of several large music repositories. The team notes that in order to understand style change in Western polyphonic music we need to be able to describe acceptable vertical sonorities (chords) and melodic motions in each period, and how they change over time.

A project to explore new visualization techniques for use in large scale linguistic and literary corpora using the collections of the British National Corpus and various smaller archives of poetry. The team will investigate whether or not advanced visualization techniques can provide an interface that enables humanities researchers to use their domain knowledge dynamically, while using the computational capability of computers. In particular, can data visualization help users make new observations and generate new hypotheses?

This project is designed to provide mummy and medical researchers with a large-scale comparative database of medical imaging of mummified human remains. This departure from a case-study model for mummy studies will drive the field towards a large-scale comparative and epidemiological paradigm.

This project will develop an integrated environment using sophisticated text mining tools to facilitate knowledge discovery in social history research. It will provide social historians and social scientists with the means to detect and associate events, trends, people, organizations, and other entities of specific interest to social historians.

This project will integrate large-scale collections, including JSTOR and the book collections of the Internet Archive, stored and managed in a distributed preservation environment. It will also incorporate text mining and Natural Language Processing software capable of generating dynamic links to related resources discussing the same persons, places, and events. This 17-month project goes beyond basic analysis by delivering a prototype system that offers expert-system support to scholars in their work.

This project will make use of novel data-mining technology to exploit one of the largest population databases in the world, a vast collection of harmonized 19th and early 20th century census microdata from Britain, Canada, and the United States originally digitized for genealogical research. The goal is to shed light on the impact of economic opportunity and spatial mobility on social structure in Europe and North America.

This project will examine the economic and environmental consequences of commodity trading during the nineteenth century. The project team will be using information extraction techniques to study large corpora of digitized documents from the nineteenth century. This innovative digital resource will allow historians to discover novel patterns and to explore new hypotheses, both through structured query and through a variety of visualization tools.
