Trees and Tweets: Mining Billions to Understand Human Migration and Regional Linguistic Variation


The proposed research will analyze contemporary Twitter data from the UK and the USA for regional variation in linguistic forms and link the patterns of variation to migration in both countries. Our goal is to understand how linguistic variation is shaped by migration, both past and present. Two sorts of “big data” will be collected, cleaned, and analyzed for spatial patterns: tweets will be used to document regional linguistic variation, and family trees to describe the large-scale migration patterns that might explain this variation. By analyzing successive tweets by the same individuals, we will also obtain a record of their mobility, which we will relate to the linguistic variation in their tweets.

Principal Investigators

Diansheng Guo, University of South Carolina, US, IMLS
Jack Grieve, Aston University, UK, AHRC/ESRC