Machine Translation and Automated Analysis of Cuneiform Languages (MTAAC)

Ancient Mesopotamia, birthplace of writing, has produced vast numbers of cuneiform tablets that only a handful of highly specialized scholars are able to read. Studying them is so labor-intensive that the vast majority have not yet been translated, and their contents therefore remain inaccessible both to historians in other fields and to the wider public. This project will develop and apply new computerised methods to translate and analyse the contents of some 67,000 highly standardised administrative documents from southern Mesopotamia dating to the 21st century BC. By automating these basic but labor-intensive processes, we will free up scholars’ time. The tools that we will develop, which combine machine learning with statistical and neural machine translation technologies, may then be applied to other ancient languages. The translations themselves, and the historical, social and economic data extracted from them, will be made publicly available on the web.

Related paper: Émilie Pagé-Perron, Maria Sukhareva, Ilya Khait, Christian Chiarcos. Machine Translation and Automated Analysis of the Sumerian Language. In Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 10–16, Vancouver, BC, August 4, 2017.