Mellon Grant Funds Continuation of Persian and Arabic Digitization Project

The $100k grant supports the development of user-friendly, open-source software to produce high-quality digital transcriptions of printed texts.

The Andrew W. Mellon Foundation has awarded a $100,000 grant to support the continued development of user-friendly, open-source software capable of creating digital texts from Persian and Arabic books. 

Matthew Thomas Miller, assistant professor in the Roshan Institute for Persian Studies in the School of Languages, Literatures, and Cultures, leads an interdisciplinary team of researchers from Northeastern University, Aga Khan University (AKU) in London and the Maryland Institute for Technology in the Humanities at the University of Maryland. The Mellon Foundation has been funding the team’s work since 2019.

“We are honored that The Andrew W. Mellon Foundation has again supported our efforts,” Miller said. “They have been global leaders in building open-source tools and open-access collections for the expansion in access to and digital preservation of cultural traditions across the world, and we are delighted to be a part of these efforts.”

The project, known as “OpenITI AOCP,” aims to enable the digitization of texts from the premodern Islamicate world—an enormous tradition stretching over 1,000 years. The tools being created by the project team will be free and open to use and will allow academics and the public to produce high-quality digital transcriptions of Persian and Arabic printed texts, from poetry to the Quran. 

“Premodern Islamicate textual production is a massive and understudied archive that remains particularly underrepresented in the field of digital humanities,” Miller said. “This democratization of access to digital text production will change the landscape of Islamicate studies.”

Thus far, the project team, made up of computer science and humanities experts, has improved the accuracy of Persian and Arabic optical character recognition (OCR) tools, which convert printed text into machine-encoded text, and has begun experimenting with Ottoman Turkish and Urdu. The team is integrating those tools into a platform called eScriptorium. Team members also held a training session at the University of Maryland in 2020 for OCR experts from all over the world, and they taught a Spring 2021 Global Classrooms course, “The Islamicate World 2.0: Studying Islamic Cultures through Computational Textual Analysis,” on the basics of computational textual analysis as it relates to textual data about the Islamicate world.

Next steps include finalizing the open-source software for widespread use, as well as holding additional workshops and community-building activities around the new tools. This latest Mellon grant will last one year.

Earlier this year, Miller was awarded $282,905 by the National Endowment for the Humanities to support the project.

Image description: The introduction to George B. Whiting's Kitab fi al-Imtina‘ ‘an Shurb al-Muskirat, published in Beirut by American Mission Press in 1838 and housed at Harvard's Houghton Library (*98Miss168). Licensed for non-commercial use.

Original news story written by Jessica Weiss 

September 3, 2021

Division of Research
University of Maryland
College Park, MD 20742-1541
© Copyright 2021 University of Maryland
