Researchers develop automatic text recognition for ancient cuneiform tablets
A new artificial intelligence (AI) application developed by a team from Martin Luther University Halle-Wittenberg (MLU), Johannes Gutenberg University Mainz, and Mainz University of Applied Sciences is now able to decipher difficult-to-read texts on cuneiform tablets.
Instead of photos, the AI system uses 3D models of the tablets, delivering significantly more reliable results than previous methods. This makes it possible to search through the contents of multiple tablets to compare them with each other. It also paves the way for entirely new research questions. The findings were published by The Eurographics Association.
For their new approach, the researchers used 3D models of nearly 2,000 cuneiform tablets, including about 50 from a collection at MLU. According to estimates, about 1 million such tablets still exist worldwide. Many of them are more than 5,000 years old and are thus among mankind’s oldest surviving written records.
They cover an extremely wide range of topics. “Everything can be found on them: from shopping lists to court rulings. The tablets provide a glimpse into mankind’s past several millennia ago. However, they are heavily weathered and thus difficult to decipher even for trained eyes,” says Hubert Mara, an assistant professor at MLU.
This is because the cuneiform tablets are unfired chunks of clay into which writing has been pressed. To complicate matters, the writing system then was very complex and encompassed several languages. Therefore, not only are optimal lighting conditions needed to recognize the symbols correctly, but a lot of background knowledge is required as well. “Up until now it has been difficult to access the content of many cuneiform tablets at once—you sort of need to know exactly what you are looking for and where,” Mara adds.
His lab came up with the idea of developing an artificial intelligence system based on 3D models. The new system deciphers characters better than previous methods. In principle, the AI system works along the same lines as optical character recognition (OCR) software, which converts images of writing and text into machine-readable text.
This has many advantages. Once converted into computer text, the writing can be more easily read or searched through. “OCR usually works with photographs or scans. This is no problem for ink on paper or parchment. In the case of cuneiform tablets, however, things are more difficult because the light and the viewing angle greatly influence how well certain characters can be identified,” explains Ernst Stötzner from MLU. He developed the new AI system as part of his master’s thesis under Hubert Mara.
The team trained the new AI software using three-dimensional scans and additional data. Much of this data was provided by Mainz University of Applied Sciences, which is overseeing a large edition project for 3D models of clay tablets. The AI system subsequently succeeded in reliably recognizing the symbols on the tablets. “We were surprised to find that our system even works well with photographs, which are actually a poorer source material,” says Stötzner.
The work by the researchers from Halle and Mainz provides new access to what has hitherto been a relatively exclusive material and opens up many new lines of inquiry. So far, the system is only a prototype that can reliably discern symbols from two languages. However, a total of twelve cuneiform languages are known to exist. In the future, the software could also help to decipher weathered inscriptions, for example in cemeteries, which are three-dimensional like the cuneiform script.
Ernst Stötzner et al, R-CNN based PolygonalWedge Detection Learned from Annotated 3D Renderings and Mapped Photographs of Open Data Cuneiform Tablets, The Eurographics Association (2023). DOI: 10.2312/gch.20231157
Martin Luther University Halle-Wittenberg
Researchers develop automatic text recognition for ancient cuneiform tablets (2023, November 20)