+ Transkribus recognises early modern German correspondence

The Gender History research group at the University of Jena (Thuringia, Germany) have been experimenting with Transkribus as part of a digital edition project on the correspondence of the eighteenth-century regent, Erdmuthe Benigna von Reuß-Ebersdorf (1670-1732).

Early modern scripts are very challenging for Automated Text Recognition technology because letters tend to be closely intertwined, abbreviations occur quite often and the spelling of words is not standardised. As the example below suggests, Erdmuthe’s writing is not easy to follow! She had a unique writing style and often broke words into separate parts.

Sample page of a letter (Source: Landesarchiv Thüringen – Staatsarchiv Greiz, Paragiatsherrschaft Köstritz, From IV 15, fol. 56r. All rights reserved)

To train a model to recognise Erdmuthe’s writing, the Gender History research team used about 250 pages of existing transcripts that had been produced in the course of their work on the digital edition. They also used these same transcripts to create a dictionary of Erdmuthe’s vocabulary that can be integrated into the recognition process.
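For readers curious about what deriving a vocabulary from existing transcripts might involve, here is a minimal sketch in Python. It is an illustration only: the function name `build_vocabulary`, the sample strings and the word-count output are assumptions made for this example, and they do not describe the format Transkribus expects or the method the Jena team actually used.

```python
import re
from collections import Counter


def build_vocabulary(transcripts):
    """Count how often each word form occurs across a set of transcripts.

    Hypothetical helper for illustration. Case is preserved because
    capitalisation carries meaning in German, and no spelling
    normalisation is applied, so variant spellings stay separate entries.
    """
    counts = Counter()
    for text in transcripts:
        # [^\W\d_]+ matches runs of letters, including umlauts and ß
        counts.update(re.findall(r"[^\W\d_]+", text))
    return counts


# Invented sample lines, not taken from Erdmuthe's letters
sample = ["Gott sey Danck", "Gott mit uns"]
vocabulary = build_vocabulary(sample)
for word, count in vocabulary.most_common():
    print(word, count)
```

Keeping variant spellings as separate entries seems sensible for a writer whose orthography is not standardised, since the recognition step needs to see the forms she actually used.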

The resulting model is capable of producing automated transcripts of Erdmuthe’s writing with a Character Error Rate (CER) below 9%. When a dictionary is included in the recognition process, the errors are reduced still further.
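The Character Error Rate quoted above is conventionally defined as the number of character insertions, deletions and substitutions needed to turn the automated transcript into the correct one, divided by the length of the correct transcript. The sketch below shows that common definition in Python. The function names and the sample word are assumptions for illustration, and Transkribus may compute its figure with different normalisation details.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character edits between two strings."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcript."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Invented example: the model reads "u" as "n" once in an 11-letter word
cer = character_error_rate("Durchlaucht", "Durchlancht")
print(f"CER: {cer:.1%}")
```

In this example one wrong character in an eleven-letter word gives a CER of 1/11, or roughly 9%. A CER below 9% therefore means, on average, fewer than one character in eleven needs correcting.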

Martin Prell from the project team has elaborated on this experiment in a report (in German).  He covers the experience of preparing training data for text recognition and working directly with Transkribus.  If you are thinking about using Transkribus for your own project, this very instructive paper could help!

