+ Transkribus recognises early modern German correspondence

The Gender History research group at the University of Jena (Thuringia, Germany) have been experimenting with Transkribus as part of a digital edition project on the correspondence of the eighteenth-century regent, Erdmuthe Benigna von Reuß-Ebersdorf (1670-1732).

Early modern scripts are very challenging for Automated Text Recognition technology because letters tend to be closely intertwined, abbreviations occur often and spelling is not standardised. As the example below suggests, Erdmuthe’s writing is not easy to follow! She had a unique writing style and often broke words into separate parts.

Sample page of a letter (Source: Landesarchiv Thüringen – Staatsarchiv Greiz, Paragiatsherrschaft Köstritz, From IV 15, fol. 56r. All rights reserved)

In order to train a model to recognise Erdmuthe’s writing, the Gender History research team used about 250 pages of existing transcripts that had been produced in the course of their work on the digital edition.  They also used these same transcripts to create a dictionary of Erdmuthe’s vocabulary that can be integrated into the recognition process.
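The article does not describe how the team built the dictionary, or the format in which Transkribus expects it. As a minimal sketch only, assuming the transcripts are available as plain-text strings, a word-frequency list could be derived as follows. The function name `build_vocabulary` and the sample phrases are hypothetical, not taken from the project.

```python
import re
from collections import Counter


def build_vocabulary(transcripts):
    """Count how often each word form occurs across a set of transcripts.

    Hypothetical helper for illustration; not the project's actual method.
    Case and spelling variants are kept distinct on purpose, because
    early modern spelling is not standardised.
    """
    counts = Counter()
    for text in transcripts:
        # [^\W\d_]+ matches runs of Unicode letters (including ä, ö, ü, ß)
        counts.update(re.findall(r"[^\W\d_]+", text))
    return counts


# Invented sample phrases standing in for transcribed pages
sample_transcripts = ["Gott sey danck", "Gott sey mit uns"]
vocabulary = build_vocabulary(sample_transcripts)
for word, count in vocabulary.most_common():
    print(word, count)
```

Keeping variant spellings as separate entries is a deliberate choice here: normalising them would discard exactly the forms a writer-specific dictionary is meant to capture.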

The resulting model is capable of producing automated transcripts of Erdmuthe’s writing with a Character Error Rate (CER) below 9%, that is, fewer than 9 character errors per 100 characters of reference text. When a dictionary is included in the recognition process, the errors are reduced still further.
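To make the CER figure concrete, here is a small sketch of the conventional definition: the edit distance (character insertions, deletions and substitutions) between the automated transcript and the reference transcript, divided by the length of the reference. How Transkribus computes the figure internally is not stated in this article, so treat this as an illustration of the general metric. The function names and sample strings are hypothetical.

```python
def levenshtein(reference, hypothesis):
    """Minimum number of single-character edits turning reference into hypothesis."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + substitution_cost,   # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Invented example: two wrong characters in a 12-character reference
reference = "gnädige Frau"
hypothesis = "gnadige Fran"
print(f"CER: {character_error_rate(reference, hypothesis):.1%}")
```

Under this definition, a CER below 9% on a page of 1,000 characters would correspond to fewer than 90 character-level corrections.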

Martin Prell from the project team has elaborated on this experiment in a report (in German).  He covers the experience of preparing training data for text recognition and working directly with Transkribus.  If you are thinking about using Transkribus for your own project, this very instructive paper could help!

