Stefan Karcher, a graduate student at Heidelberg University, has written a fascinating blog post explaining how he has been using Transkribus to process nineteenth-century German sermons.
Karcher took the opportunity to train his own Automated Text Recognition models. Using around 30,000 transcribed words as training data, he generated a model that produces transcripts of his documents with a Character Error Rate (CER) of 8-10%. His blog post notes that these transcripts are a useful and efficient basis for his research, and it describes how the automated transcripts can be analysed with Voyant Tools.
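For readers wondering what a Character Error Rate of 8-10% means in practice, here is a minimal sketch in Python. It assumes the common definition of CER: the number of single-character insertions, deletions and substitutions needed to turn the automated transcript into the correct one, divided by the length of the correct transcript. The function names and the sample strings are our own illustrations, not part of Transkribus, and the platform's own calculation may differ in details such as how whitespace or punctuation is handled.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Count the minimum single-character edits turning reference into hypothesis."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # character missing from the hypothesis
                curr[j - 1] + 1,     # extra character in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference (assumed definition)."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: two misread letters in an 18-character line.
correct = "Predigt am Sonntag"
automated = "Prebigt am Sonutag"
print(f"CER: {character_error_rate(correct, automated):.1%}")
```

By this measure, a CER of 8-10% means roughly one character in every ten to twelve needs correcting.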
Do you want to train your own Automated Text Recognition model?
- Find out how to get started in our How to Guide.