
Trolls and water spirits – transcribing Swedish folklore records with Handwritten Text Recognition

It’s time to hear about some remarkable new results with Handwritten Text Recognition (HTR) technology – this time from the Institute for Language and Folklore in Sweden.

The Institute holds a collection of more than 30,000 pages of folklore records written by the Swedish folklorist Carl-Martin Bergstrand between the 1920s and the 1960s. Dr Fredrik Skott, an associate professor and research archivist at the Institute, has helped to train an HTR model to automatically transcribe these documents.

Dr Skott used our Transkribus platform to transcribe around 20,000 words from pages that Bergstrand wrote in the early 1930s. A couple of example pages can be seen below; they contain Bergstrand’s records of an interview with August Svensson (b. 1842), in which Svensson talked about water spirits and trolls.

The transcripts and images of these documents were then used to train a model with CITlab HTR, a form of HTR technology which uses neural networks to recognise handwriting. The resulting HTR model can automatically produce transcripts of pages written by Bergstrand with an average Character Error Rate (CER) of 7.0%. When a dictionary is integrated into the recognition process, the CER can be as low as 5.5%.
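For readers curious about what a CER figure means in practice: it is commonly calculated as the number of single-character edits (insertions, deletions and substitutions) needed to turn the automatic transcript into the correct one, divided by the length of the correct transcript. The short Python sketch below illustrates that common definition. It is not the code used by Transkribus or CITlab HTR, and the function names and the sample words are our own illustrative assumptions.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # character missing from hypothesis
                curr[j - 1] + 1,                  # extra character in hypothesis
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcript."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: the model drops one letter of a five-letter word.
    print(character_error_rate("troll", "trol"))
```

On this definition, a CER of 7.0% corresponds to roughly seven character-level corrections for every 100 characters of text.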

Dr Skott is excited about the possibilities: ‘Previously, I always thought that future generations would have difficulty reading the folklore collections. Now I know that they will find it easier to read the text than the present generation does. In short, the results of our tests with Transkribus are amazing. After manually transcribing just 150 pages, our HTR model now reads the folklore records better than many of our visitors do’.

The Institute for Language and Folklore is now working with these transcriptions to produce a digital map of myths and legends, which it plans to launch in autumn 2017.
