National Archives releases first version of a Dutch handwriting model

The digitisation team led by Liesbeth Keyser at the National Archives of the Netherlands is working hard on creating training data for its collections in preparation for large-scale HTR processing. As a first result, a model trained on 475,769 words is now available to Transkribus users. The model shows a Character Error Rate of 7.48% on the training set and 6.15% on the validation set. It is based on the careful transcription of dozens of different hands and comprises scans of the Incoming Documents of the Dutch East India Company (Overgekomen Brieven en Papieren van de VOC) from the National Archives of the Netherlands and of 19th-century notarial deeds from the Noord-Hollands Archief. The model is named NAN/NHA_GT_M3+. Enjoy!
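For readers unfamiliar with the metric: Character Error Rate is commonly defined as the edit distance between the recognised text and the reference transcription, divided by the length of the reference. The short sketch below illustrates that general definition only. It is not the way Transkribus calculated the figures above, and the function names and sample strings are made up for the example.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcription."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one wrong character in a seven-character word.
    print(f"{character_error_rate('brieven', 'brieuen'):.2%}")  # 14.29%
```

Under this definition, a rate of around 6% corresponds to roughly six character edits for every hundred characters of reference text.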

 

 
