National Archives releases first version of a Dutch handwriting model

The digitisation team led by Liesbeth Keyser at the National Archives of the Netherlands is working hard on creating training data for its collections in order to prepare for HTR processing on a large scale. As a first result, a model trained on 475,769 words is now available to Transkribus users. The model shows a Character Error Rate of 7.48% on the training set and 6.15% on the validation set. It is based on the careful transcription of dozens of different hands and comprises scans from the Incoming Documents of the Dutch East India Company (Overgekomen Brieven en Papieren van de VOC), held by the National Archives of the Netherlands, and of 19th-century notarial deeds from the Noord-Hollands Archief. The model is named NAN/NHA_GT_M3+. Enjoy!
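For readers unfamiliar with the metric, here is a minimal sketch of how a Character Error Rate is commonly computed: the edit (Levenshtein) distance between the recognised text and the reference transcription, divided by the length of the reference. This is an illustration under that assumption, not necessarily the exact procedure Transkribus uses; the function names and the sample strings are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the processed prefix of
    # `reference` and the first j characters of `hypothesis`.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as edit distance divided by reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one wrong character in a short line.
    truth = "Overgekomen brieven"
    recognised = "Overgekomen brievcn"
    print(f"CER: {character_error_rate(truth, recognised):.2%}")
```

Under this definition, a CER of 6.15% corresponds to roughly six character edits per hundred characters of reference text.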

 

 
