Proudly presenting the Dutch giant

This is what happens when two Transkribus power-user archives, the Amsterdam City Archives and the National Archives of the Netherlands, work together: a model trained on 1,384,893 words, in this case for reading 18th-century Dutch. The model is now available to all Transkribus users under the name “Dutch Mountains (18th Century)”. It combines the 18th-century models of the two archives (Amsterdam City Archives: 3500+ scans of 15 notarial hands; National Archives of the Netherlands: 3500+ scans of VOC handwriting). The Character Error Rate goes down to 5.67%.
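For readers unfamiliar with the metric, here is a minimal sketch of how a Character Error Rate is commonly computed: the character-level edit distance between the correct transcription and the model output, divided by the length of the correct transcription. This post does not say how Transkribus calculates its figure, so the function names and the sample strings below are illustrative assumptions, not the platform's implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference text.
    This is the usual definition, assumed here for illustration."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: one wrong character in a seven-character word.
cer = character_error_rate("notaris", "notarls")
print(f"{cer:.2%}")
```

Under this definition, a CER of 5.67% corresponds to roughly one character edit for every 18 characters of reference text.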

When you use a model as large as this one on your documents, it makes sense to add the corresponding language model, or a base model if you have already trained a model yourself. You can find the language model setting in Transkribus here: “Tools”-tab -> Click “Run” in the “Text Recognition”-section -> “Select HTR-model” -> “Dictionary” (top left) -> “Language model from training data”

Have fun trying out this or one of our other public models! This link gives an overview of all available models: https://transkribus.eu/wiki/images/d/d6/Public_Models_in_Transkribus.pdf

