Proudly presenting the Dutch giant

This is what happens when two Transkribus power users, the Amsterdam City Archives and the National Archives of the Netherlands, work together: a model trained on 1,384,893 words, in this case for reading 18th-century Dutch. The model is now available to all Transkribus users under the name “Dutch Mountains (18th Century)”. It combines the 18th-century models of the two archives (Amsterdam City Archives: 3500+ scans covering 15 notarial hands; National Archives of the Netherlands: 3500+ scans of VOC handwriting). The Character Error Rate comes down to 5.67%.
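
For readers new to the metric: Character Error Rate is commonly defined as the number of character edits (insertions, deletions, substitutions) needed to turn the recognised text into the correct text, divided by the length of the correct text. The Python sketch below illustrates that common definition only. How Transkribus computes its figure internally is not described here, and the function names and sample strings are made up for the example.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character edits turning hypothesis into reference."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference (ground truth) text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical sample: one wrong character in a seven-character word.
cer = character_error_rate("notaris", "notarjs")
print(f"{cer:.2%}")  # prints 14.29%
```

Read this way, a rate of 5.67% means roughly 5 to 6 characters out of every 100 would need correcting.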

When you use a model as large as this one on your documents, it makes sense to add the corresponding language model, or a base model if you have already trained a model yourself. Here is where to find the language model setting in Transkribus: “Tools”-tab -> Click “Run” in the “Text Recognition”-section -> “Select HTR-model” -> “Dictionary” (top left) -> “Language model from training data”

Have fun trying out this or one of our other public models! An overview of all available models can be found here: https://transkribus.eu/wiki/images/d/d6/Public_Models_in_Transkribus.pdf
