Proudly presenting the Dutch giant

This is what comes out when two Transkribus power-user archives, the Amsterdam City Archives and the National Archives of the Netherlands, work together: a model trained on 1,384,893 words, in this case for reading 18th-century Dutch. The model is now available to all Transkribus users under the name “Dutch Mountains (18th Century)”. It combines the 18th-century models of the two archives (Amsterdam City Archives: 3500+ scans of 15 notarial handwritings; National Archives of the Netherlands: 3500+ scans of VOC handwritings). The Character Error Rate goes down to 5.67%.
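For readers who have not met the term: Character Error Rate is commonly defined as the character-level edit distance between the model output and a reference transcription, divided by the length of the reference. The sketch below shows that common definition in Python. It is an illustration only, not Transkribus code; the function names and the sample strings are made up for this example, and Transkribus may normalise text differently before comparing.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length (hypothetical helper)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Made-up sample: one wrong character in a ten-character reference.
reference = "de notaris"
hypothesis = "de notarjs"
print(f"{character_error_rate(reference, hypothesis):.2%}")  # prints 10.00%
```

Read this way, a rate of 5.67% corresponds to roughly five or six character errors per hundred characters of reference text.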

When you use models as big as this one on your documents, it makes sense to add the corresponding language model, or a base model if you have already trained a model yourself. Here is where to find the language model setting in Transkribus: “Tools” tab -> click “Run” in the “Text Recognition” section -> “Select HTR-model” -> “Dictionary” (top left) -> “Language model from training data”

Have fun trying out this or one of our other public models! You can find an overview of all available models here: https://transkribus.eu/wiki/images/d/d6/Public_Models_in_Transkribus.pdf
