Proudly presenting the Dutch giant

This is what happens when two Transkribus power-user archives, the Amsterdam City Archives and the National Archives of the Netherlands, work together: a model trained on 1,384,893 words of 18th-century Dutch. The model is now available to all Transkribus users under the name “Dutch Mountains (18th Century)”. It combines the 18th-century models of the two archives (Amsterdam City Archives: 3500+ scans of 15 different notarial hands; National Archives of the Netherlands: 3500+ scans of VOC handwriting). The Character Error Rate comes down to 5.67%.
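For readers new to the metric, here is a minimal sketch of one common way to compute a Character Error Rate: the edit distance between the recognised text and the correct transcription, divided by the length of the correct transcription. It is an illustration only. The function names are hypothetical, and the exact calculation Transkribus uses is not described in this post.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference.

    Hypothetical helper: 'reference' is the correct transcription,
    'hypothesis' is the text the model produced."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, a CER of 5.67% corresponds to roughly 5 to 6 character edits per 100 characters of correct transcription.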

When you use a model as big as this one on your documents, it makes sense to add the corresponding language model, or a base model if you have already trained a model yourself. Here is where to find the language model setting in Transkribus: “Tools” tab -> click “Run” in the “Text Recognition” section -> “Select HTR-model” -> “Dictionary” (top left) -> “Language model from training data”

Have fun trying out this or one of our other public models! This link gives an overview of all the available models: https://transkribus.eu/wiki/images/d/d6/Public_Models_in_Transkribus.pdf
