This is what happens when two Transkribus power users, the Amsterdam City Archives and the National Archives of the Netherlands, work together: a model trained on 1,384,893 words, in this case for reading 18th-century Dutch. The model is now available to all Transkribus users under the name “Dutch Mountains (18th Century)”. It combines the 18th-century models of the two archives (Amsterdam City Archives: 3500+ scans of 15 notarial hands; National Archives of the Netherlands: 3500+ scans of VOC handwriting). The Character Error Rate goes down to 5.67%.
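For readers unfamiliar with the metric: the Character Error Rate is commonly defined as the number of character insertions, deletions and substitutions needed to turn the recognised text into the correct transcription, divided by the length of the correct transcription. The Python sketch below illustrates that common definition only. The function names and the sample strings are made up for illustration, and it is an assumption, not something stated here, that Transkribus computes its figure in exactly this way.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions between two strings."""
    # previous[j] holds the distance between the reference prefix
    # processed so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the hypothesis
                current[j - 1] + 1,      # extra character in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one extra "e" in a 10-character reference.
cer = character_error_rate("de schepen", "de scheepen")
print(f"CER: {cer:.2%}")
```

Read this way, a Character Error Rate of 5.67% corresponds to roughly 5 to 6 character edits per 100 characters of correct transcription.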
When you use a model as big as this one for your documents, it makes sense to add the corresponding language model, or a base model if you have already trained a model yourself. This is where you can find the language model setting in Transkribus: “Tools” tab -> click “Run” in the “Text Recognition” section -> “Select HTR-model” -> “Dictionary” (top left) -> “Language model from training data”.
Have fun trying out this or one of our other public models! You can find an overview of all available models here: https://transkribus.eu/wiki/images/d/d6/Public_Models_in_Transkribus.pdf