An extended Transkribus print model that, in addition to the common Antiqua and Fraktur typefaces, can also decipher typewritten text, modern computer printouts, and even various unusual ‘decorative fonts’ from the 16th to the 21st century in several languages. It should be able to read historical Dutch, German, English, Finnish, French, and Swedish with good accuracy.
We compared it with the similar HTR+ model: the results of the PyLaia model seem to match, and in some cases surpass, those of the HTR+ model. On one of our test sets, for example, the new PyLaia model was 30% faster and reached a character error rate (CER) of 1.28%, compared to 1.64% for the HTR+ model. We have observed before that PyLaia seems to do very well on large and diverse training sets such as this one.
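For readers unfamiliar with the metric, CER is commonly computed as the character-level edit distance between the recognised text and the ground truth, divided by the length of the ground truth. The following is a minimal sketch under that assumption; the function names are illustrative and not part of Transkribus or PyLaia, and the figures reported here may have been computed with different text normalisation.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` lowers the error compared with `baseline`."""
    return (baseline - new) / baseline


if __name__ == "__main__":
    # One wrong character in a 14-character line.
    print(f"{cer('Fraktur print.', 'Fraktur prmt.'):.4f}")
    # Relative error reduction implied by the two CER values quoted above.
    print(f"{relative_reduction(1.64, 1.28):.1%}")
```

Read this way, the difference between 1.64% and 1.28% on that test set corresponds to a relative error reduction of roughly 22%.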
…the Swiss Army knife for printed documents. Words trained: 4,415,716. CER on validation set: 1.60%.