Printed Latin and Greek (also German, English, Italian) 15th-19th century | HTR+

The “NOSCEMUS General Model” reads printed Latin text, especially from the 15th, 16th, 17th and 18th centuries. The model was released by Stefan Zathammer and is based on training data from the Digital Sourcebook of the NOSCEMUS project.

For the 4th revised and updated version, a substantial number of new pages were added, including prints from the 15th and 19th centuries and, in particular, Greek texts.

Although the model is tailored towards transcribing (Neo-)Latin texts set in Antiqua-based typefaces, it is also, to a certain degree, able to handle Greek words and words set in (German) Fraktur.

The NOSCEMUS project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 741374).

There is also a PyLaia version of this model.

Model Overview

Name: Noscemus GM 4.0
Creator: Noscemus project (University of Innsbruck)
Model ID:
Centuries: 15th, 16th, 17th, 18th
Languages: English, German, Greek, Italian, Latin
Scripts: Latin alphabet, Gothic Script, Greek alphabet
CER on validation set: 0.79 %
Noscemus GM 4.0 is freely available to everyone.
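
For readers unfamiliar with the CER figure above: character error rate is commonly defined as the edit distance between the reference transcription and the model output, divided by the number of characters in the reference. The sketch below illustrates that common definition only. How the platform computes the reported value (for example, its handling of whitespace or normalisation) is not stated here, and the function names `levenshtein` and `cer` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to limit memory use.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate under the common definition (an assumption,
    not necessarily the platform's exact method)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only, not taken from the model's validation set.
    reference = "arma virumque cano"
    hypothesis = "arma uirumque cano"
    print(f"CER: {cer(reference, hypothesis):.2%}")
```

Under this definition, a CER of 0.79 % corresponds to roughly 8 character errors per 1,000 reference characters.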

