Model Training

With Transkribus you can train a Handwritten Text Recognition model to automatically process a collection of documents. The model needs to be trained to recognise a certain style of writing by being shown images of documents and their accurate transcriptions.
Training a model requires between 5,000 and 15,000 words (around 25-75 pages) of transcribed material. If you are working with printed rather than handwritten text, a smaller amount of training data is usually sufficient.
Using a base model can reduce the amount of training data required. As a base model, you can use either one of the publicly available models in Transkribus, if there is a suitable one for your documents, or one of your own previously trained models.
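The word-count guideline above works out to roughly 200 words per page (5,000 words over 25 pages). As a rough illustration, the following Python sketch checks a set of transcribed pages against that guideline before training. It is not part of Transkribus: the function names and the idea of holding each page's transcription as a plain string are assumptions made for this example, and words are simply counted as whitespace-separated tokens.

```python
# Guideline figures taken from the text above; all names here are hypothetical.
MIN_WORDS = 5_000
MAX_WORDS = 15_000


def count_words(transcriptions):
    """Count whitespace-separated words across a list of page transcriptions."""
    return sum(len(page.split()) for page in transcriptions)


def training_data_status(transcriptions):
    """Return the total word count and where it falls relative to the guideline."""
    total = count_words(transcriptions)
    if total < MIN_WORDS:
        return total, "below guideline"
    if total <= MAX_WORDS:
        return total, "within guideline"
    return total, "above guideline"


if __name__ == "__main__":
    # 25 dummy pages of 200 words each, standing in for real transcriptions.
    pages = ["word " * 200] * 25
    total, status = training_data_status(pages)
    print(f"{total} words: {status}")
```

A result of "below guideline" does not necessarily rule out training: as noted above, printed text or the use of a base model may require less material.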

Figure 1 Model Training

