Tag Training

Tag training is an option within normal text training that includes annotated tags and their properties in the training. The trained model will then output the trained tags during recognition. Please note that such a model cannot perform text recognition without tag recognition, nor can the tags be recognized separately afterwards, so you would need to train a separate model for pure text recognition.

This is due to the way the tags are trained: tags and attributes are encoded in the text, so the model learns some metadata in addition to the characters in the image. For that reason this method works best if the number of properties used is limited to one or two, but there is room for your own experiments.

One of the main use cases is the training of abbreviations together with their expansions. Text styles can also be learned quite well. For other tags such as persons, places, and so on, the benefit is limited in our opinion. The model can learn the annotated persons but will not know or recognize other persons or places that were not included in the training set. For that purpose, ‘named entity recognition’ (NER) will be much more useful. The good news is that NER will be available in the future as a separate tool. The benefit of tag training is best tried out for yourself.
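
To make the idea that tags and attributes are encoded in the text more concrete, here is a minimal Python sketch. It is only an illustration: the marker syntax, the `Tag` class, and the `encode_tags` function are hypothetical and are not the encoding Transkribus actually uses internally. The sketch assumes that tags do not overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Hypothetical annotation: a named tag covering text[offset:offset + length]."""
    name: str
    offset: int
    length: int
    properties: dict = field(default_factory=dict)


def encode_tags(text, tags, include_properties=True):
    """Return the text with illustrative inline markers around each tagged span.

    Assumes non-overlapping tags. The marker syntax is invented for this example.
    """
    out = []
    pos = 0
    for tag in sorted(tags, key=lambda t: t.offset):
        out.append(text[pos:tag.offset])  # untagged text before the span
        span = text[tag.offset:tag.offset + tag.length]
        marker = tag.name
        if include_properties and tag.properties:
            props = ";".join(f"{k}={v}" for k, v in sorted(tag.properties.items()))
            marker += "{" + props + "}"
        out.append(f"<{marker}>{span}</{tag.name}>")
        pos = tag.offset + tag.length
    out.append(text[pos:])  # remaining untagged text
    return "".join(out)


# An abbreviation with its expansion stored as a single property.
abbrev = Tag("abbrev", 0, 3, {"expansion": "Doctor"})
print(encode_tags("Dr. Smith", [abbrev]))
# <abbrev{expansion=Doctor}>Dr.</abbrev> Smith
print(encode_tags("Dr. Smith", [abbrev], include_properties=False))
# <abbrev>Dr.</abbrev> Smith
```

In a scheme like this, every additional property lengthens the string the model has to reproduce, which gives some intuition for why training tends to work best with only one or two properties.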

  • This is how it works:
    • Select ‘Train tags’ to include tags in the training. Use the plus button on the right to choose the tags to be included.
    • Select ‘Include Properties’ to include the properties in the training. If, for example, a person tag has a first name and a last name set, the model learns these as well. If the option is not selected, the recognition will only indicate that it is a person but will not know the name. As mentioned above, if many properties have been assigned, the recognition will probably not be very good.
    • Please describe the trained tags in the model description so that they are visible when the model is selected. This information cannot currently be read and added automatically.

Please try the function and share your experience with us.
