In this guide you will learn how to use the Handwritten Text Recognition (HTR) feature in Transkribus Lite. HTR uses artificial intelligence to automatically recognize the text in images and produce a transcript of it.
The Recognition tab can be found on the collection page in Transkribus Lite. When you click on a collection in the collection overview (more about managing collections), that collection opens. At the top right of the page you will see the Recognition tab. To start the recognition, you go through a three-step process that is explained in the following sections.
Step 1: Choosing the document
As the first step, select the document you want to run Handwritten Text Recognition on. To do so, click on the desired document in the list. If you have many documents in your collection, you can also use the filters to search for the document.
Step 2: Selecting the Model
After selecting the desired document, you have to select the HTR model that you want to use for the recognition. It can either be one of the models that you have trained yourself or one of the free and publicly available models.
Under the all models tab you will find all the models that you can use. The models that you have trained or have access to, as well as the public models, are listed here. For each model you will see the name, the language and the provider (i.e. the recognition engine that will be used with this model). Additionally, you can filter by the ID of the model, the username of the creator and the number of ground truth pages that were used to train the model.
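To make the filter criteria more concrete, here is a minimal sketch of the same idea in Python. It is not part of Transkribus: the `ModelEntry` record, the `filter_models` helper and the sample entries are hypothetical and only mirror the columns described above (ID, name, language, provider, creator, number of ground truth pages).

```python
from dataclasses import dataclass


@dataclass
class ModelEntry:
    """Hypothetical record mirroring the columns shown in the model list."""
    model_id: int
    name: str
    language: str
    provider: str  # the recognition engine used with this model
    creator: str   # username of the person who trained the model
    ground_truth_pages: int


def filter_models(models, language=None, creator=None, min_ground_truth_pages=0):
    """Return the models that match every filter that was given."""
    result = []
    for model in models:
        if language is not None and model.language.lower() != language.lower():
            continue
        if creator is not None and model.creator != creator:
            continue
        if model.ground_truth_pages < min_ground_truth_pages:
            continue
        result.append(model)
    return result


# Placeholder entries for illustration only, not real Transkribus models.
models = [
    ModelEntry(1, "Example model A", "German", "engine-x", "alice", 500),
    ModelEntry(2, "Example model B", "English", "engine-x", "bob", 120),
    ModelEntry(3, "Example model C", "German", "engine-y", "alice", 80),
]

german_models = filter_models(models, language="German", min_ground_truth_pages=100)
print([model.name for model in german_models])
```

Combining a language filter with a minimum number of ground truth pages is one plausible way to shortlist models before testing them on your own material.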
Note: Model training is not yet available in Transkribus Lite, therefore you will need to use Transkribus Expert Client if you want to train your own model. You can read more about model training here.
In the public models tab you can find all public models, which are available for everyone to use. Many of these models were created by our great community as part of some very exciting projects. By clicking on a model you will get more information about it, and for many models you can also see a preview snippet of the material used for training. This helps you find the model that best fits your needs. Sometimes it is also helpful to test several models and compare the results to find one that works for you. However, keep in mind that historical material, especially the countless different types of handwriting, can be very heterogeneous, and thus we are not able to provide a public model that works for everyone (yet 😉 ). You can also check out all public models here.
Once you have identified a model that you want to use, just click on the “select” button to select the model.
Step 3: Starting the Recognition
In the last step, you need to define the pages that you want to recognize with HTR. You can either enter a page string with the starting page number and the ending page number or select the pages manually with the “Select Pages” button. After you have selected the pages you can click on “Start”.
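As an illustration of what a page string expresses, the following sketch expands one into individual page numbers. The exact syntax that Transkribus Lite accepts is not specified in this guide, so the format here is an assumption: a range written as "start-end", a single page number, and several such parts separated by commas. The `parse_page_string` function is hypothetical and not part of Transkribus.

```python
def parse_page_string(page_string, total_pages):
    """Expand a page string such as "3-7" into a sorted list of page numbers.

    Assumed syntax (not confirmed by Transkribus): "start-end" ranges and
    single page numbers, optionally combined with commas, e.g. "1-2, 5".
    """
    pages = set()
    for part in page_string.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        # Reject ranges that run backwards or fall outside the document.
        if start < 1 or end > total_pages or start > end:
            raise ValueError(f"Invalid page range: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


# Pages 3 to 7 of a ten-page document.
print(parse_page_string("3-7", total_pages=10))
```

Validating the range against the document length up front mirrors what you would check by eye before clicking “Start”: that the first and last page actually exist in the document.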
Before the recognition job starts, you will see a pop-up that shows how many credits the job will consume. Additionally, you will see your current credit balance and the credit balance after the recognition. Click “Start” again to start the recognition job.
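The figures in the pop-up are related by simple arithmetic, sketched below. The `credit_summary` function and the `credits_per_page` rate are hypothetical and used purely for illustration; the actual cost of a job is determined by Transkribus and is shown in the pop-up.

```python
def credit_summary(current_balance, pages, credits_per_page):
    """Estimate the cost of a job and the balance left afterwards.

    credits_per_page is an assumed, illustrative rate, not an official price.
    """
    cost = pages * credits_per_page
    balance_after = current_balance - cost
    return {
        "cost": cost,
        "balance_before": current_balance,
        "balance_after": balance_after,
        "sufficient": balance_after >= 0,
    }


# Example: 20 pages at an assumed rate of 1 credit per page.
summary = credit_summary(current_balance=100, pages=20, credits_per_page=1)
print(summary["cost"], summary["balance_after"], summary["sufficient"])
```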
As soon as you have started the recognition process, you will be redirected to the jobs page. Here you will see your newly created recognition job. In the status column you can check the current status of your recognition job, alongside other information about the job. Clicking on the job in the list takes you to the document related to the job. After the recognition is finished, you can view the results by opening the pages in the document.