Download & Installation
- Transkribus is contained in the main jar file Transkribus-.jar
- To run the program from command line type: java -jar Transkribus-.jar
- Note: Java 8 is required to run the program. Either install Java 8 system-wide or copy a JRE into the program directory.
- Java 9 and 10 are not supported! If one of them is already installed on your system, copy a Java 8 JRE (from https://java.com/de/download/manual.jsp#win) into the Transkribus package (the unzipped installation folder containing the application exe). The program then uses that Java without it having to be installed on the computer. Note: rename the copied folder to ‘jre’, otherwise it will not be found.
- Some problems (mainly Java heap space) occur because a 32-bit version of Java is installed on a 64-bit operating system. Please check!
- Note: To run the scripts on Mac (or Linux) with any version before 0.6.8, you may have to make them executable from the command line:
- Mac console basics
- change into the program folder using ‘cd’ commands
- chmod +x Transkribus.command (or chmod +x Transkribus.sh for Linux!)
- Furthermore you will find several files in the Transkribus package copied to your computer:
- config.properties can be modified to adjust simple appearance properties
- virtualKeyboards.xml can be used to specify a set of virtual keyboards
- logback.xml can be modified to adjust logging properties (for expert users only)
- The ‘libs’ subfolder contains the necessary libraries for all platforms. Currently supported are:
- Windows 32/64 bit
- Linux 32/64 bit
- OSX 64 bit
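The Java requirements above (Java 8 rather than 9 or 10, and a 64-bit Java on a 64-bit system) can be checked from a terminal. The following is a minimal sketch, not part of the Transkribus package: the helper names java_major_version and java_is_64bit are made up for illustration, and the 64-bit check assumes a HotSpot-style JVM that prints "64-Bit" in its version banner.

```shell
#!/bin/sh
# Extract the major Java version from the text printed by `java -version`.
# Java 8 and earlier report e.g. "1.8.0_202"; Java 9+ report e.g. "10.0.2".
java_major_version() {
    # $1: full text printed by `java -version`
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    case "$ver" in
        1.*) ver=${ver#1.} ;;   # legacy scheme: drop the leading "1."
    esac
    # Keep only the leading digits (the major version).
    printf '%s\n' "${ver%%[!0-9]*}"
}

# Succeed if the version banner mentions a 64-bit VM.
# Assumption: 64-bit HotSpot JVMs print "64-Bit"; 32-bit ones do not.
java_is_64bit() {
    case "$1" in
        *64-Bit*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: only runs if a `java` command is on the PATH.
if command -v java >/dev/null 2>&1; then
    out=$(java -version 2>&1)   # java prints its version info to stderr
    echo "Java major version: $(java_major_version "$out")"
    if java_is_64bit "$out"; then
        echo "64-bit JVM detected"
    else
        echo "No 64-Bit marker found (possibly a 32-bit JVM)"
    fi
fi
```

If the reported major version is 9 or 10, or no 64-Bit marker appears on a 64-bit system, copying a Java 8 JRE into the ‘jre’ folder as described above is the fix suggested by this page.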
If you encounter mysterious error messages like “already connected”, your Java may not be up to date. Please update Java (https://www.java.com/de/download/) and try again.
For Mac users: if the integrated Java version is outdated, download the latest version of Transkribus from our homepage and replace the complete installation on your computer.
[Pro tip: the Mac version of the expert client ships with its own Java inside the application. If this Java version is outdated, you can delete it or replace it with an updated version. To find the files in Finder, right-click (or Control-click) the Transkribus application in your Applications folder, choose “Show Package Contents” from the context menu, then go to the subfolder “Contents/MacOS”. There, the subfolder “jre” contains that Java version. If you delete this folder, the application starter will try to find Java on your system.]
- After opening the command file on the Mac, Transkribus says that there is a wrong Java version installed (184.108.40.206) instead of 1.7. However, there is the most current version of Java RE (220.127.116.11) installed.
- The problem is that Java 18.104.22.168 is the default Java on the command line which the Transkribus.command uses. You can check the default version by opening the terminal and typing ‘java -version’.
- To solve the problem, you can either download the latest JDK as a .tar.gz package and unpack it into the Transkribus folder (the Transkribus.command file automatically checks for Java installations in its subdirectories),
- or you can make your Java 8 installation the default one on the command line.
The package “libwebkitgtk” may not be installed. On Fedora you can install the package using dnf on the command line (use “yum” instead of “dnf” in older versions of Fedora):
sudo dnf install webkitgtk
If you are unsure whether you need a proxy server and you get the error message “Login failed: already connected” when trying to log in, a required proxy is the most likely cause.
- When the program has started, click on the home menu button on the top left and select “Proxy settings…”. In the following dialog you can set the proxy host, port, user name (optional) and password (optional). This is the recommended method for using a proxy server.
- Alternatively, you can edit the start script (e.g. Transkribus.bat on Windows, Transkribus.sh on Linux) to include the environment variables for the proxy server:
However, you will have to repeat this edit after every update of Transkribus.
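The exact lines to add to the start script are not reproduced on this page. As a rough sketch under stated assumptions, a Java program is commonly pointed at a proxy with the standard Java networking properties http.proxyHost, http.proxyPort, https.proxyHost and https.proxyPort; whether the Transkribus start script expects exactly these is an assumption. The helper name proxy_jvm_opts and the host and port values are hypothetical.

```shell
#!/bin/sh
# Build JVM proxy options from a host and a port.
# Uses the standard Java networking system properties; that the
# Transkribus start script relies on exactly these is an assumption.
proxy_jvm_opts() {
    # $1: proxy host, $2: proxy port
    printf '%s %s %s %s\n' \
        "-Dhttp.proxyHost=$1" "-Dhttp.proxyPort=$2" \
        "-Dhttps.proxyHost=$1" "-Dhttps.proxyPort=$2"
}

# Hypothetical values: replace them with your institution's proxy.
PROXY_HOST="proxy.example.org"
PROXY_PORT="8080"

# Print the resulting command instead of running it, so the sketch is
# safe to try. The jar name is left as it appears on this page.
echo "java $(proxy_jvm_opts "$PROXY_HOST" "$PROXY_PORT") -jar Transkribus-.jar"
```

The “Proxy settings…” dialog remains the recommended method, since a hand-edited script is overwritten by updates.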
Logging in to the server is not possible via Transkribus, but on the website it works.
- Error message “Already connected”: your Java installation may be outdated and unable to establish a secure connection to the server. You can check your installed version by opening a terminal/command line and entering “java -version”. If you encounter this problem, try updating Java on your machine. We recommend a current version of Java 8 (Oracle or OpenJDK). The Mac version of Transkribus includes a Java runtime. If you encounter this problem on a Mac, please download a new package from https://transkribus.eu and update your installation. If the error persists, please contact email@example.com, ideally including the log file of your installation (from the Transkribus directory: logs/TrpGui.log) and/or information on your Java version and operating system.
- Since February 2017, old Transkribus versions are blocked from logging in for compatibility reasons. If you use a version older than 1.0.0 an update of the application may be necessary to solve this issue.
- You may have to configure a proxy server via ‘main menu’ – ‘Proxy settings’.
Java Heap space / No more handles
- 32 bit Java on a 64 bit Windows OS -> install 64 bit Java from here: https://java.com/de/download/manual.jsp#win
- Too little RAM: Try to allocate more main memory by opening Transkribus.bat and setting the start command to e.g. java -Xmx2048m -jar Transkribus-1.5.4.jar
- Start Transkribus with this bat file
Logging in is prevented by the Firewall of your Internet Provider
- Some IT departments block the SSL port 443 and/or unknown applications via a firewall. Check with your IT department whether that might be the case.
Norton Antivirus detects a threat and is blocking the zip file from being unpacked.
- Solution: This is a false alarm which Norton gives when it encounters software it is not familiar with (WS.Reputation.1). You should be able to restore the file from quarantine by following the instructions from the following resource.
Versions 0.6.5 and older cannot update (very long error message):
- Please click on the “Home” button (upper left corner), then “Install a specific version”, select the newest version from “Releases” and tick the box beside “Download complete package”.
- Afterwards click on “Update” or “Replace”. This way, the complete package is downloaded and the update should work.
On newer MacBooks (2016 onwards), there seems to be a problem when starting Transkribus directly from the Downloads folder. Try copying the application to a folder where you have full system rights (e.g. the Applications folder) and start it from there.
If your starting problems persist, here you can find a workaround to start the app from your terminal.
Handwritten Text Recognition (HTR) Workflow
Our long-term goal is to train so many different writing styles that Transkribus will be able to deal with most handwritten documents without prior training. The more users work with Transkribus for their transcription, the faster we will reach this ambitious goal!
Yes, we now have a Text2Image matching tool that can match existing text with an image. If you have lots of existing transcripts and would like to use these to train an HTR model, please consult our How to Guide.
No! Documents uploaded to Transkribus are private by default. You can use the “Manage collections…” button in the “Server” tab of Transkribus to allow specific users to view and/or edit your collection if you wish.
In theory, yes! The software needs to be trained to understand each style of handwriting. Every piece of training data submitted to Transkribus is helping to strengthen the overall accuracy of the HTR.
Both technologies are very similar, but OCR is already in an advanced state, whereas HTR is still in an early phase. Unlike OCR, HTR does not focus on individual letters. Instead, it scans and processes the image of entire lines and tries to decode this data. The main difference from the user’s point of view is that the stage of Layout Analysis/Segmentation is integrated into the OCR engine, whereas it is a separate step in the workflow for HTR.
HTR is not perfectly accurate, but impressive Word and Character Error Rates are possible. The latest experiments have generated transcripts with a Character Error Rate of around 5%. This means that roughly 95% of characters in an automatically generated transcript would be correct. For some successful examples of HTR, have a look at our Example Documents or our Success Stories from the READ project blog. You can measure the accuracy of your HTR model in Transkribus using the ‘Compare’ function in the ‘Tools’ tab.
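To make the Character Error Rate figure concrete, here is a small worked example. It assumes the usual definition of CER, namely the number of character errors (substitutions, insertions and deletions) divided by the number of characters in the correct reference transcript. The helper name cer_percent is made up for illustration and is not how the ‘Compare’ function is necessarily implemented.

```shell
#!/bin/sh
# Character Error Rate in percent:
#   CER = 100 * (character errors) / (characters in the reference transcript)
cer_percent() {
    # $1: number of character errors, $2: number of reference characters
    awk -v e="$1" -v n="$2" 'BEGIN {
        if (n == 0) exit 1          # undefined for an empty reference
        printf "%.1f\n", 100 * e / n
    }'
}

# 50 wrong characters in a 1000-character reference transcript:
cer_percent 50 1000   # prints 5.0
```

A CER of 5.0 in this example corresponds to the "around 5%" quoted above: about 95 out of every 100 reference characters are recognised correctly.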
You can use your HTR model to automatically generate transcripts of your documents by clicking the “Run text recognition” button in the “Tools” tab in Transkribus. You can export your documents and search them in Transkribus by clicking the “Search” button in the Main menu. You can now also search your documents using our new Keyword Spotting tool.
You should then contact the Transkribus team by email at firstname.lastname@example.org. They can activate the training button in Transkribus for you. This way you can create an HTR model which is specific to the collection of documents that you have been working with in Transkribus. Find out more in our How to Guide.
Firstly, you need to upload your documents to the platform. Secondly, you need to segment the pages of your collection into text regions and baselines. Thirdly, you need to transcribe each page as accurately as possible. For more information on these stages, have a look at our How to Guides.
HTR engines cannot process text straight away – they need to be trained to recognise a certain style of handwriting. This can be achieved by creating at least 75 pages (15,000) of training data (images and transcripts) in Transkribus.