01 - Getting help

First, download the Transkribus tool from our website. Next, take a look at our How To Guides to get an overview of how to work with documents in Transkribus.

You can contact us with any questions or comments at info@readcoop.eu

02 - Registration

You can sign up for a Transkribus account at the Transkribus website.

To reset a forgotten password, visit the Transkribus website, click ‘Login’ and then ‘Forgot password?’.

03 - Download and Installation

Download Transkribus
  • Once you have registered for an account, you can download Transkribus for free from our website.
  • If the download fails, please download the latest release from the alternative download location.
Supported Operating Systems
  • Transkribus is platform independent and will run on Windows, Mac and Linux.
  • Transkribus is written in Java. You need to have the newest Java 8 installed on your computer for Transkribus to work. This should be the case for most computers.
  • You can get Java 8 from here: https://java.com/de/download/manual.jsp#win
  • If you need to check your Java version: https://java.com/en/download/help/version_manual.xml
  • Please note that Transkribus is developed and tested using the Java Virtual Machine (JVM) by Oracle in version 1.8. It is also known to run on the JVM that comes with OpenJDK, but other implementations are neither tested nor officially supported. The same applies to Java 9 and later versions.
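The Java version requirement above can be checked from a terminal. The following is a minimal sketch and not part of Transkribus: the function names extract_java_version and java_major are made up for illustration, and the sketch assumes that `java -version` prints a line containing `version "<number>"`, as Oracle and OpenJDK builds do.

```shell
# Hypothetical helpers (not shipped with Transkribus).

# Read `java -version` output on stdin and print the quoted version string.
extract_java_version() {
    sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Print the major version: "1.8.0_66" -> 8, "11.0.2" -> 11.
java_major() {
    case "$1" in
        1.*) rest=${1#1.}; echo "${rest%%.*}" ;;  # legacy "1.x" numbering
        *)   echo "${1%%[-.+]*}" ;;               # numbering used since Java 9
    esac
}
```

To apply it to the installed Java, pipe the real output through the helpers: java_major "$(java -version 2>&1 | extract_java_version)". Java writes its version information to the error stream, hence the 2>&1. A result of 8 matches the version Transkribus is tested with.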
Unzip ZIP File
  • After download you will see a ZIP File in the download directory of your computer.
  • Unzip the file before you try to start an executable (.exe) file.
Run Transkribus via an executable file: .exe, .command, .sh
  • Open the Transkribus directory. There you will find the executable files for your operating system.
  • Start Transkribus from your user interface with a double-click:
    • Windows: Transkribus.exe or Transkribus.bat
    • Mac OS – Apple: Transkribus.command
    • Linux: Transkribus.sh
Notes for first launch on Linux
  • If the OS is (or is based on) Ubuntu 17.04, installing libwebkit is necessary:

sudo apt install libwebkitgtk-1.0-0

Notes for first launch on Windows
  • If you do not have “Administrator” rights, Windows will produce a warning message, such as “Your Computer is Protected by Windows”.
  • Do not confirm, but go to “More Information”. There you can agree that this is not malware and that you want to run Transkribus on your computer.
Notes for first launch on Mac
  • When you run the program for the first time, it may not start because it is a non-signed application (“… can’t be opened because it is from an unidentified developer” message).
  • In this case, right-click (or control-click) the application and choose “Open”. In the dialog box that appears, click “Open” again.
  • Or right-click on the trackpad to open the context menu and add a security exception for Transkribus.
  • Transkribus is contained in the main jar file Transkribus-<version>.jar
  • To run the program from command line type: java -jar Transkribus-<version>.jar
  • Note: Java 8 is needed to run the program. Make sure Java 8 is either installed system wide or copy a JRE into the program directory!
    • Java 9 and 10 are not supported! If one of them is already installed on your system, please copy a Java 8 JRE (from https://java.com/de/download/manual.jsp#win) into the Transkribus package (the unzipped installation folder that contains the application .exe). The program then uses this Java without it having to be installed on the computer. Note: Please rename the copied folder to ‘jre’, otherwise it will not be found.
    • Some problems (mainly Java heap space) occur because a 32 bit Java version is installed on a 64 bit operating system. Please check!
  • Note: In versions before 0.6.8, you may have to make the scripts executable from the command line before you can run them on Mac (or Linux):
    • Mac console basics
    • change into the program folder using ‘cd’ commands
    • chmod +x Transkribus.command (or chmod +x Transkribus.sh for Linux!)
  • Furthermore you will find several files in the Transkribus package copied to your computer:
    • config.properties can be modified to adjust simple appearance properties
    • virtualKeyboards.xml can be used to specify a set of virtual keyboards
    • logback.xml can be modified to adjust logging properties (for expert users only)
  • The ‘libs’ subfolder contains the necessary libraries for all platforms. Currently supported are:
    • Windows 32/64 bit
    • Linux 32/64 bit
    • OSX 64 bit
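The note above about copying a Java 8 JRE into a folder named ‘jre’ can be pictured with the following sketch. It is an illustration only and not the logic of the shipped start scripts: the function name pick_java is hypothetical, and it assumes the copied runtime has the usual bin/java layout found on Mac and Linux.

```shell
# Hypothetical sketch: choose which Java binary to launch Transkribus with.
# Prefers a runtime copied into <install dir>/jre, otherwise falls back to
# whatever `java` is found on the PATH.
pick_java() {
    install_dir=$1
    if [ -x "$install_dir/jre/bin/java" ]; then
        echo "$install_dir/jre/bin/java"
    else
        echo "java"
    fi
}

# Example (left commented out so the sketch has no side effects):
# "$(pick_java .)" -jar Transkribus-<version>.jar
```

Run from inside the Transkribus directory, pick_java . prints the path to the bundled runtime when one is present, and plain java otherwise.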

If you are unsure whether you need a proxy server, the error message “Login failed: already connected” when you try to log in most likely indicates that you do.

  • When the program has started, click on the home menu button on the top left and select “Proxy settings…”. In the following dialog you can set the proxy host, port, user name (optional) and password (optional). This is the recommended method for using a proxy server.
  • Alternatively, you can edit the start script (e.g. Transkribus.bat on Windows, Transkribus.sh on Linux) to pass the proxy server settings to Java as system properties. The command is wrapped here for readability; enter it on a single line in the script:
   java -Dhttps.proxyHost=<proxyserver>
        -Dhttps.proxyPort=<proxyPort>
        -Dhttps.proxyUser=<user name for proxy>
        -Dhttps.proxyPassword=<password for proxy>
        -jar Transkribus-0.7.0.jar

However, you will have to edit this file again after each update of Transkribus.

Logging in to the server is not possible via Transkribus, but on the website it works.

  • Error message “Already connected”: your Java installation may be outdated and unable to establish a secure connection to the server. You can check your installed version by opening a terminal/command line and entering “java -version”. If you encounter this problem, try updating Java on your machine. We recommend a current version of Java 8 (Oracle or OpenJDK). The Mac version of Transkribus includes a Java runtime. If you encounter this problem on a Mac, please download a new package from https://transkribus.eu and update your installation. If the error persists, please contact info@readcoop.eu, ideally including the log file of your installation (from the Transkribus directory: logs/TrpGui.log) and/or information on your Java version and operating system.
  • Since February 2017, old Transkribus versions are blocked from logging in for compatibility reasons. If you use a version older than 1.0.0 an update of the application may be necessary to solve this issue.
  • You may have to configure a proxy server via ‘main menu’ – ‘Proxy settings’.

Java Heap space / No more handles

  • 32 bit Java on a 64 bit Windows OS -> install 64 bit Java from here: https://java.com/de/download/manual.jsp#win
  • Too little RAM: Try to allocate more main memory by opening Transkribus.bat and setting the maximum heap size, e.g. java -Xmx2048m -jar Transkribus-1.5.4.jar
    • Start Transkribus with this bat file.
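Whether the installed Java is a 32 bit or 64 bit build can usually be read from the output of java -version. The sketch below is a heuristic, not an official check: the function name java_bitness is hypothetical, and it assumes that 64 bit builds print “64-Bit” in the VM line, as Oracle and OpenJDK HotSpot builds do.

```shell
# Hypothetical helper: read `java -version` output on stdin and report
# whether it advertises a 64-bit VM. Anything without the "64-Bit" marker
# is treated as 32-bit, which is an assumption of this sketch.
java_bitness() {
    if grep -q '64-Bit'; then
        echo 64
    else
        echo 32
    fi
}
```

On Mac or Linux, java -version 2>&1 | java_bitness reports the bitness of the default Java. In the -Xmx2048m option above, the number is the maximum heap size in megabytes.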

Logging in is prevented by the Firewall of your Internet Provider

  • Some IT departments are blocking the SSL port 443 and/or unknown applications via a firewall. Check with your IT department if that might be the case.

Norton Antivirus detects a threat and is blocking the zip file from being unpacked.

  • Solution: This is a false alarm which Norton gives when encountering software it is not familiar with (WS.Reputation.1). You should be able to restore the file from quarantine by following the instructions from the following resource [1].

Versions 0.6.5 and older cannot update (very long error message):

  • Please click on the “Home” button (upper left corner), then “Install a specific version”, select the newest version from “Releases” and tick the box beside “Download complete package”.
  • Afterwards click on “Update” or “Replace”. This way, the complete package is downloaded and the update should work.

On newer MacBooks (2016 onwards), there seems to be a problem when starting Transkribus directly out of the Downloads folder. Try copying the application to a folder where you have full system rights (e.g. the Applications folder) and start it from there.

If your starting problems persist, here you can find a workaround to start the app from your terminal

If you encounter mysterious error messages like “already connected”, your Java may not be up to date. Please update Java (https://www.java.com/de/download/) and try again.

For Mac users: if the integrated Java version is outdated, try to download the latest version of Transkribus from our homepage and replace the complete installation on your computer.

[Pro tip: the Mac version of the expert client comes with a Java runtime shipped within the application. If this Java version is outdated, you can try to delete it or replace it with an updated version. To find the files in the Mac Finder, right-click (or control-click) the Transkribus application in your Applications folder, click “Show Package Contents” in the context menu, then go to the subfolder “Contents/MacOS”. There, the subfolder “jre” contains that Java version. If you delete this folder, the application starter will try to find Java on your system.]

  • After opening the command file on the Mac, Transkribus says that there is a wrong Java version installed (1.6.0.65) instead of 1.7. However, there is the most current version of Java RE (1.8.0.66) installed.
    • The problem is that Java 1.6.0.65 is the default Java on the command line which the Transkribus.command uses. You can check the default version by opening the terminal and typing ‘java -version’.
    • To solve the problem you can either download the latest JDK as a .tar.gz package and unpack it into the Transkribus folder (the Transkribus.command file automatically checks for Java installations in its subdirectories). The package is available here:
   http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
    • Or you could make your Java 8 installation the default one on the command line, following e.g. the instructions here:
   http://myshittycode.com/2014/03/17/mac-os-x-setting-default-java-version/
  • On Linux, the package “libwebkitgtk” may not be installed. On Fedora you can install the package using dnf on the command line (use “yum” instead of “dnf” in older versions of Fedora):
   sudo dnf install webkitgtk

04 - Handwritten Text Recognition (HTR) Workflow

HTR engines cannot process text straight away. They need to be trained to recognise a certain style of handwriting. This can be achieved by creating at least 75 pages (around 15,000 words) of training data (images and transcripts) in Transkribus.

Firstly, you need to upload your documents to the platform. Secondly, you need to segment the pages of your collection into text regions and baselines. Thirdly, you need to transcribe each page as accurately as possible. For more information on these stages, have a look at our How to Guides.

The more training data, the better! But you can start to train the HTR with as little as 75 pages (15,000 words) of documents written in a neat hand.
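The figures above (75 pages, 15,000 words) imply an average of about 200 words per page. For documents with denser or sparser pages, the number of pages needed to reach the same word count can be estimated with a small calculation. This is only a rough sketch: the function name pages_needed is hypothetical, and the word target is taken from the text above rather than from any fixed rule in Transkribus.

```shell
# Hypothetical helper: estimate how many transcribed pages are needed to
# reach a word target, given the average number of words per page.
pages_needed() {
    target_words=$1
    words_per_page=$2
    # Integer ceiling division: round up to whole pages.
    echo $(( ($target_words + $words_per_page - 1) / $words_per_page ))
}
```

For example, pages_needed 15000 200 gives 75, while pages with only 120 words each would need 125 pages to reach the same amount of text.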

You should then contact the Transkribus team by email at info@readcoop.eu. They can activate the training button in Transkribus for you. This way you can create an HTR model which is specific to the collection of documents that you have been working with in Transkribus. Find out more in our How to Guide.

You can use your HTR model to automatically generate transcripts of your documents by clicking the “Run text recognition” button in the “Tools” tab in Transkribus. You can export your documents and search them in Transkribus by clicking the “Search” button in the Main menu. You can now also search your documents using our new Keyword Spotting tool.

HTR is not perfectly accurate, but low Word and Character Error Rates are possible. The latest experiments have generated transcripts with a Character Error Rate of around 5%. This means that roughly 95% of characters in an automatically generated transcript would be correct. For some successful examples of HTR, have a look at our Example Documents or our Success Stories from the READ project blog. You can measure the accuracy of your HTR model in Transkribus using the ‘Compare’ function in the ‘Tools’ tab.
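As a worked example of the error rate mentioned above: the Character Error Rate is commonly defined as the number of character edits (substitutions, insertions and deletions) needed to correct the automatic transcript, divided by the number of characters in the correct transcript. The sketch below only does this division. The function name cer_percent is hypothetical, and the ‘Compare’ function in Transkribus may differ in details such as how whitespace is counted.

```shell
# Hypothetical helper: Character Error Rate in percent.
#   $1 = number of character edits (substitutions + insertions + deletions)
#   $2 = number of characters in the reference (correct) transcript
cer_percent() {
    LC_ALL=C awk -v e="$1" -v n="$2" 'BEGIN { printf "%.1f\n", 100 * e / n }'
}
```

A page of 1,000 characters on which 50 characters had to be corrected gives cer_percent 50 1000, i.e. 5.0, which corresponds to the 5% figure quoted above.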

OCR and HTR are very similar technologies, but OCR is already in an advanced state, whereas HTR is still in an early phase. Unlike OCR, HTR does not focus on individual letters. Instead, it scans and processes the image of entire lines and tries to decode this data. The main difference from the user’s point of view is that the stage of Layout Analysis/Segmentation is integrated into the OCR engine, whereas it is a separate step in the workflow for HTR.

In theory, yes! The software needs to be trained to understand each style of handwriting. Every piece of training data submitted to Transkribus is helping to strengthen the overall accuracy of the HTR.

No! Documents uploaded to Transkribus are private by default. You can use the “Manage collections…” button in the “Server” tab of Transkribus to allow specific users to view and/or edit your collection if you wish.

Yes, we now have a Text2Image matching tool that can match existing text with an image. If you have lots of existing transcripts and would like to use these to train an HTR model, please consult our How to Guide.