
Success Story
Published: 3 years ago

Transkribus Projects at the Vienna City Library

The Vienna City Library has already implemented three projects with Transkribus. The first was the Lehmann address books. These Viennese address books list all of Vienna’s main tenants from 1859 to 1942, which makes them a valuable source for research into the history of the city. The address books were digitized from microfilm in 2011 and put online without OCR. To improve searchability, “Lehmann” was digitized again from the originals in 2018 and full-text recognition was carried out. However, common OCR programs produced too high an error rate when recognizing addresses and names in Fraktur script, so the library decided to work with Transkribus. Approximately 200,000 pages of the Lehmann books are now searchable online.

A page from the Lehmann address books © Wienbibliothek im Rathaus

Another project focused on handwritten text recognition. The Vienna City Library owns the estate of Franz Grillparzer, which was bequeathed to the City of Vienna in 1878. To mark the 150th anniversary of Grillparzer’s death, which will be commemorated in January 2022, the Vienna City Library is planning to make the Grillparzer manuscripts available online. For this purpose, the letters and manuscripts written by Grillparzer were sent to Transkribus for full-text recognition. After a model was created, around 15,000 scans were successfully recognized.

A manuscript by Franz Grillparzer © Wienbibliothek im Rathaus

The last project the library carried out with Transkribus was the text recognition of around 20,000 obituary notices. The challenge with these notices was their variety of fonts, languages and formats. In addition, presorting was not possible due to the sheer volume. Attempts with common OCR programs had failed because the various formats within a single death notice were not recognized. With Transkribus, however, it was finally possible to carry out successful text recognition of all the death notices. They are planned to go online by the end of 2021.

Example of an obituary notice © Wienbibliothek im Rathaus