The Linnean Society of London has recently produced some promising results in its experiments with our Transkribus platform.
Founded in 1788, the Linnean Society is the world’s oldest natural history society. Its collections contain thousands of documents and animal specimens that once belonged to the Swedish botanist Carl Linnaeus, who is credited with formalising the taxonomy of living species that is still widely used today. The Linnean Society is one of the READ project’s Memorandum of Understanding partners and hosted our Digital Toolbox conference in October 2016.
After submitting training data based on eighteenth-century handwriting in Swedish, English, French and Latin to the Transkribus team, the Linnean Society now has a Handwritten Text Recognition (HTR) model that can recognise some of Linnaeus’ handwriting with a Character Error Rate (CER) of 22%. The Linnean Society deliberately selected difficult pages (like the one above) to challenge our technology, with complicated layouts, intricate handwriting, several languages and multiple hands. We hope that these results can be improved in the future if training is focused on a particular language or hand. Even so, these early results already open up the exciting possibility of Keyword Spotting, a new search tool available in Transkribus that uses Handwritten Text Recognition technology to deliver more accurate search results than conventional keyword searching of transcripts.
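For readers unfamiliar with the metric, the Character Error Rate is commonly defined as the number of character insertions, deletions and substitutions needed to turn the recognised text into the correct transcription, divided by the length of the correct transcription. A CER of 22% therefore corresponds to roughly 22 character edits per 100 characters of reference text. The short Python sketch below illustrates that common definition; it is an assumption about how the figure is calculated rather than a description of Transkribus internals, and the function names and sample strings are hypothetical.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcription."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: two misread characters in a 15-character line.
truth = "Systema Naturae"
recognised = "Systcma Naturac"
print(f"CER: {character_error_rate(truth, recognised):.1%}")
```

In this made-up example, two substitutions over 15 reference characters give a CER of about 13%.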