English cycling diaries recognised by University of Warwick

We’ve got some terrific results to report relating to an interesting collection of documents held at the Modern Records Centre at the University of Warwick.

Archivist Elizabeth Wood and her team have recently trained a Handwritten Text Recognition (HTR) model to recognise the writing in a collection of cycling diaries written in English during the early twentieth century by David Allan Hamilton.

The pages of Hamilton’s diaries are small and frequently broken up with photos, maps and sketches of life on the road, so each page holds relatively little text. The team at Warwick therefore decided to submit a greater number of transcribed pages to train their model.

The Hamilton model was trained on around 200 transcribed pages (containing nearly 20,000 words) from one volume of Hamilton’s diaries.

The automated transcripts produced by this model have a very impressive Character Error Rate (CER) of just 5%, meaning that, on average, around 95% of characters are transcribed correctly by the computer.
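For readers who want to see what a figure like this means in practice, here is a minimal sketch of how a Character Error Rate is commonly calculated: the number of single-character edits (insertions, deletions and substitutions) needed to turn the automated transcript into the correct one, divided by the length of the correct transcript. This is an illustration under that assumption, not necessarily the exact calculation Transkribus performs, and the sample strings are invented for the example rather than taken from Hamilton's diaries.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # reference character missing from hypothesis
                curr[j - 1] + 1,     # extra character in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Invented sample line (20 characters) with one misread character.
correct = "cycled to Kenilworth"
automated = "cycled to Kenilworlh"
cer = character_error_rate(correct, automated)
print(f"CER: {cer:.0%}")
```

One wrong character in a 20-character line gives a CER of 5%, which is the same rate reported for the Hamilton model across its test pages.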

Screenshot of automated transcription in Transkribus. Page from diary of David Allan Hamilton, 1916-1923, from the National Cycle Archive, Modern Records Centre, University of Warwick [document reference: MSS.328/N93/1].

The team at the Modern Records Centre are currently working with the automated transcripts and are also exploring the possibility of training new models to process other diaries in their holdings.
