Transcribing Bentham with a computer

The Bentham Project at University College London, which is producing the scholarly edition of the writings of the British philosopher Jeremy Bentham, has become increasingly involved with digital humanities over the past decade.  The project has digitised thousands of Bentham manuscripts and in 2010 launched one of the first academic crowdsourcing initiatives, Transcribe Bentham.  Experiments with Handwritten Text Recognition (HTR) have also been under way for the past few years.

A first HTR model was trained on around 900 pages of Bentham material, with very good results.  The ‘English Writing M1’ model can recognise pages written in a relatively neat hand by Bentham and his secretaries with a Character Error Rate (CER) of 5-10%, meaning that roughly 5 to 10 characters in every 100 are transcribed incorrectly.  This model is publicly available in Transkribus and can be applied to English handwriting from the 1800s and 1900s with good results.
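To make the CER figure concrete, here is a minimal sketch of how such a rate is commonly calculated: the edit distance (insertions, deletions and substitutions) between the automatic transcript and a correct reference, divided by the length of the reference.  This is a general illustration and not necessarily the exact procedure Transkribus uses; the function names and the sample strings are invented for this example and do not come from the Bentham manuscripts.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Smallest number of single-character insertions, deletions and
    substitutions needed to turn `reference` into `hypothesis`."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # reference character missing from hypothesis
                current[j - 1] + 1,                   # extra character in hypothesis
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Invented sample: a correct transcript and an imperfect automatic one
# (one wrong letter, one missing letter).
reference = "the greatest happiness"
hypothesis = "the greatcst hapiness"
print(f"CER: {character_error_rate(reference, hypothesis):.1%}")
```

In this sample, two errors across 22 reference characters give a CER of about 9%, which falls inside the 5-10% range quoted above.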

The Bentham Project is now working to improve the automated recognition of Bentham’s most difficult handwriting – written at a time when the philosopher was in his eighties and losing his sight.  Early results show a promising CER of 26%, which is a very good basis for Keyword Spotting as a research tool for scholars interested in Bentham’s ideas.

Find out more at the Transcribe Bentham blog!

Screenshot from Transkribus with automatically generated transcript. Box 31, fol. 78, UCL Bentham Papers, Special Collections, University College London.