More and more archival holdings are being digitised, but thousands of document collections still exist only in manuscript form. Interested readers must therefore visit the archive in person to photograph and transcribe the documents they need.
The READ project is seeking to make this process easier with a new digitisation service. The Computer Vision Lab at Technical University Vienna is developing DocScan, an open-source Android mobile app that allows archive users to take high-quality images of historical documents.
DocScan automatically detects the page area of a document and provides real-time feedback on the quality of the image according to factors like perspective, sharpness and light. This allows users to take high-quality images that can be used for Handwritten Text Recognition in Transkribus, or simply for future research. The DocScan app will be connected to Transkribus so users can upload their images directly to our cloud.
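To give a feel for what this kind of feedback might involve, here is a minimal sketch in Python. It is not DocScan's actual code: the function names, the thresholds and the choice of a Laplacian-variance sharpness measure are all assumptions made for illustration. It takes a greyscale image as a list of rows of pixel values (0 to 255) and reports whether the shot looks blurry, too dark or too bright.

```python
from statistics import mean, pvariance


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the interior pixels.

    Sharp images have strong edges, so the Laplacian responses vary a lot;
    blurred or flat images give a low variance. Needs at least 3x3 pixels.
    """
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def quality_feedback(gray, min_sharpness=100.0,
                     min_brightness=60.0, max_brightness=200.0):
    """Return a list of warnings, or ["ok"] if no check fails.

    All three thresholds are hypothetical values chosen for this example.
    """
    brightness = mean(value for row in gray for value in row)
    messages = []
    if laplacian_variance(gray) < min_sharpness:
        messages.append("image looks blurry")
    if brightness < min_brightness:
        messages.append("too dark")
    elif brightness > max_brightness:
        messages.append("too bright")
    return messages or ["ok"]


# A high-contrast checkerboard stands in for a sharp, well-lit page;
# a uniformly dark image stands in for a blurred, underexposed shot.
checkerboard = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
dark_flat = [[10] * 8 for _ in range(8)]

print(quality_feedback(checkerboard))
print(quality_feedback(dark_flat))
```

A real app would run checks like these on each camera preview frame and would also need to assess perspective, which this sketch leaves out.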
The Computer Vision Lab is also working on a prototype of a ScanTent. This is a piece of equipment designed to hold a mobile phone in a stable position in order to produce a more standardised shot. This could be particularly handy for scanning bound volumes, where two hands are sometimes needed to keep the pages in place.
DocScan and the ScanTent can also be of use to archives, as they could enable institutions to build up a collection of user-generated content. QR code recognition or similar technology could be employed to ensure that images are organised correctly within an archive’s digital collections.
If you are interested in finding out more, you can read our reports:
Günter Mühlberger (University of Innsbruck), Markus Diem, Stefan Fiel and Florian Kleber (all at the Computer Vision Lab, Technical University Vienna), D5.14 ScanREAD.
Günter Mühlberger (University of Innsbruck), Markus Diem, Fabian Hollaus, Stefan Fiel and Florian Kleber (all at the Computer Vision Lab, Technical University Vienna), D81. Open Innovation Forum.
You can also take a look at the back-end of the DocScan app on our GitHub page.
We will be partnering with several archives to test these two products, and we plan to organise a ‘scanathon’ to see how quickly users can produce good-quality digital images. Stay tuned to hear more about the development and testing of the app!