One of the READ project partners is helping to make keyword searching of handwritten documents a real possibility! The Pattern Recognition and Human Language Technology (PRHLT) research centre at the Universitat Politècnica de València is part of the HIMANIS project. HIMANIS stands for Historical MANuscript Indexing for user-controlled Search and, like READ, it aims to use new technology to open up access to cultural heritage documents.
The work of HIMANIS focuses on the Trésor des Chartes, a collection of medieval registers which document the charters, grants and privileges bestowed by French kings in the fourteenth and fifteenth centuries. Collaboration between archivists and computer scientists in HIMANIS means that users are now able to conduct a full-text search of these registers. Try it out!
For example, you can search for personal names (Alfonso) and place names (Paris), as well as words in French or Latin. It is also possible to perform more complicated Boolean search queries; you can find a full explanation and examples on the HIMANIS blog.
This search tool is a prototype and you can help to improve it! If you click on a word highlighted in the search results, you can give feedback on whether the word was spotted correctly. If you see words that were missed by the tool, you can double-click on them to ensure that a search will pick them up next time.
This form of searching is known as Keyword Spotting. It goes beyond conventional keyword searching because the technology detects similarities in images of words, rather than searching through computer-generated transcriptions of those words.
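For readers curious about what "detecting similarities in images of words" can look like in practice, here is a toy sketch in Python. It is not the HIMANIS implementation, which this post does not describe. It assumes each word image has already been reduced to a short list of numbers (a feature vector) and that the query is itself an example image of the word, and it ranks candidates by cosine similarity. The function names, the threshold, the locations and the feature values are all made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def spot_keyword(query_features, word_images, threshold=0.9):
    """Return (location, score) pairs whose image features resemble the query.

    No transcription is consulted: the comparison is between image features only.
    """
    hits = []
    for location, features in word_images.items():
        score = cosine_similarity(query_features, features)
        if score >= threshold:
            hits.append((location, score))
    # Best matches first
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical feature vectors for three word images (illustrative values only)
word_images = {
    "register 1, page 3, line 12": [0.9, 0.1, 0.4],
    "register 1, page 7, line 2": [0.1, 0.8, 0.3],
    "register 2, page 1, line 5": [0.85, 0.15, 0.45],
}

# Hypothetical features of an example image of the word being searched for
query = [0.9, 0.1, 0.4]

hits = spot_keyword(query, word_images)
for location, score in hits:
    print(f"{location}: similarity {score:.3f}")
```

In this sketch the first and third word images are returned as matches and the second is not, simply because their numbers resemble the query more closely. A real system would derive its features from the manuscript images themselves and, as the search tool here shows, would accept typed queries rather than example images.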
This impressive technology will be integrated into our Transkribus platform in the coming months. The PRHLT research centre at the Universitat Politècnica de València will also continue its experiments, this time working with theatre records from the Spanish Golden Age.