This model is trained on reports from the Detective Department of the Gothenburg Police, 1868-1902, held at the Swedish National Archives in Gothenburg. The ground truth for the model training consists of transcribed spreads from 1873, 1880, 1888, and 1896. The model was trained on 165,000 words, and the character error rate (CER) on the validation set is 2.3%.
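For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate is commonly computed: the Levenshtein (edit) distance between the reference transcription and the model output, divided by the number of characters in the reference. It is an illustration only, and it assumes the standard definition; the exact procedure used to obtain the 2.3% figure (for example, how whitespace or punctuation is treated) is not stated here. The function names `levenshtein` and `cer` are hypothetical.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn hyp into ref."""
    # prev[j] holds the distance between the first i-1 characters
    # of ref and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length.
    Assumes the standard definition, not any specific tool's variant."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(ref, hyp) / len(ref)
```

Under this definition, a CER of 2.3% corresponds to roughly 2 to 3 character errors per 100 characters of reference text.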
Link to the archive finding aid: https://sok.riksarkivet.se/arkiv/gj8w3gHtrH6cyG018W43t3 (the material used to train the model is in series A II)
The training of this model is part of a research and development project at the Swedish National Archives, carried out in collaboration with GPS400: Centre for Collaborative Visual Research at the University of Gothenburg, and Vinnova: Sweden’s innovation agency. Members of the public also took part through Citizen Science activities at the Regional State Archives in Gothenburg, where they transcribed most of the ground truth spreads used to train this model.