The model ‘Swedish 17th century (Savo, Eastern Finland)’ is trained to read the Gothic handwriting style, also known as the ‘German’ handwriting style. It is tailored for 17th-century Swedish handwriting and was conceived as part of Ville-Pekka Kääriäinen’s doctoral project at the University of Helsinki.
The project focused on 17th-century Upper Savonia (Ylä-Savo, Iisalmi/Idensalmi parish, Eastern Finland). Consequently, the model’s ability to read proper nouns (personal and place names) from other regions may be limited, because the training data covers a narrow geographical area.
The model includes training data from various document collections held in the National Archives of both Finland and Sweden:
- District court (fi: kihlakunnanoikeus, swe: häradsrätten) records related to the Iisalmi parish from 1639–1699 (jurisdictions of Savolax 1639–1650, Kajana friherreskap 1651–1680, Lilla Savolax 1681–1699)
- Mentions of Iisalmi parish residents in the lagmansrätt (fi: laamanninoikeus) records from 1643–1699 (Legal district of Karjala, Karelska lagsagan)
- Letters sent to Count Per Brahe by local officials, clergy, and townspeople (Skoklostersamlingen, Rydboholmssamlingen)
- The complaints of the common people (fi: rahvaanvalitukset, swe: allmogens besvär) from the jurisdiction of Lilla Savolax (fi: Pien-Savo)
The image quality of the training data varies. Most of the model was developed before the National Archives of Finland re-digitized the court records, so much of the Finnish material had to be transcribed from low-quality microfilm copies. The material from the Swedish National Archives ranges from self-photographed, high-quality images to low-quality digitizations from microfilm.
The model adheres closely to the source material. Monetary units, measurement units, and other abbreviations are transcribed as they appear in the source, following its own logic, and are not expanded.
For instance, common currency units such as the mark and thaler (swe: daler) are rendered as m/m:r or D/D:r, depending on the context.
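Because the model leaves abbreviations unexpanded, anyone who needs expanded forms has to add them after transcription. The following is only a minimal sketch of such a post-processing step, not part of the model. The function name `expand_currency`, the expansion table, and the rule that an abbreviation is expanded only when it directly follows a numeral are all assumptions made for illustration; real records will need a richer rule set.

```python
import re

# Hypothetical expansion table: the abbreviations are the ones named above,
# the expanded spellings are an editorial choice, not model output.
EXPANSIONS = {"m": "mark", "m:r": "mark", "D": "daler", "D:r": "daler"}

# Assumption: a currency abbreviation follows a numeral, e.g. "3 D:r".
# Longer forms are listed first so "m:r" is not matched as a bare "m".
# The lookahead keeps ordinary words that start with m or D untouched.
PATTERN = re.compile(r"(\d+)\s+(m:r|D:r|m|D)(?![\w:])")


def expand_currency(line: str) -> str:
    """Expand currency abbreviations that directly follow a number."""
    return PATTERN.sub(lambda m: f"{m.group(1)} {EXPANSIONS[m.group(2)]}", line)


if __name__ == "__main__":
    print(expand_currency("3 D:r 16 m:r"))
```

Restricting the rule to abbreviations after a numeral is a deliberately cautious choice, since a single letter such as m or D can also occur in other contexts.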
The model has been created through substantial personal effort and commitment.
It is my hope that it will prove beneficial to others. I am open to collaboration to further develop this model. Please feel free to contact me at: v.kaariainen@gmail.com.
Ground truth (GT):
Pages: 1353 (training set) + 147 (validation set) = 1500 pages
Words: 472,655 (training set) + 51,613 (validation set) = 524,268 words
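The figures above can be checked, and the size of the validation share derived, with a few lines of arithmetic. This is just an illustrative sketch; the variable names are my own and only the page and word counts come from the description.

```python
# Page and word counts as stated in the ground-truth summary.
pages = {"training": 1353, "validation": 147}
words = {"training": 472_655, "validation": 51_613}

total_pages = sum(pages.values())
total_words = sum(words.values())

# Share of the ground truth held out for validation.
val_share_pages = pages["validation"] / total_pages
val_share_words = words["validation"] / total_words

# Average density of the transcribed pages.
words_per_page = total_words / total_pages

if __name__ == "__main__":
    print(f"Pages: {total_pages}, validation share {val_share_pages:.1%}")
    print(f"Words: {total_words}, validation share {val_share_words:.1%}")
    print(f"Average words per page: {words_per_page:.0f}")
```

Both by pages and by words, the validation set makes up a little under one tenth of the ground truth.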