This model was trained as a student project in the master’s program “Digital Humanities” between November 2021 and January 2022.
The text corpus for the model consists of books published after Peter I’s reform of Russian orthography by the following printing houses:
the printing house of the Academy of Sciences in St. Petersburg, that of the Imperial Moscow University, that of Vilkovsky and Galchenko, and that of the Land Cadet Corps. The corpus also includes some decrees printed in civil script.
The training sources are books scanned by Rusneb (https://rusneb.ru/) and by Google Books.
The model performs well on Russian-language material, but it does not recognize other languages that may occur in texts of this period.