Citizen science, community science, crowdsourcing science, volunteer monitoring: whatever you want to call it, this type of scientific research always centres on one basic principle, namely that the data collection is done not by professional scientists but by members of the public. These “citizen scientists” willingly give up their free time to help collect and collate data for research projects, making it possible to process a much larger amount of data than would normally be possible.
The Transkribus platform has been used in several high-profile citizen science projects, in which volunteers have transcribed or corrected transcriptions of various types of historical documents. So in this post, we’re going to take a closer look at what citizen science is and how you can set up your own citizen science project with Transkribus.
The basic idea of citizen science
As we said in the introduction, a citizen science project is one in which volunteers are (at least partly) responsible for collecting data. This could be anything from counting wildlife in the garden to discovering planets in images of outer space or measuring levels of phytoplankton while sailing at sea. The data collected is then verified and analysed by professional scientists.
Such projects often involve hundreds and sometimes thousands of volunteers, enabling researchers to collect a much wider range of data than would be possible without this public participation. Another benefit of citizen science projects is their cost-effectiveness. Volunteers usually give up their time and resources for free, making it possible to collect a lot of data at minimal cost.
The importance of technology in citizen science
Technology has revolutionised citizen science projects. Previously, a citizen scientist would have to log everything on paper and send it in, whereas now they can input their findings into apps or similar tools. This is not only less time-consuming for volunteers but also allows researchers to receive data from around the world in real time and analyse it continuously.
Using technology in citizen science projects also helps to build a sense of community. It makes it possible for citizen scientists working on a project to connect with each other and share experiences. Some citizen science projects also incorporate leaderboards or other competitive elements into their apps, so that volunteers can compare their data collection activities with others and see who has collected the most data or who has collected it the fastest. This sense of competition often helps citizen scientists to stay engaged in the project, resulting in more data for the professional scientists.
Data quality: is citizen science reliable?
This is usually one of the first questions people have about citizen science. If the volunteers are collecting data without any supervision (and often with no qualifications or hands-on experience in scientific work), how can we make sure that the data they collect is reliable?
Firstly, there have been several studies (such as this one, this one, and this one) that have quantitatively shown that the data collected by citizen scientists is accurate enough to be used in the scientific process. These studies do stress, however, that the design of the project is key to its success. For example, the data collected should be checked either by a professional scientist or by many other citizen scientists. This could involve multiple volunteers reviewing the same photo or proofreading the same transcription. Training citizen scientists properly at the start of the project is also very important for generating accurate data, particularly if volunteers have never been involved in such projects before.
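To make the idea of several volunteers checking the same transcription a little more concrete, here is a minimal sketch in Python. It is not taken from any of the studies or from Transkribus itself: the function name `consensus`, the `min_agreement` threshold, and the sample readings are all made up for illustration. It simply accepts a reading when more than half of the volunteers agree and flags everything else for a professional to check.

```python
from collections import Counter


def consensus(readings, min_agreement=0.5):
    """Return the most common reading if enough volunteers agree, else None.

    `min_agreement` is a hypothetical threshold: the share of volunteers
    that must be exceeded before a reading is accepted without review.
    """
    if not readings:
        return None
    # Normalise whitespace so trivial spacing differences do not split the vote.
    normalised = [" ".join(r.split()) for r in readings]
    text, votes = Counter(normalised).most_common(1)[0]
    return text if votes / len(normalised) > min_agreement else None


# Invented example: three volunteers transcribed the same two lines.
lines = {
    "line-1": ["Anno 1824 den 3 Maj", "Anno 1824 den 3 Maj", "Anno 1824 den 8 Maj"],
    "line-2": ["Jens Hansen", "Jens Hansson", "Iens Hansen"],
}

results = {line_id: consensus(readings) for line_id, readings in lines.items()}

# Lines without a clear majority are passed on for expert review.
needs_expert = [line_id for line_id, text in results.items() if text is None]

print(results)
print(needs_expert)
```

In this invented example, the first line is accepted because two of the three volunteers agree, while the second line has three different readings and would go to an expert. A real project would probably want a stricter threshold or more reviewers per line.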
How is citizen science used in digital humanities?
As you may have realised from the types of projects mentioned in this post, citizen science has traditionally been used mainly in the fields of biology and ecology. But many of the more recent studies have taken place in digital humanities — the study of digital technologies and their implementation in the humanities. Many digital humanities projects involve creating digital versions of paper sources, such as books, manuscripts, and records.
There are several reasons why the scientific community is keen to digitise these kinds of sources. Firstly, it makes it easier to find relevant information within them. Many sources are hundreds or thousands of pages long, and finding the couple of pages relevant to your research question can be a time-consuming task. However, a digital version can be searched for keywords in a matter of seconds, saving valuable research time.
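As a rough illustration of why a digital version is so much quicker to search, here is a small Python sketch. It assumes the transcription is already available as plain text, one string per page. The function name `find_keyword` and the sample pages are invented for this example and have nothing to do with how any particular platform stores or searches its documents.

```python
def find_keyword(pages, keyword):
    """Return (page_number, line_text) for every line containing the keyword."""
    needle = keyword.casefold()  # case-insensitive comparison
    hits = []
    for page_number, text in sorted(pages.items()):
        for line in text.splitlines():
            if needle in line.casefold():
                hits.append((page_number, line.strip()))
    return hits


# Invented sample: a three-page transcription keyed by page number.
pages = {
    1: "Register of births\nParish of St Mary",
    2: "Baptised: Anna, daughter of the miller\nWitness: the smith",
    3: "The miller paid two shillings",
}

hits = find_keyword(pages, "Miller")
for page_number, line in hits:
    print(f"page {page_number}: {line}")
```

The same loop works just as well on a source that is thousands of pages long, which is exactly the kind of task that would take hours with the paper original.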
Digital versions can also be enriched much more easily than non-digital versions. Researchers can tag certain words or phrases to provide additional information about the topic, or to highlight particular information within the source. This not only makes the digital version more interesting for readers but it can also aid collaboration between researchers. Each researcher can enrich the text with their own interpretations or knowledge and in doing so, share it with other researchers.
How do citizen scientists help with digitising sources?
Citizen scientists play a vital role in this digitisation process, both in creating digital versions and in enriching them. The first thing to do when digitising a paper source is to transcribe it. Sometimes, this is done manually by volunteers skilled in the language and handwriting used in the document, who read the paper document and type the words into a computer program.
However, manual transcription is quite time-consuming, so more and more citizen science projects are starting to speed things up with transcription platforms such as Transkribus. Transkribus uses AI to create automatic transcriptions of handwritten documents, which citizen scientists can then proofread. It takes much less time for a citizen scientist to proofread an automatic transcription than it does to create a manual transcription from scratch. Because of this, Transkribus allows citizen scientists to process many more documents than would be possible manually.
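To give a feel for what proofreading involves, the Python sketch below compares an automatic transcription with a proofread version and lists only the words the volunteer had to change. It uses the standard library's `difflib` module. The function name `corrections` and the two sample sentences are invented for illustration, and the sketch does not reflect how Transkribus itself tracks changes.

```python
import difflib


def corrections(automatic, proofread):
    """List (old_words, new_words) pairs for each change the proofreader made."""
    a, b = automatic.split(), proofread.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replaced, inserted, or deleted words
    ]


# Invented example: the automatic transcription misreads two words.
automatic = "the rnill stands by the riuer"
proofread = "the mill stands by the river"

changes = corrections(automatic, proofread)
for old, new in changes:
    print(f"{old!r} -> {new!r}")
```

In this made-up line, the volunteer only had to correct two of the six words rather than type all six, which is where the time saving comes from.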
What kind of citizen science projects are there?
Transkribus has been used in many different citizen science projects in digital humanities. Here are two good examples.
The Europeana project is an initiative funded by the European Union. It aims to digitise items of cultural heritage from museums, archives, and libraries across Europe and create a fully accessible online collection. Many of those artefacts are handwritten sources, which are transcribed by citizen scientists via the Transcribathon platform.
Previously, Transcribathon only allowed volunteers to transcribe documents manually. But in 2021, Europeana and Transkribus collaborated on a new version of the platform, which would generate automatic transcriptions of the documents. These could then be proofread by the citizen scientists working on the project.
You can find out more information about the collaboration in this blog post.
Led by the Aarhus City Archives in Denmark, the RetroDigitaliserung project focuses on digitising sources written in old Gothic handwriting, which can be very difficult to read. The project is a collaboration between many small Danish archives that would otherwise lack the resources to organise a project of this scale on their own.
RetroDigitaliserung trains citizen scientists, the majority of whom are already of retirement age, to transcribe sources using Transkribus. To do this, they organise workshops around the country and create online resources, teaching the volunteers how to use the software. They have also worked with history students at the University of Aarhus. As the volunteers complete transcriptions, these are published on the project website, making them accessible to everyone.
Anette Larner of the RetroDigitaliserung project recently gave a wonderful presentation about the project, which you can find on YouTube.
How can I set up my own citizen science project with Transkribus?
Transkribus is ideal for citizen science projects in digital humanities. The platform allows many different people to work on the same documents or collections, making it easy to combine with citizen science methods.
The first step is to learn how Transkribus works yourself. Having a good knowledge of the platform will help you to set up the project in the most efficient way and solve any problems that the volunteers are having. If you’ve never used Transkribus before, there are plenty of how-to guides on our website, as well as video tutorials on YouTube.
Once you have some experience in Transkribus, you can set up a citizen science project. To do this, all the documents that need transcribing should be uploaded to Transkribus and organised into a collection. This collection can then be shared with the volunteers. Each citizen scientist also needs their own Transkribus account, so that they can view the collection that has been shared with them.
The volunteers can then either create their own manual transcriptions of the documents or proofread the automatic transcriptions created by Transkribus. If you are training your own model for the project, you will first need to provide a certain number of manual transcriptions, known as “Ground Truths” (you can find out more about Ground Truths and the training process in this how-to guide). These can also be transcribed by citizen scientists.
If you need to train your citizen scientists to read a certain type of handwriting before they start work on the actual transcriptions, we can also recommend the Transkribus Learn platform. Transkribus Learn teaches people how to read many older styles of handwriting through a series of quizzes, which become progressively harder. You can also add your own documents to the platform and train your volunteers in the specific handwriting used in your project.