The wonderful world of Wikidata
Most people are familiar with Wikipedia, the world’s largest encyclopaedia and the source of most of the answers whenever you talk to Alexa or Siri. Higher education initiatives to close knowledge gaps on Wikipedia are also increasingly common, mostly focusing on improving and adding articles. Underpinning Wikipedia, however, is the database known as Wikidata. Wikidata is a vast collection of structured data but, more than that, it connects data to data, and anyone can then use those connections to tell stories. For example, Ada Lovelace (Q7259) is the child (P40) of Lord Byron (Q5679), and the Loch Ness Monster (Q49658) is resident (P551) in Loch Ness (Q49650). Although Wikidata contained 112,863,167 items as of August 2024, there are still significant gaps in the data. The existing data is also difficult to explore and visualise without a technical background.
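To make the idea of "data connected to data" concrete, here is a minimal Python sketch of the two examples above as subject, property, value statements. It is an illustration only and not how Wikidata stores anything internally. The names `Statement`, `LABELS`, `STATEMENTS`, `describe`, and `statements_about` are made up for this sketch, and it assumes that the "child" statement is recorded on the parent's item, so Lord Byron is the subject.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One Wikidata-style statement: subject item, property, value item."""
    subject: str  # Q-number of the item the statement is about
    prop: str     # P-number of the property
    value: str    # Q-number of the item the statement points to


# English labels for the IDs mentioned in the text above.
LABELS = {
    "Q7259": "Ada Lovelace",
    "Q5679": "Lord Byron",
    "Q49658": "Loch Ness Monster",
    "Q49650": "Loch Ness",
    "P40": "child",
    "P551": "residence",
}

# Assumption: the "child" statement sits on the parent's item,
# so Lord Byron is the subject and Ada Lovelace is the value.
STATEMENTS = [
    Statement("Q5679", "P40", "Q7259"),
    Statement("Q49658", "P551", "Q49650"),
]


def describe(statement: Statement) -> str:
    """Replace each ID with its label and join the three parts."""
    return " -> ".join(LABELS.get(part, part) for part in statement)


def statements_about(item_id: str) -> list[Statement]:
    """Return every statement whose subject is the given item."""
    return [s for s in STATEMENTS if s.subject == item_id]
```

Calling `describe(STATEMENTS[0])` gives `Lord Byron -> child -> Ada Lovelace`. Because every part of a statement is itself an identifier, the value of one statement can be the subject of another, which is what lets people follow the links and tell stories with the data.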
The start of a beautiful (knowledge exchange) friendship
As Industrial Liaison at the School of Computer Science at the University of St Andrews, it is my job to connect the staff and students at the School with those outwith the University through knowledge exchange, research, and teaching opportunities. One of those connections was with Dr Sara Thomas (Wikimedia UK), who was aware of outstanding gaps in Wikidata and what was needed to fill them. At our Doors Open event in April 2023, we discussed how we might work more closely together, and the natural starting point was to propose some Senior Honours projects. These projects are undertaken by our final year students over two semesters and require ~10 hours/week of the student’s time. Each student chooses from a smorgasbord of project proposals and I was delighted that every one of my projects was picked!
And so it begins
The students started their projects in September 2023. Each student tackled a different challenge, whether that was collaborating with external partners to add new datasets, visualising existing data, or inviting contributors to fill the gaps. Data cleaning was a major challenge, as not all data is available on a website in a clean and accessible format, especially if the content has been added over a long period of time. There was also a steep learning curve as the students got to grips with the Wiki community policies and processes, whilst also submitting ethics applications to undertake participatory design research.
The projects were:
- Wiki Loves Earth by Mary Olaleye
  - Provided a mechanism for those unable to walk the routes in person, whether due to disability or financial restrictions, to explore them from home, inspired by the Wiki Loves Earth photo competition
- Monument Mapper by Patsy Ng
  - Provided walking directions to enable users to ‘bag’ photos of monuments to upload to Wikimedia Commons
- Scottish Brick History by Grace Young
  - Created a visualisation to enable users to explore a brand new and painstakingly cleaned dataset of Scotland’s industrial past from Scottish Brick History
  - Data available by SPARQL Query of P8048
- Memorials to Women by Jennifer Shaw
  - Visualised the geographic distribution of memorials to women in Scotland, enriched by the upload of a brand new and painstakingly cleaned dataset from Glasgow Women’s Library
  - Data available by SPARQL Query of P8700
- Wiki Loves Monuments by Yuxin Zhang
  - Enabled existing contributors to the Wiki Loves Monuments UK photo competition to explore the impact and distribution of their contribution
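For readers who would like to try the SPARQL queries mentioned in the list, the following is a rough Python sketch of how one might list items that carry a given property, using the public Wikidata Query Service endpoint. The function names `build_query` and `run_query`, the `limit` parameter, and the User-Agent string are assumptions made for this illustration; the students' own queries may well look different.

```python
import json
import re
import urllib.parse
import urllib.request

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(property_id: str, limit: int = 10) -> str:
    """Build a SPARQL query listing items that have the given property."""
    # Guard against typos: property IDs look like "P" followed by digits.
    if not re.fullmatch(r"P[1-9]\d*", property_id):
        raise ValueError(f"not a Wikidata property ID: {property_id!r}")
    return (
        "SELECT ?item ?itemLabel ?value WHERE {\n"
        f"  ?item wdt:{property_id} ?value .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def run_query(query: str) -> list[dict]:
    """Send the query to the endpoint and return the result rows (needs network access)."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            # Hypothetical identifier; replace with your own contact details.
            "User-Agent": "wikidata-blog-example/0.1 (illustrative sketch)",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]
```

For example, `run_query(build_query("P8048"))` would return up to ten items that have a P8048 value, each with its English label. The same query text can also be pasted straight into the Wikidata Query Service web interface, which needs no code at all.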
The student perspective
Grace developed the Scottish Brick History site and had this to say about her project:
“I initially picked the project due to an interest in data visualisation. Additionally, the chance to make a well rounded dataset more accessible and preserve it in a more organised manner stood out to me as an incredible opportunity. Although I had difficulties gathering and cleaning data from the original Scottish Brick History website, seeing the end product on Wikidata made it all worthwhile as I could see immediate improvements as a result of my work. Creating the visualisation from the data uploaded to Wikidata helped demonstrate the importance of the project, as feedback was incredibly positive and new insights could be immediately gained from the various visualisations I chose to implement.
I was particularly proud of the primary visualisation, the map view, created using Leaflet, as it is an excellent tool to visually see the areas of Scotland most affected by the brick and tile industry. The map view additionally has several different options to filter the locations, alongside having other visual options to improve the accessibility of the map. A screenshot of this map view can be seen below.
I continue to maintain the website to ensure that the visualisations of this important data remain accessible for anyone who wishes to view them. I gained a lot of new skills from the project, and the chance to work with key stakeholders to formulate requirements and gather feedback was incredibly valuable for my own personal experience. I plan to stay involved with the Wiki community in the future.”
Impact of the project so far
There have been several outcomes so far because of the students’ hard work:
- Three students have continued to develop their projects post-graduation with the intention of making them available permanently
- Additional Wiki-based projects with new partners available to Senior Honours students in 2024
- One student used their project as an example of persistence in a job interview, and was successful
- Voluntary Summer Team Enterprise Project (STEP) collaboration to plug other gaps on Wikidata
- Poster presentation at the School of Computer Science Doors Open day in April 2024
- Poster presentation at Wikimania 2024 in Katowice, Poland
- Outline plans for an EPSRC Network grant to build capacity in addressing digital literacy and knowledge gaps using the Wiki projects
  - If this idea intrigues you, then please do get in contact!
On reflection
I was delighted with how well the projects went, but there is always room for improvement. For example, I would give all students introductory training on the Wiki projects and their community so that they know exactly where to go to find help and the information that they need. I would design a blanket ethics document to cover all their projects, rather than expecting each of them to create one. I would also add their usernames to a tracking tool called the Outreach Dashboard. This tool, created by the WikiEdu team, enables you to quantitatively measure the impact of edits made by a specific group of contributors. This would enable me to see the impact of the projects on the wider Wiki community, including its readers. Finally, I would definitely include external partners in future projects. The students found this an incredibly valuable component and appreciated that their projects had real-world impact, rather than being left on a shelf. If anyone is interested in incorporating Wiki projects into their research, teaching, or knowledge exchange practice then please do get in touch with me (ksrh1@st-andrews.ac.uk).
Top Image credit: Planemad, Public domain, via Wikimedia Commons
Middle Image credit: Grace Young