Brief summary
Wikimedia projects produce a lot of interesting data! The purpose of this project is to create a set of notebook-based tutorials and assets that will make it easier for individuals to access and use that data.
The primary focus of this project is improving technical documentation. The participant will apply technical writing, research, and programming skills while working toward the following outcomes:
Outcomes
- A library for working with SQL dumps
- A notebook tutorial that helps users decide whether PAWS or Toolforge is the better fit for their work with datasets, and explains why (What works on PAWS? What works on Toolforge?)
- A notebook tutorial on dumps that shows how to access them in both XML and SQL formats
- Proposals and drafts for additional notebook tutorials focused on improving the experience of users working with Wikimedia data
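
To give a sense of the kind of cell a dumps tutorial might contain, here is a minimal sketch of streaming page titles out of an XML dump with the Python standard library. The sample data, `local_name`, and `iter_page_titles` are hypothetical names made up for illustration, and the sample is a simplified stand-in for a real dump file, not its actual schema.

```python
import io
import xml.etree.ElementTree as ET

# Tiny in-memory stand-in for a dump file (assumption: real dumps are far
# larger, usually compressed, and declare an XML namespace on the root).
SAMPLE_DUMP = b"""<mediawiki>
  <page><title>Alpha</title><id>1</id></page>
  <page><title>Beta</title><id>2</id></page>
</mediawiki>"""


def local_name(tag):
    """Drop a namespace prefix such as '{http://example.org/ns}page'."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(stream):
    """Yield page titles one at a time instead of loading the whole file."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) == "page":
            for child in elem:
                if local_name(child.tag) == "title":
                    yield child.text
            # Discard the finished page's children to keep memory use low.
            elem.clear()


titles = list(iter_page_titles(io.BytesIO(SAMPLE_DUMP)))
print(titles)
```

For a compressed file on disk, the same function should accept a stream opened with something like `bz2.open(path, "rb")`, where `path` is wherever the dump was downloaded to.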
Skills required
- Python 3, SQL, JSON
- Jupyter notebooks
- Technical documentation
- Research