Include the following for your Outreachy 31 project idea:
Project title: Micro-task Generator for Organizers on Wikipedia
Brief summary:
Develop a tool that automatically generates prioritized micro-tasks for Wikipedia articles to help organizers and new editors. The tool will analyze article metadata, maintenance templates, and engagement metrics to surface specific improvements, such as "add citations" or "fix dead links." This reduces the burden on campaign organizers and provides clear entry points for new contributors. The final deliverable will be a web application prototype suitable for deployment on Wikimedia Toolforge or as an on-wiki tool.
Skills required: Python, REST APIs, basic web development (HTML/CSS/JavaScript), Git.
Learning outcomes: Master the MediaWiki API ecosystem, build a complete tool from concept to prototype, implement data analysis algorithms, and collaborate with open-source community stakeholders.
Possible mentor(s): @Isaac @SBisson @SEgt-WMF
Application microtasks:
You are asked to complete this notebook, which is a starting point for the sort of tool that you might develop during this internship: https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Outreachy-Dec-2025/Micro-Task-Generator.ipynb
Using your knowledge of Python, do your best to complete the notebook. Imagine that your audience for the notebook is people who are new to Wikimedia data analysis (very possibly like you before starting this task) and provide lots of details to explain what you are doing and seeing in the data. Use a mixture of code cells and markdown to document what you find and your thoughts.
The full Outreachy project will involve more comprehensive coding than what is being asked for here with support from your mentors (and some opportunities for additional explorations as desired). This task will introduce some of the basic concepts and give us a sense of your Python skills, how well you work with new data, documentation of your code, and description of your thinking and results. We are not expecting perfection -- just do your best and explain what you're doing and why!
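To make the idea in the brief summary more concrete, here is a minimal sketch of how maintenance templates in an article's wikitext might be turned into micro-tasks. It is an illustration under stated assumptions, not the notebook's solution: the `TEMPLATE_TO_TASK` mapping, the `find_micro_tasks` helper, and the sample wikitext are all hypothetical, and a regular expression is only a rough stand-in for a proper wikitext parser.

```python
import re

# Hypothetical mapping from (lowercased) maintenance template names to
# micro-task labels. The selection is illustrative, not exhaustive.
TEMPLATE_TO_TASK = {
    "citation needed": "add citations",
    "unreferenced": "add citations",
    "dead link": "fix dead links",
    "orphan": "add incoming links",
}

# Captures the name of a template call such as {{Citation needed|date=...}}.
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def find_micro_tasks(wikitext):
    """Return a dict mapping micro-task label -> number of matching templates."""
    tasks = {}
    for match in TEMPLATE_NAME.finditer(wikitext):
        name = match.group(1).strip().lower().replace("_", " ")
        task = TEMPLATE_TO_TASK.get(name)
        if task:
            tasks[task] = tasks.get(task, 0) + 1
    return tasks


# Made-up wikitext for illustration only.
sample = (
    "{{Orphan|date=May 2024}}\n"
    "Paris is a city.{{Citation needed|date=May 2024}} "
    "It has a tower.<ref>[http://example.org]{{dead link}}</ref>{{citation needed}}"
)
print(find_micro_tasks(sample))
```

In a real notebook the wikitext would come from the MediaWiki API rather than a hard-coded string, and template redirects or aliases would need extra handling.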
Any other additional information that the interns should know about:
Some instructions for setting up your environment and receiving feedback:
- Make sure that you can login to the PAWS service with your wiki account: https://hub-paws.wmcloud.org/hub/login
- Using this notebook as a starting point, create your own notebook (see these instructions for forking the raw notebook to start with) and complete the code / questions. All PAWS notebooks are automatically associated with a public link, which can be shared back so that we can evaluate what you did. If you have trouble finding your notebook's public link, just ask.
- As you have questions, feel free to add comments to this task (and please don't hesitate to answer other applicants' questions if you can help).
- If you feel you have completed your notebook, you may request feedback and we will provide high-level feedback on what is good and what is missing. To do so, send a single email to all of the mentors (silviaegt@wikimedia.org, sbisson@wikimedia.org, and isaac@wikimedia.org) with the link to your public PAWS notebook. We will try to give this feedback once to anyone who would like it.
- When you are happy with your notebook, you should include the public link in your final Outreachy project application as a recorded contribution. We encourage you to record contributions as you go, as well as to track progress.
A compiled list of existing tools and workflows: Please review this non-exhaustive list of community tools that generate editing tasks. We are interested in how you might integrate, improve upon, or creatively synthesize these existing resources into your Python notebook (extra points if you can also consider ways to make this an on-wiki tool later on).
- WikiProject Template (Plantilla:Wikiproyecto; available only in Spanish, but Google Translate gives a good idea of what it does): A sophisticated template system that automatically generates and maintains a dynamic, standardized project page for a WikiProject, highlighting which tasks to do, which editors are active, etc.
- Citation Hunt: Retrieves articles based on Wikipedia categories and returns examples that require citations.
- List Building Tool: Takes a "seed" article and finds similar ones based on criteria like similar words or readership overlap.
- Popular Pages Bot: Identifies the most-read articles within a specific WikiProject that may need improvement.
- SuggestBot: Provides personalized article suggestions to editors based on their past contributions.
- PetScan: Generates lists of pages or Wikidata items matching highly specific, combinable criteria.
- BamBot's Cleanup Worklist Bot: Automatically generates listings of cleanup needs for all articles within a WikiProject (e.g., Africa; ⚠️ this page might take a while to load).
- Module:WikimediaCEETable: Builds tables of articles based on a list of Wikidata IDs.
- A spreadsheet of metadata signals: For a more detailed look at potential data points, you can refer to this collaborative document.
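Several of the tools above rank articles by a signal such as readership or cleanup needs. As a rough sketch of how such signals might be combined into one priority order, the example below multiplies a dampened pageview count by the number of open tasks. The scoring formula, the field names (`monthly_views`, `tasks`), and the records themselves are assumptions made up for illustration; none of the listed tools is known to use this formula.

```python
import math


def priority_score(monthly_views, task_counts):
    """Hypothetical score: more-read articles with more open tasks rank higher."""
    open_tasks = sum(task_counts.values())
    # log1p dampens very large view counts so popular pages do not dominate.
    return math.log1p(monthly_views) * open_tasks


def rank_articles(articles):
    """Return article records sorted by descending priority score."""
    return sorted(
        articles,
        key=lambda a: priority_score(a["monthly_views"], a["tasks"]),
        reverse=True,
    )


# Made-up records; real values would come from pageview and template data.
articles = [
    {"title": "Article A", "monthly_views": 50_000,
     "tasks": {"add citations": 1}},
    {"title": "Article B", "monthly_views": 2_000,
     "tasks": {"add citations": 3, "fix dead links": 2}},
    {"title": "Article C", "monthly_views": 900_000, "tasks": {}},
]

for article in rank_articles(articles):
    score = priority_score(article["monthly_views"], article["tasks"])
    print(article["title"], round(score, 1))
```

With this scoring, an article with no detected tasks always ranks last regardless of readership. Whether that is desirable, and how the signals should be weighted, is a design choice worth discussing in the notebook.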