This is the sixth task for T276329, Synchronising Wikidata and Wikipedias using pywikibot, aimed at getting you familiar with looping through Wikipedia categories.
- You should already have a Wikimedia account and set up pywikibot (if not, do Tasks 1 and 2 first).
- Look at the subcategories of https://en.wikipedia.org/wiki/Category:Wikipedia_categories_tracking_data_not_in_Wikidata and pick one. Say which one you have picked below, so that two people don't pick the same category! Note that the 'date of birth' and 'date of death' categories are not eligible for this task.
- Write a script that loops through that category, retrieves each article, and finds the relevant ID value that is not yet on Wikidata.
- Save the IDs to Wikidata!
- Bonus: check that there is not a Wikidata item that already has the same ID, and if there is, investigate what has happened.
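As a starting point, the steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a reference solution: the category name, the template name (`Twitter`) and the property (`P2002`) are example choices that you should replace with the ones for the category you picked, and the helpers `extract_template_id`, `duplicate_query` and `run` are hypothetical names made up for this sketch. The template parsing is deliberately simple and will not cope with nested templates.

```python
import re

# Assumptions for illustration only: swap in the category you picked, the
# template that holds the ID on Wikipedia, and the matching Wikidata property.
CATEGORY = "Category:Twitter username not in Wikidata"
TEMPLATE = "Twitter"
PROPERTY = "P2002"


def extract_template_id(wikitext, template_name):
    """Return the ID from the first matching template call, or None.

    Looks for an 'id=' (or '1=') parameter first, then falls back to the
    first unnamed parameter. Nested templates are not handled.
    """
    pattern = re.compile(
        r"\{\{\s*" + re.escape(template_name) + r"\s*\|([^{}]*)\}\}",
        re.IGNORECASE,
    )
    match = pattern.search(wikitext)
    if match is None:
        return None
    positional = None
    for param in match.group(1).split("|"):
        key, sep, value = param.partition("=")
        if sep and key.strip().lower() in ("id", "1"):
            return value.strip() or None
        if not sep and positional is None:
            positional = param.strip()
    return positional or None


def duplicate_query(property_id, value):
    """Build a SPARQL query for items that already carry this ID value."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return 'SELECT ?item WHERE { ?item wdt:%s "%s" }' % (property_id, escaped)


def run(limit=5):
    """Visit at most `limit` pages in the category and add missing IDs."""
    # Imported here so the helpers above work without pywikibot installed.
    import pywikibot
    from pywikibot import pagegenerators

    site = pywikibot.Site("en", "wikipedia")
    repo = site.data_repository()
    category = pywikibot.Category(site, CATEGORY)
    pages = pagegenerators.CategorizedPageGenerator(category, recurse=False)
    for count, page in enumerate(pages):
        if count >= limit:
            break
        value = extract_template_id(page.text, TEMPLATE)
        if value is None:
            continue  # no usable ID found in the article text
        try:
            item = pywikibot.ItemPage.fromPage(page)
        except pywikibot.exceptions.NoPageError:
            continue  # the article has no Wikidata item yet
        item.get()
        if PROPERTY in item.claims:
            continue  # the item already has a value for this property
        # Bonus: look for other items that already use the same ID.
        holders = list(pagegenerators.WikidataSPARQLPageGenerator(
            duplicate_query(PROPERTY, value), site=repo))
        if holders:
            print(page.title(), value, "already used by",
                  [holder.getID() for holder in holders])
            continue  # investigate by hand rather than saving
        claim = pywikibot.Claim(repo, PROPERTY)
        claim.setTarget(value)
        item.addClaim(claim, summary="Importing ID from English Wikipedia")
        print(page.title(), "->", item.getID(), value)
```

Calling `run(limit=5)` from a configured pywikibot environment would make live edits, so start with a small limit and check each edit by hand before going further.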
Save your code to a repository, or create a page like https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_2 (under your username, with the ending changed to '6'). Add the links to the edits at the end of the code as a comment.
Once you are happy, send me a link to your page (by email, on my talk page, or by replying to this ticket, as you prefer). Make sure to also register it as a contribution on the Outreachy website (https://www.outreachy.org/outreachy-may-2021-internship-round/communities/wikimedia/synchronising-wikidata-and-wikipedias-using-pywiki/contributions/)!
Hints:
- You can probably reuse code from earlier tasks to do this.