Synchronising Wikidata and Wikipedias using pywikibot - Task 6
Open, Needs Triage, Public

Description

This is the sixth task for T276329, Synchronising Wikidata and Wikipedias using pywikibot. It is aimed at getting you familiar with looping through Wikipedia categories.

  1. You should already have a Wikimedia account and have set up pywikibot (if not, do Tasks 1 and 2 first).
  2. Look at the subcategories of https://en.wikipedia.org/wiki/Category:Wikipedia_categories_tracking_data_not_in_Wikidata and pick one. Say which one you have picked below, so that two people don't pick the same category! Note that the 'date of birth' and 'date of death' categories are not eligible for this task.
  3. Write a script that loops through that category, retrieves each article, and finds the relevant ID value that is not yet on Wikidata.
  4. Save the IDs to Wikidata!
  5. Bonus: check that there is not a Wikidata item that already has the same ID, and if there is, investigate what has happened.
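
As a rough starting point, the loop in the steps above could be sketched as follows. This is only a minimal sketch under stated assumptions, not a reference solution: the helper names `extract_template_id` and `sync_category` are hypothetical, the regular expression only handles simple template calls without nested templates, and the category, template name and property you pass in depend on the subcategory you picked (check the property number on Wikidata yourself).

```python
import re


def extract_template_id(wikitext, template):
    """Return the ID passed to {{template|...}}, or None if there is none.

    Hypothetical helper: looks for an 'id=' parameter first-come, or the
    first non-empty unnamed parameter. Nested templates are not handled.
    """
    pattern = r"\{\{\s*" + re.escape(template) + r"\s*\|([^{}]*)\}\}"
    match = re.search(pattern, wikitext, flags=re.IGNORECASE)
    if not match:
        return None
    for param in match.group(1).split("|"):
        name, sep, value = param.partition("=")
        if not sep:
            # Unnamed parameter: the whole string is the value.
            if param.strip():
                return param.strip()
            continue
        if name.strip().lower() == "id" and value.strip():
            return value.strip()
    return None


def sync_category(category_name, template, prop, limit=5):
    """Loop through a category and add missing IDs to Wikidata.

    Assumes pywikibot is installed and configured (Tasks 1 and 2).
    'limit' keeps test runs small; raise it once the edits look right.
    """
    import pywikibot  # imported here so the helper above works without it
    from pywikibot import pagegenerators

    site = pywikibot.Site("en", "wikipedia")
    repo = site.data_repository()
    category = pywikibot.Category(site, category_name)
    for page in pagegenerators.CategorizedPageGenerator(category, total=limit):
        value = extract_template_id(page.text, template)
        if value is None:
            continue  # template has no local ID on this page
        try:
            item = pywikibot.ItemPage.fromPage(page)
        except pywikibot.exceptions.NoPageError:
            continue  # article has no linked Wikidata item
        item.get()
        if prop in item.claims:
            continue  # a value already exists; compare by hand, don't overwrite
        claim = pywikibot.Claim(repo, prop)
        claim.setTarget(value)
        item.addClaim(claim, summary="Importing ID from English Wikipedia")
```

Calling `sync_category("Category:Some tracking category", "SomeTemplate", "P0000")` with your own category, template and property (the values shown here are placeholders) would then process the first few pages. It is worth printing the extracted values and checking them by eye before letting the script save anything.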

Save your code to a repository, or create a page like https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_2 (under your username - and change the ending to '6'.) Add the links to the edits at the end of the code as a comment.

Once you are happy, send me a link to your page (by email, on my talk page, or replying to this ticket as you prefer). Make sure to also register it as a contribution on the Outreachy website (https://www.outreachy.org/outreachy-may-2021-internship-round/communities/wikimedia/synchronising-wikidata-and-wikipedias-using-pywiki/contributions/)!

Hints:

  • You can probably reuse code from earlier tasks to do this.
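
For the bonus step, one possible approach is to ask the Wikidata Query Service whether any item already holds the ID before saving it. The sketch below assumes pywikibot's `pywikibot.data.sparql.SparqlQuery` helper is available in your installation; the function names `build_duplicate_query` and `find_items_with_id` are hypothetical. Keep in mind that the query service can lag slightly behind live edits, so a very recent addition may not show up.

```python
def build_duplicate_query(prop, value):
    """Build a SPARQL query listing items whose 'prop' equals 'value'."""
    # Escape backslashes and double quotes so the value is a valid literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return 'SELECT ?item WHERE { ?item wdt:%s "%s" . }' % (prop, escaped)


def find_items_with_id(prop, value):
    """Return the Q-ids of items that already use this ID (may be empty).

    Assumes pywikibot is installed and configured.
    """
    from pywikibot.data import sparql  # imported lazily, needs pywikibot

    rows = sparql.SparqlQuery().select(build_duplicate_query(prop, value))
    # Each row maps 'item' to an entity URL; keep only the trailing Q-id.
    return [row["item"].rsplit("/", 1)[-1] for row in rows or []]
```

If `find_items_with_id` returns an item other than the one linked to the article, that is the case to investigate by hand: the two items may be duplicates, or the ID on one of them may be wrong.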

Event Timeline

Hello Everyone! I want to try and take up the category of "Category:Date of death not in Wikidata" for this task.

Sorry, I should have said that 'Date of death' and 'Date of birth' are not suitable for this task, since they are not IDs, and Pi bot already synchronises them with Wikidata automatically, so the only ones left in those categories will be mismatches and complicated format strings. Sorry about that!

I see... Thank you for notifying! In that case, I will be working with Category:ATP template with ID not in Wikidata.