This is the third task for T290718, Automatically matching new Wikipedia articles with Wikidata items using Python, aimed at getting you familiar with searching for Wikidata items and finding the QIDs.
- You should already have a Wikimedia account and set up pywikibot (if not, do Tasks 1 and 2 first).
- Find some terms to search for. These could be name strings identified in previous tasks, or article titles (e.g., those not yet connected to Wikidata).
- Set up a script that connects to Wikidata, searches for the term, and returns the QID. Make sure it is the correct QID!
- Bonus: Explore how to identify the correct item when multiple items are returned, and how to identify false matches.
- Bonus: Think of other ways of finding the right Wikidata item that don't depend on Wikidata labels, based on other information in the article and potential Wikidata item matches.
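As a rough starting point for the search step, here is a minimal sketch that calls the wbsearchentities API module using only the Python standard library. The function names, the User-Agent string, and the exact-label filter are illustrative assumptions rather than part of the task; an exact label match is only a first heuristic and does not prove that the QID is correct.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_params(term, language="en", limit=5):
    """Return the query parameters for a wbsearchentities request."""
    return {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }


def extract_candidates(response):
    """Turn a decoded API response into (QID, label, description) tuples."""
    return [
        (hit["id"], hit.get("label", ""), hit.get("description", ""))
        for hit in response.get("search", [])
    ]


def exact_label_matches(term, candidates):
    """Keep only candidates whose label equals the term, ignoring case."""
    wanted = term.strip().casefold()
    return [c for c in candidates if c[1].strip().casefold() == wanted]


def search_wikidata(term, language="en", limit=5):
    """Query Wikidata over the network and return the candidate tuples."""
    query = urllib.parse.urlencode(build_search_params(term, language, limit))
    request = urllib.request.Request(
        API_URL + "?" + query,
        # Hypothetical User-Agent; replace it with your own contact details.
        headers={"User-Agent": "OutreachyTask3Sketch/0.1 (replace with contact)"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_candidates(json.load(reply))
```

Calling search_wikidata("Douglas Adams") returns a list of candidates; passing that list through exact_label_matches narrows it down, and the descriptions help you check by hand that the remaining QID is the right one.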
Save your code to a repository, or create a page like https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_2 (under your own username, with the ending changed to '3').
Once you are happy, send me a link to your page (by email, on my talk page, or replying to this ticket as you prefer). Make sure to also register it as a contribution on the Outreachy website ( https://www.outreachy.org/outreachy-december-2021-internship-round/communities/wikimedia/automatically-matching-new-wikipedia-articles-with/contributions/ )!
- You might want to use the API, see https://www.wikidata.org/w/api.php?action=help&modules=wbsearchentities
- Also see https://bitbucket.org/mikepeel/wikicode/src/master/example.py
- You can find pages that are not connected to Wikidata using e.g., enwp.querypage('UnconnectedPages')
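The last hint can be sketched as follows, assuming that enwp is a pywikibot Site object for English Wikipedia. The helper name unconnected_titles and the main function are hypothetical; the only pywikibot calls used are Site, querypage, and title.

```python
def unconnected_titles(site, total=10):
    """Return titles of up to `total` pages from Special:UnconnectedPages."""
    return [page.title() for page in site.querypage("UnconnectedPages", total=total)]


def main():
    # Imported here so the helper above can be exercised without pywikibot.
    import pywikibot

    enwp = pywikibot.Site("en", "wikipedia")
    for title in unconnected_titles(enwp, total=5):
        print(title)
```

Call main() once pywikibot is configured; the printed titles can then be used as search terms.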