Automatically matching new Wikipedia articles with Wikidata items using Python - Task 1
Closed, Resolved (Public)

Description

This is the first task for T290718, Automatically matching new Wikipedia articles with Wikidata items using Python, aimed at getting you familiar with Wikidata and how properties work within structured data.

  1. You should register a Wikimedia account if you don't already have one. You can do so at https://www.wikidata.org/w/index.php?title=Special:CreateAccount
  1. Pick the language Wikipedia that you are most familiar with - any language is OK (see https://www.wikipedia.org/ for a complete list), but Wikipedias with more articles will have more content to work with.
  1. Pick a type of article. This could be books, ships, arthropods, authors, sports, castles, bridges, chemists, museums, rivers, trees - anything you are interested in. Find a few of that type of article (say, 6-12) - as varied as you can.
  1. Have a look at how key facts are stored in the article - particularly in the infobox, but also look at the categories and the text. Start thinking about what you could define as very simple statements about the topic ('this is a human', 'this was published in 2021', 'this is made of stone', etc.)
  1. Have a look at the Wikidata item for each article. You can find that by clicking on the 'Wikidata item' link on the left-hand sidebar of the article (or similar for other languages). Each item has a 'Q' number, which is shown at the top of the page.
  1. You will see a set of properties with information about the topic. See how well they compare to the statements you were thinking of earlier. Are there obvious matches?
  1. Wikidata stores properties as "P" numbers (for example, 'instance of' is P31); you can find these by hovering over the property label. Start collecting the ones that are used for your type of article. Each property is linked to a value (date, number, text string, Q-number, filename, etc.).
  1. Start a page like https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_1 (change 'Mike_Peel' to your username). Follow the rough format there to document what you're seeing, e.g.:
* {{P|P31}}: [[:en:Lovell Telescope]] is a {{Q|Q184356}} in the infobox

The first part is the property, the second links to the article you were looking at (change 'en' to the relevant language code), and the 'Q' number is the value that it is linking to ('radio telescope' - or if it doesn't have a Q-number, give the string, date, filename etc.). Finally, say where in the article that piece of information is stored.

  1. See how many statements you can find matching properties, and which ones you can't find (list those anyway and we can come back to them). You should aim for around 15-20 different properties across at least 6 articles.
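The documentation line format described above can also be generated with a few lines of Python. This is only a minimal sketch: the helper name `format_statement` and its parameters are made up for illustration, and it assumes that a value counts as a Q-number when it is a "Q" followed by digits.

```python
def format_statement(prop, lang, article, value, location):
    """Build one documentation line in the rough format shown in the task.

    Hypothetical helper for illustration; not part of any Wikidata tool.
    """
    # Q-numbers are wrapped in the {{Q|...}} template; other values
    # (strings, dates, filenames) are written out as plain text.
    if value.startswith("Q") and value[1:].isdigit():
        value_text = "{{Q|" + value + "}}"
    else:
        value_text = value
    return (
        "* {{P|" + prop + "}}: [[:" + lang + ":" + article + "]] is a "
        + value_text + " in the " + location
    )


# Reproduces the example line from the task description.
print(format_statement("P31", "en", "Lovell Telescope", "Q184356", "infobox"))
```

The "is a" wording is copied from the example; for other properties you would probably want to phrase the sentence by hand.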

Bonus 1: You will see that some properties have qualifier values. Document those as well, and try to understand how they work.
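To get a feel for how qualifiers sit alongside a statement's main value, here is a rough Python sketch that walks over an item's statements. The dictionary layout follows the Wikidata entity JSON as far as I understand it (`claims`, `mainsnak`, `qualifiers`), so please check it against a real item. `SAMPLE_ENTITY` is hand-written for illustration and is not real data; the function names are made up, and P580 ('start time') is used only as an example qualifier.

```python
# Hand-written sample, NOT real data from Wikidata: one P31 statement
# with one qualifier attached to it.
SAMPLE_ENTITY = {
    "claims": {
        "P31": [
            {
                "mainsnak": {
                    "snaktype": "value",
                    "property": "P31",
                    "datavalue": {
                        "type": "wikibase-entityid",
                        "value": {"entity-type": "item", "id": "Q184356"},
                    },
                },
                "qualifiers": {
                    "P580": [
                        {
                            "snaktype": "value",
                            "property": "P580",
                            "datavalue": {
                                "type": "time",
                                "value": {"time": "+2000-01-01T00:00:00Z"},
                            },
                        }
                    ]
                },
            }
        ]
    }
}


def snak_value(snak):
    """Reduce a snak to a simple value (Q-number, time string or raw value)."""
    if snak.get("snaktype") != "value":
        return None  # 'somevalue' and 'novalue' snaks carry no datavalue
    value = snak["datavalue"]["value"]
    if isinstance(value, dict) and "id" in value:
        return value["id"]
    if isinstance(value, dict) and "time" in value:
        return value["time"]
    return value


def list_qualifiers(entity, prop):
    """Return (main value, {qualifier property: [values]}) for each statement."""
    results = []
    for statement in entity.get("claims", {}).get(prop, []):
        qualifiers = {
            qual_prop: [snak_value(s) for s in snaks]
            for qual_prop, snaks in statement.get("qualifiers", {}).items()
        }
        results.append((snak_value(statement["mainsnak"]), qualifiers))
    return results


print(list_qualifiers(SAMPLE_ENTITY, "P31"))
```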

Bonus 2: Look for additional properties that could be used but currently aren't (see https://www.wikidata.org/wiki/Wikidata:List_of_properties ), and document those.

Bonus 3: Add new properties to the Wikidata items you have been looking at. At the bottom of the page, click on 'Add statements', and you can input a property (by P-number or text), and a value (by Q-number or text). If you can't see the link, or it doesn't work, check the top-right of the page to see if it has a padlock - this means that the item is protected and you will need to make edits to other items instead (you will be able to come back to this item once your account is auto-confirmed, so you can note down the item anyway to come back to it!)

Once you are happy, send me a link to your page (by email, on my talk page, or replying to this ticket as you prefer). Make sure to also register it as a contribution on the Outreachy website ( https://www.outreachy.org/outreachy-december-2021-internship-round/communities/wikimedia/automatically-matching-new-wikipedia-articles-with/contributions/ )! I'll send you a reply to say whether it is accepted or not on the talk page for the contribution.

Event Timeline

Hello @Mike_Peel, I have been approved to participate in this year's Outreachy contribution stage and I am interested in this project.

Below is my submission for the first task and I will be glad to know what you think. Thanks.

https://www.wikidata.org/wiki/User:Abudahakam/Outreachy_1

Hi, it looks like a good start! Remember that the value should be clearly linked with the property: so when you talk about the logo image for the first item, you should identify the value (in this case, the filename) for it from the Wikipedia article. I've also clarified the instructions: ideally try to get around 15-20 different properties, across at least 6 articles, please.

Hi, I need help creating a page on Wikidata. I am unable to edit the username. Or do I need to create an account first, like nancy123/outreachy_1?
I am trying to follow this step:
Start a page like https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_1 (change 'Mike_Peel' to your username). Follow the rough format there to document what you're seeing, e.g.:

Looking at your account, you have a different username here to on-wiki? So I think your page should be created at: https://www.wikidata.org/wiki/User:Nancy_Sal/Outreachy_1

Hello @Mike_Peel, thanks for the help above.
I have started the task at the link below. I would appreciate your feedback.
https://www.wikidata.org/wiki/User:Nancy_Sal/Outreachy_1

Hi @Mike_Peel, I'm also having problems starting a page. It has only my name and surname (like here, but with capital letters at the beginning) and I can't add "/Outreachy_1".

Hi @Mike_Peel, I have completed this task after getting 15-20 different properties across at least 11 articles. Here is the link to my task: https://www.wikidata.org/wiki/User:Osamaahmed17/Outreachy-1

Hello @Mike_Peel, I am continuing with task one. Here is my link; let me know if I am on the right track.
https://www.wikidata.org/wiki/User:Nancy_Sal/Outreachy_1
I also find myself getting confused about what exactly the task is about. I would appreciate any pointers to more material on it.
Thank you.

Hi @Mike_Peel, I have completed the task. Please check whether I am on the right track.
https://www.wikidata.org/wiki/User:Nafiya_Ahmed/Outreachy_1

Hello respected Sir @Mike_Peel,
I am sending the link for the first task. Please have a look and let me know if I made any mistakes.
https://www.wikidata.org/wiki/User:Veeshah/Outreachy_1

Hi Nancy, good start, keep going! The main aim of this Outreachy project is to match Wikipedia articles and Wikidata items with each other, using whatever information is available in the article and items. The main aim of this task is to get you familiar with the structure of Wikipedia articles and Wikidata items, and to find things that are in common between them.

Hi Irene, I think you've sorted this out now? I can see a page at https://www.wikidata.org/wiki/User:Irene_Ovadia/Outreachy_1 ?

Hello @Mike_Peel,

Here is a link for the first task ( https://www.wikidata.org/wiki/User:Kelvin_Wachira/Outreachy_1 ).

The insertion and editing of images and the table of content were a bit of a surprise and I was wondering if there is documentation of what the Wiki markup supports and how it should be used.

B,
Kelvin

Hi Mike, yes I could do it! I'm editing it right now, trying to add new properties to a Wikidata item

Hello all, hello @Mike_Peel! I am completely stuck: I cannot create my page with /Outreachy_1. Please help! I have the same issue as @irene_ovadia and @Nancy_123, but I still do not get it. Thank you very much in advance.

Hi @Pandamasha, here is how I got around it: once you have created the Wikidata account, follow the link https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_1 and replace the username in it with your own username, meaning the name you used when registering. If it contains spaces, replace them with underscores.
@Mike_Peel, I hope I am helping.

Hello Pandamasha.

To create the page for this task, go to a page like this ( https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_1 ). In the URL, change "Mike_Peel" to the username you are using, i.e. "Pandamasha".

The new URL will be https://www.wikidata.org/wiki/User:Pandamasha/Outreachy_1. On opening it you will land on a page under your user page that has no data or does not exist yet. Click on the "create this page" link and you should have a page/markdown you can edit.

This is how I understand it and I stand to be corrected.
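The same URL construction can be written as a small Python helper. This is just a sketch of the steps described above; the function name `outreachy_page_url` is made up, and it assumes that replacing spaces with underscores is the only change the username needs.

```python
from urllib.parse import quote


def outreachy_page_url(username, subpage="Outreachy_1"):
    """Build the user subpage URL by swapping in your own username.

    Hypothetical helper: spaces in the username become underscores,
    as suggested in the comments above.
    """
    title = "User:" + username.strip().replace(" ", "_") + "/" + subpage
    # Keep ':' and '/' readable; anything unusual gets percent-encoded.
    return "https://www.wikidata.org/wiki/" + quote(title, safe=":/")


print(outreachy_page_url("Pandamasha"))
print(outreachy_page_url("Nancy Sal"))
```

Opening the resulting URL while logged in should offer the "create this page" link if the page does not exist yet.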

@Nancy_123 and @0703! Thank you very very much! I tried everything, really everything, but this ...
Thank you!

Glad you found a solution in the end, and many thanks to @Nancy_123 and @0703 for helping!

(I'm behind on giving feedback/answering questions, sorry! Am trying to catch up asap.)

Hi all, I think I'm caught up with giving feedback for this task now. I've mostly put feedback on the talk pages. Please ping me if I've missed reviewing yours!

The insertion and editing of images and the table of content were a bit of a surprise and I was wondering if there is documentation of what the Wiki markup supports and how it should be used.

There's good documentation at https://en.wikipedia.org/wiki/Help:Wikitext - also see the language links on the left if you prefer documentation in another language.

Hello @Mike_Peel, I have completed task 1. Kindly have a look at it.
https://www.wikidata.org/wiki/User:Nancy_Sal/Outreachy_1
Looking forward to your feedback.
Thank you.

Dear @Mike_Peel,
Sorry, but I'm a little confused. Again. Did I understand the task correctly? Should we only record the cases that coincide with our own view? Do we need to record cases that do not match our definition?
Should we point out errors and discrepancies in the articles or in Wikidata, or not? For example, the Russian and English versions of the article on ethology do not coincide at all: in the Russian version the creator of the term and first ethologist is Isidore Geoffroy Saint-Hilaire, and in the English version it is Charles Darwin. There is no mention of Charles Darwin in the Russian article, and so on.

@Nancy_123 @Ope28 : you have answers on the talk pages. :-)

It's entirely up to you. The overall aim of the project is to be able to find things in common between the article and the item (so you could later say: "yes, this article matches this item" or "no, these are about different topics"). But if there are disagreements, please note them as well, as it will be really interesting to hear about them!
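As a very rough illustration of "finding things in common", the Python sketch below compares facts noted from an article with the statements on an item and separates agreements, disagreements and facts missing from the item. It is not the project's actual matching method; the function name and the property/value pairs in the example are made up for illustration.

```python
def compare_statements(article_facts, item_statements):
    """Split article facts into agreements, disagreements and article-only facts.

    article_facts:   {property: value} noted from the article (infobox, text)
    item_statements: {property: [values]} taken from the Wikidata item
    Hypothetical helper, for illustration only.
    """
    agree, disagree, missing = {}, {}, {}
    for prop, value in article_facts.items():
        if prop not in item_statements:
            missing[prop] = value  # the item has no statement for this property
        elif value in item_statements[prop]:
            agree[prop] = value
        else:
            disagree[prop] = (value, item_statements[prop])
    return agree, disagree, missing


# Made-up example values, not taken from a real article or item.
article = {"P31": "Q184356", "P17": "Q145", "P571": "1957"}
item = {"P31": ["Q184356"], "P17": ["Q30"]}

agree, disagree, missing = compare_statements(article, item)
print("agree:", agree)
print("disagree:", disagree)
print("missing from item:", missing)
```

Many agreements and no disagreements would suggest "yes, this article matches this item"; the disagreements are the cases worth noting on your page.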

Thank you for your answer, @Mike_Peel! I'll keep analysing, and I'll think about how to report mismatching items. ))) Everyone, have a wonderful weekend!

Hi @Mike_Peel,

I'm late with my submission due to some inconvenience.

Kindly evaluate my task 1 submission: https://www.wikidata.org/wiki/User:Nuzhatmila/Outreachy_1

Good evening @Mike_Peel,
Please find below a link to my completed first task:
https://www.wikidata.org/wiki/User:Pandamasha/Outreachy_1_English. I will be very glad to receive your comments.

Reply on the talk page! :-)

Hi @Mike_Peel,

Can you please have a look at my first task: https://www.wikidata.org/wiki/User:Odohemma/Outreachy_1. I look forward to your reply.

Hi, I've replied on the talk page. :-)

Hello @Mike_Peel . Kindly have a look at my 1st submission. Below is the link. I will be glad to hear from you. Thank you.
https://www.wikidata.org/wiki/User:Suha_098/Outreachy_1

You have a reply on the talk page. :-)

Thanks a lot 🙏