What's in a name? Task 3
Closed, Resolved (Public)

Description

This is the third task for T300207, What's in a name? Automatically identifying first and last author names for Wikicite and Wikidata, aimed at getting you familiar with structured citations.

  1. Ask @Mike_Peel and @Pigsonthewing for an item about a scientific article to work on. This may be the same as one you've worked on in the previous tasks, or may be a different one. You can request an item on this task page, and one of us will reply.
  2. Load the item into the code you wrote for Task 2, and print out the author information from it.
  3. The item will also have a link to the journal, where BibTeX or RIS information will be available about the citation. Load this into your code (use https://docs.python.org/3/howto/urllib2.html ) and print out the author information it contains.
  4. Try to match up the authors in the Wikidata item with those in the BibTeX or RIS file, and print the information about each author from both sources together.
  5. Identify which is the first part of the author citation, and which is the last part. If you can, write these into the Wikidata item, using the P9687 and P9688 properties ('author first names' and 'author last names' respectively).
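
The matching and name-splitting steps above can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the function names (split_name, normalise, match_authors) and the sample names are hypothetical, the inputs are assumed to be plain name strings already extracted from the Wikidata item and the citation file, and the matching key (first initial plus surname) is only one possible heuristic. Compound surnames, single-name authors and non-Western name orders would need extra handling, and writing the results back to P9687 and P9688 is not shown.

```python
def split_name(name):
    """Split a name into (first, last).

    Handles 'Last, First' (common in BibTeX/RIS) and 'First Last'.
    Assumption: the last whitespace-separated token is the surname
    when there is no comma, which is wrong for many real names.
    """
    name = " ".join(name.split())  # collapse repeated whitespace
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
    else:
        first, _, last = name.rpartition(" ")
    return first, last


def normalise(name):
    """Build a crude matching key: (first initial, lower-cased surname)."""
    first, last = split_name(name)
    return (first[:1].lower(), last.lower())


def match_authors(wikidata_names, citation_names):
    """Pair each Wikidata name with a citation name sharing the same key.

    Returns a list of (wikidata_name, citation_name_or_None) tuples.
    """
    by_key = {normalise(n): n for n in citation_names}
    return [(wd, by_key.get(normalise(wd))) for wd in wikidata_names]


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any real item.
    wikidata = ["Jane Doe", "John A. Smith"]
    citation = ["Smith, John A.", "Doe, Jane"]
    for wd_name, cite_name in match_authors(wikidata, citation):
        print(wd_name, "<->", cite_name, "->", split_name(wd_name))
```

Keeping the split and the matching key in separate functions makes it easier to swap in a stricter or looser comparison once you see how the two sources actually differ for your item.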

Save your code to a repository, or create a page like https://www.wikidata.org/wiki/User:Mike_Peel/Outreachy_3 (under your username)

Once you are happy, send me a link to your page (by email, on my talk page, or replying to this ticket as you prefer). Make sure to also register it as a contribution on the Outreachy website ( https://www.outreachy.org/outreachy-may-2022-internship-round/communities/wikimedia/whats-in-a-name-automatically-identifying-first-an/contributions/ )!

Hints:

Event Timeline

Hello! See my attempt at Task 3 here: https://www.wikidata.org/wiki/User:PangolinMexico/Outreachy_3

If using RegEx is a suboptimal way to get BibTeX info, let me know - I'm happy to change my implementation as required. Thank you so much!
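
For anyone weighing up the RegEx route, here is a rough sketch of what it might look like. It assumes the author field sits on a single line in the form author = {...} or author = "...", and the sample entry and function names are made up for illustration. Nested braces or multi-line fields will defeat a pattern like this, which is the usual argument for a dedicated BibTeX parser.

```python
import re

# Hypothetical BibTeX entry used only to illustrate the pattern.
SAMPLE_BIBTEX = """@article{example2022,
  title = {An Example Article},
  author = {Doe, Jane and Smith, John A.},
  journal = {Journal of Examples},
  year = {2022}
}"""

# Matches a single-line 'author = {...}' or 'author = "..."' field.
AUTHOR_FIELD = re.compile(
    r'^\s*author\s*=\s*[{"](.+?)[}"]\s*,?\s*$',
    flags=re.IGNORECASE | re.MULTILINE,
)


def extract_author_field(bibtex):
    """Return the raw author string, or None if no author field is found."""
    match = AUTHOR_FIELD.search(bibtex)
    return match.group(1) if match else None


def split_authors(author_field):
    """Split a BibTeX author string on the ' and ' separator."""
    return [n.strip() for n in re.split(r"\s+and\s+", author_field) if n.strip()]


if __name__ == "__main__":
    field = extract_author_field(SAMPLE_BIBTEX)
    print(field)
    print(split_authors(field))
```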

Hi Mike @Mike_Peel & Andy @Pigsonthewing, hope you are doing well!

  • I put my contributions for task 3 at: https://www.wikidata.org/wiki/User:Jiehui_Ma/Outreachy_3; please let me know if any of the functions I developed need further improvement.
  • I also listed my thoughts on future contributions, and some initial ideas to be realized in the internship, for the final application. If possible, can we discuss the timeline to make it more suitable for your project's requirements? :-))

Thank you in advance for your advice!

Best Regards,
Jiehui

@Mike_Peel Thank you so much! I will then update my code to replace str.split() with RegEx and polish my final application.

Hello @Mike_Peel,
I am getting an error when fetching the URL https://www.researchgate.net/publication/253799173_Plasma_physics_and_radiation_hydrodynamics_in_developing_an_extreme_ultraviolet_light_source_for_lithography/citation/download , which you assigned to me for task 3.

I am using this code:

    import urllib.request
    import pywikibot

    # Fetch the citation download page and read the raw response body
    with urllib.request.urlopen('https://www.researchgate.net/publication/253799173_Plasma_physics_and_radiation_hydrodynamics_in_developing_an_extreme_ultraviolet_light_source_for_lithography/citation/download') as response:
        html = response.read()

and I get this error:

    urllib.error.HTTPError: HTTP Error 403: Forbidden
    CRITICAL: Exiting due to uncaught exception <class 'urllib.error.HTTPError'>

I think my code is right, so I cannot understand why I am getting this error.
Please tell me what I should do.

Hi @301295_kTH
I dealt with a similar problem when getting Task 3 set up. In most cases, what's happening is that the server can't tell who is making the request or where it is coming from, and it refuses requests that don't identify themselves. By default urllib sends a generic Python User-Agent, which some sites block. You need to add specific headers (such as User-Agent) to your request to fix this.

https://stackoverflow.com/questions/13303449/urllib2-httperror-http-error-403-forbidden

The information here really helped me --- let me know if it helps you!
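
In case it's useful, here is a minimal sketch of the header approach using only the standard library. The helper names (build_request, fetch_text) and the User-Agent string are placeholders I made up, so swap in something that identifies your own script. Some sites may still refuse automated requests even with headers set, so treat this as something to try rather than a guaranteed fix.

```python
import urllib.request

# Placeholder identification string (an assumption, not a required value).
DEFAULT_USER_AGENT = "Outreachy-Task3-Example/0.1 (contact: you@example.org)"


def build_request(url, user_agent=DEFAULT_USER_AGENT):
    """Create a Request that carries an explicit User-Agent header."""
    headers = {
        "User-Agent": user_agent,
        "Accept": "text/plain, */*",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_text(url):
    """Download the URL and decode the body using the declared charset."""
    with urllib.request.urlopen(build_request(url), timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Only builds the request here; call fetch_text(url) to actually download.
    req = build_request("https://example.org/citation.bib")
    print(req.full_url, req.header_items())
```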

Hi, @Mike_Peel and @Pigsonthewing,
Hope you are doing well.
Completing all three tasks was fun, sometimes challenging 🙈 too for me. But I really learned new things and explored 🧐 a lot during this time period. I am glad that I have chosen this project.
I would also like to express my gratitude for providing valuable and quick feedback on my tasks.
Kindly review my Task3 too.
Thank you for your support.
Update on tasks
✅ Task1
✅ Task2
✅ Task3

Hello @Mike_Peel and @Pigsonthewing, here's my final submission for task 3. Thank you for the experience. I look forward to working with you in the future.

https://www.wikidata.org/wiki/User:Azaya89/Outreachy_3

Edit: Are there any specific questions you would like us to answer before we submit our applications on the Outreachy application page?

Also, we will need an application timeline guide, as required in the final application.

Furthermore, if there's any additional task you would like us to work on while we wait for the final internship selection, I would like to take a shot at it. Thanks a lot!

Thank you for your attention.
The issue has been fixed: instead of the URL, I used the path of the downloaded file, which is working now.

Hello @Mike_Peel,
I hope you are safe and healthy.
I have completed task 3; here is the link: https://www.wikidata.org/wiki/User:Varsha_ahirwar_from_India/Outreachy_3
I have put separate code for each small step in task 3.
Looking forward to working with you in the future.

Thank you

Hi all, sorry, something unexpected came up. I’ll check and comment on your tasks as soon as I can, but please go ahead and submit applications now, you don’t need to wait for the tasks to be checked first.

Hello everyone,
I am going to submit my final application, but I am confused about the contribution URL. What should I put, since we have done three tasks and so have three URLs?
1- wiki URL for task 1
2- GitHub or wiki URL for task 2
3- GitHub or wiki URL for task 3

Should I keep all three links, or should I make a wiki page that contains all three links?
Or is there another way that I am not aware of?

It's up to you, you can add them all, or just the one you think is most appropriate. You should put at least one URL per task though.

Hi @Mike_Peel and @Pigsonthewing,

I hope everything is alright, and you are safe and healthy.
My attempt at Task 3 is here: https://www.wikidata.org/wiki/User:Akandoria/Outreachy_3.

Thank you for your time!

Mike_Peel claimed this task.

Hi all, sorry for the delay in replying. Everyone now has replies on the talk page (basically, everyone's work was accepted as completed), and we'll take this work into account while assessing and deciding on the internship. Thanks everyone for your work during the contribution period!

Hello @Mike_Peel,
Thank you so much for this update.
However, there is no comment on the talk page for my task 3. If there are any improvements or suggestions, please let me know; it will help me improve my work in the future.

If there is any Python project apart from the Outreachy internship project that I could contribute to, please let me know. I would be very happy to work more on Wikipedia's projects. This was my first time contributing to any open source organization, and Wikipedia was the only organization in Outreachy that I contributed to. I really enjoyed this time, and whatever the result of this contribution period, I am eager to contribute more to Wikipedia's projects in the future. I am also learning machine learning, so I want to apply my knowledge of ML in the real world, and I think Wikipedia would be a wonderful organization in which to explore more.

Thank you so much for your guidance in the contribution period. I hope to work under your guidance in the future, whether as an Outreachy intern or as an open-source contributor.

Thank you.

However, there is no comment on the talk page for my task 3. If there are any improvements or suggestions, please let me know; it will help me improve my work in the future.

Here: https://www.wikidata.org/wiki/User_talk:Varsha_ahirwar_from_India/Outreachy_3

If there is any Python project apart from the Outreachy internship project that I could contribute to, please let me know. I would be very happy to work more on Wikipedia's projects. This was my first time contributing to any open source organization, and Wikipedia was the only organization in Outreachy that I contributed to. I really enjoyed this time, and whatever the result of this contribution period, I am eager to contribute more to Wikipedia's projects in the future. I am also learning machine learning, so I want to apply my knowledge of ML in the real world, and I think Wikipedia would be a wonderful organization in which to explore more.

Thank you so much for your guidance in the contribution period. I hope to work under your guidance in the future, whether as an Outreachy intern or as an open-source contributor.

There are always things that need doing. For bot tasks in particular (using pywikibot), have a look at https://www.wikidata.org/wiki/Wikidata:Bot_requests . For MediaWiki development, have a look at https://www.mediawiki.org/wiki/How_to_become_a_MediaWiki_hacker . For content, pick a topic you're interested in working on and start editing. :-) If you have any specific questions or issues, feel free to ping me and I'll try to help!

Thank you so much for your reply and feedback on task 3.
It is my pleasure to take up the projects you suggested. If I face any issues while working on them, I will contact you.

Best regards.