
Design and Develop a tool to correct false depicts claims manually on Wikimedia Commons
Open, Needs Triage · Public

Description

Proposal:
We could build a micro-contribution tool that enables volunteers to correct most of the depicts claims that were entered as structured data on Wikimedia Commons images, whether by hand during upload or through other tools.

Background

A significant part of the structured data has been entered through other applications (tools) developed to collect structured data for images on Commons. Although this is a fast way to collect structured data, we realized that in tools like ISA, where campaigns are organized and top contributors win prizes, the competition may push some participants to enter ‘wrong’ or ‘unclear’ information, which then needs to be verified.

It may not yet be clearly outlined how the structured data team plans to handle redundant, unclear and off-topic structured data. The verify-SD tool proposes a manual way for interested volunteers to help double-check the data that was entered as structured data.

This tool will be a simple application that works on both web and mobile (that is, a mobile-first design). Users will be able to search for the Wikimedia Commons categories they want to work on. The images in the chosen category will then be displayed to them, with questions about what they see in the images, focusing on the depicts statements that have been added to those images. All the user has to do is click ‘YES’ or ‘NO’: ‘YES’ means we retain the depicts claim, and ‘NO’ means we remove it.
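The YES/NO flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's design: the `DepictsClaim` class and both helper functions are hypothetical names, and the sketch only builds the parameters for the Wikibase `wbremoveclaims` API action without sending anything.

```python
from dataclasses import dataclass

DEPICTS_PROPERTY = "P180"  # the "depicts" property used on Wikimedia Commons


@dataclass
class DepictsClaim:
    """Hypothetical container for one depicts statement on an image."""
    claim_id: str       # statement GUID as returned by the Wikibase API
    media_id: str       # MediaInfo entity ID of the image, e.g. "M1"
    depicted_item: str  # Wikidata item ID of the depicted thing


def claims_to_remove(answers):
    """Return the GUIDs of claims the volunteer answered NO to.

    `answers` is an iterable of (claim, confirmed) pairs, where
    `confirmed` is True for YES (keep) and False for NO (remove).
    """
    return [claim.claim_id for claim, confirmed in answers if not confirmed]


def build_removal_request(claim_ids, csrf_token):
    """Build (but do not send) parameters for action=wbremoveclaims."""
    return {
        "action": "wbremoveclaims",
        "claim": "|".join(claim_ids),  # several GUIDs are separated by "|"
        "token": csrf_token,
        "format": "json",
    }
```

Claims answered with YES need no API call at all in this sketch, since retaining a claim leaves the image unchanged.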

Current state of affairs

After a quick look around for tools developed to address this need, I couldn't find one. It is still under investigation whether such a tool has already been developed by the Structured Data Team or by volunteers in the movement.

Development:

The first version of this tool should be launched at the end of the said program, with full usage beginning from there. The project will be managed on Phabricator and the project code will reside on Gerrit. The tool will be hosted on Toolforge.

Tools and Technologies:
The proposed tools and technologies for this project are:

  • Python
  • JavaScript and related libraries
  • HTML/CSS
  • SQL

Mentors: @Eugene233, @NavinoEvans

Microtasks: https://www.mediawiki.org/wiki/Good_first_bugs

Other Proposed Tasks: T245759 T230942 T231751 T226306

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 20 2020, 5:57 PM
Eugene233 updated the task description. · Feb 20 2020, 6:22 PM
Eugene233 updated the task description.
Eugene233 updated the task description. · Feb 20 2020, 6:31 PM

Hello everyone! I am Karma Dolkar, an Electronics and Communication Engineering sophomore at Indian Institute of Technology, Roorkee. I am very interested in such topics as this one! Hence, out of curiosity, I have also contributed to Wikimedia's "Citation Hunt" project (https://github.com/eggpi/citationhunt). A few suggestions/ideas:

  • We could arrange the data in decreasing order of contentiousness so that we can deal with the more controversial (and perhaps incorrect) data first.
  • We could probably consider highlighting/marking content or depicts claims that haven't been verified in a long time.

I would love to contribute to this project. Looking forward to an awesome learning experience!

Also, is there some project code anywhere that I can see and contribute to (like a github link)? Thanks!

Thanks very much for your interest in this project, @Karma2902. This project has no code yet, and I strongly suggest you attempt the proposed tasks above to get a hint of what might be built.

Okay, will do so. Thank you @Eugene233 !

Eugene233 updated the task description. · Feb 26 2020, 1:08 PM
Eugene233 added a subscriber: NavinoEvans.
SMe12435 added a subscriber: SMe12435. · Edited · Mar 2 2020, 10:02 AM

Hi everyone, this is Sushant Mehta, pursuing Computer Science Engineering at Manipal University Jaipur. I have a keen interest in both front-end and back-end technologies, and I have worked on both with some of the growing startups in the country. I am looking forward to contributing to this project in the upcoming GSoC.

Regards. :)
www.gitlab.com/SMe12435

@SMe12435 thanks for your interest in this proposal. Please feel free to go through the project idea and look at the proposed microtasks for the project. Feel free to ask questions if you have any.

Hello everyone, this is Nikunj Bansal, completing a B.Tech in CSE with a specialization in Machine Learning at the University of Petroleum and Energy Studies, Dehradun, India. I love working in web development and have done some good projects in it, such as developing the site for the UPES-ACM & ACM-W Student Chapter. I have also done some research on computer vision in the field of remote sensing, so my Python skills are quite good. I would like to draft a proposal for this problem because it seems quite interesting to me.
I just want to confirm with the mentors whether I am lacking in any technology that we are going to use in this project, so that I can fill that gap. I have attached my GitHub and LinkedIn profiles for your reference.

Thank you,
Best Regards,
https://github.com/Nikunjbansal99
https://in.linkedin.com/in/nikunj-bansal-6a78a416a

Sonalsk added a subscriber: Sonalsk. · Edited · Mar 8 2020, 3:51 PM

Hello everyone
I am Sonal Kushwaha, a computer science student from New Delhi, India. I would love to be a part of this project, as the idea behind it excites me and I want to contribute to it as best I can.
I am familiar with Python, Java, web technologies (HTML, CSS, Bootstrap), SQL and JSP, and I am working with JavaScript too. I have worked on a web development project that used HTML, CSS, JS, Bootstrap and JSP.
Here is my GitHub for further details: https://github.com/sonalsk

It would be really helpful if you could tell me how to get started with this project
Thank you

Here is my LinkedIn for further details: https://www.linkedin.com/in/sonal-kushwaha/

Aklapper updated the task description. · Mar 8 2020, 5:14 PM

Hi and welcome everybody!

It would be really helpful if you could tell me how to get started with this project

Please see https://www.mediawiki.org/wiki/Google_Summer_of_Code/2020 for general information. If you have specific questions, please ask! :)

Hi, I am Sanyam, an Electrical and Electronics Engineering sophomore at BITS, Pilani. I am interested in working on this idea as a GSoC summer project. I am used to working with Python-based backends and have worked full-stack. I have browsed through the codebase on Gerrit and developed a thorough understanding of it. I have also submitted a patch and understood how Gerrit works. I have contributed to hashtags.
Can someone suggest which other microtasks to pick up from the ISA workboard (for example, which category: backlog or incoming bugs)? It would be really helpful. Thanks.
GitHub: https://github.com/sanyam-git

Can someone suggest what other microtasks to pick up from the ISA workboard

According to my understanding:

  • You can help with any open tasks
  • You do not have to restrict to just ISA-related tasks

Where should I discuss/propose a subtask/microtask?

Gabrielchl added a comment. · Edited · Mar 10 2020, 6:07 PM

You could just create the task, just like any other task.
Maybe ping the mentors of this task if you want them to add the new task under the new one.

For this project, you can work on https://www.mediawiki.org/wiki/Good_first_bugs as microtasks. As this project is about developing a new tool, proposing microtasks doesn't apply here, I think.
Do you have more to add @Eugene233?
As far as creating a proposal is concerned, see step 9 here https://www.mediawiki.org/wiki/Google_Summer_of_Code/Participants#Application_process_steps.

@srishakatux I will look around for some tasks and add them if I can find any. I would go for other smaller issues all the same.

Hi, I am Mohamed Reda, studying Electrical Engineering at KSU in Egypt. I am interested in the project and I have some questions.
1- Is there an API we can use to get unverified claims? If it exists, can I have the link, or will we build it from scratch?
2- How would we authenticate admin accounts?
Thanks for your time.

NavinoEvans added a comment. · Edited · Mar 12 2020, 6:02 PM

Hi all,

Thanks so much for all of your interest in this project, it's great to see so many skilled people here!

I'll just answer a few of the questions that have been raised as most of you will need similar info. I'll also give a bit of background about the likely starting point for the project.

1. Use ISA tool as template
As the proposed tool shares a lot of similarity with the ISA tool, the fastest way to create a working MVP for the tool would be to start from a fork/clone of the ISA code.
Because of this, it would be good to become familiar with how this tool works. You can try using it to learn how it works, and you can clone the code from here: https://gerrit.wikimedia.org/r/admin/projects/labs/tools/Isa
Note: the setup instructions are mostly complete in the README file but do have gaps, so just ping me or @Eugene233 if you're having trouble setting up.

For the design of the new tool to be created, it would be worth everyone thinking about how the app will work in terms of changes to the ISA design.
We will have to discuss and agree on some mock wireframes before any significant work can begin.

2. ISA todo list
If you'd like to contribute to ISA, you can have a look at the workboard on phabricator here: https://phabricator.wikimedia.org/tag/isa/
Check the "incoming bugs" and "Incoming features" to see if there is something you'd like to try out.

3. Using Gerrit for contributing code
You will have to use Gerrit for submitting commits, so you will need to work through the process here to get set up and used to how it works: https://www.mediawiki.org/wiki/Gerrit/Tutorial

4. ISA tech and libraries
The ISA tool uses the following main libraries, so these are handy things to learn about if you're not already familiar with them:

  • Flask
  • SQLAlchemy
  • Flask-Babel
  • Jinja templates
  • Webpack

Hopefully that helps get you acquainted. Don't hesitate to contact us if you have any further questions.
Really looking forward to working with you :D

@Nikunjbansal99, @Sonalsk, hopefully this message helps with your initial questions, fire away if you need anything else

Hi @Mohamedreda26, thanks for your interest!

You can see from the ISA code which API calls we are using by following the instructions in the last message for cloning the repository. But, in short we use:

For your point 2) can you explain a bit more? Do you mean your authentication for Gerrit access?

Hi @Sanyam.wikime, sorry it was a bit disorganised, but I've cleaned up the workboard now, as described in my message above. You can now just pick tasks from the "Incoming bugs" or "Incoming features" columns.

I am trying to run the Isa app but I get [Errno 2] No such file or directory: '/~Isa/isa/config.yaml'
For point two, what I meant is whether the reviewer should have some special permissions or not.

Gabrielchl added a comment. · Edited · Mar 12 2020, 7:58 PM

I am trying to run the Isa app but I get [Errno 2] No such file or directory: '/~Isa/isa/config.yaml'

(There should be a way to generate that file, but I'm not sure how, so here's what I did.)

Create config.yaml with this:

SECRET_KEY: '$(python -c "import os; print repr(os.urandom(24))")'
SQLALCHEMY_DATABASE_URI: <sqlalchemy_db_uri>
SQLALCHEMY_TEST_DATABASE_URI: <sqlalchemy_test_db_uri>
OAUTH_MWURI: https://meta.wikimedia.org/w/index.php
OAUTh_EDIT_URI: https://test-commons.wikimedia.org/w/api.php
CONSUMER_KEY: <oauth_consumer_key>
CONSUMER_SECRET: <oauth_consumer_secret>
SQLALCHEMY_POOL_RECYCLE: 90

SQLAlchemy database URIs: https://docs.sqlalchemy.org/en/13/core/engines.html
Wikimedia OAuth consumer registration: https://meta.wikimedia.org/wiki/Special:OAuthConsumerRegistration/propose
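Note that a YAML file does not run shell substitutions, so the `$(python -c ...)` expression in the `SECRET_KEY` line above would be stored as a literal string. One way to get a genuinely random value is to generate the line with a short script and paste it into config.yaml. This is a minimal sketch, assuming any sufficiently random string is acceptable as the key; the function name is hypothetical and not part of ISA.

```python
import secrets


def make_secret_key_line(num_bytes=24):
    """Return a config.yaml line holding a random hex secret key.

    24 random bytes mirror the os.urandom(24) call in the original
    snippet; hex encoding keeps the value safe inside YAML quotes.
    """
    key = secrets.token_hex(num_bytes)
    return f"SECRET_KEY: '{key}'"


if __name__ == "__main__":
    # Copy the printed line into config.yaml in place of the SECRET_KEY entry.
    print(make_secret_key_line())
```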

Gabrielchl added a comment. · Edited · Mar 12 2020, 8:05 PM

For point two, what I meant is whether the reviewer should have some special permissions or not.

What do you mean by "The reviewer", reviewer of what? Do you mean gerrit patch reviewer? Or the task's owner? This GSoC task's mentor?

Users will be able to search Wikimedia Commons categories which they will desire to work on and then the images in this category will be displayed to them asking them questions about the things they see in the images with emphasis laid on the depicts statements which have been added to the images. Should this user have any special permissions?

Should this user have any special permissions?

No. To add structured data to an image on Wikimedia Commons, one technically doesn't even have to be logged in to do so.

Yes, this is exactly right. With the ISA tool, it's just restricted to users who have a Wikimedia account, as we are using the OAuth extension for logging in, but as @Gabrielchl mentioned, you don't even need an account for editing in general.

Hi @Eugene233 ,

I'm Mahuton POSSOUPE (@Mh-3110) and I'm interested in working on this task as a GSoC project.

I hold a master's degree in E-commerce and am a Computer Science Bachelor student at the University of Franche-Comté in France.
After some years of work experience as a freelancer, I decided this year to continue my education and complete a computer science bachelor's degree.

I have experience programming in Python, Ruby and JavaScript.
I also have some experience of web app development using Ruby on Rails, Django and JavaScript libraries.

During my time as a freelancer, I worked on many web app projects for e-commerce companies. Here are 2 of my latest works:

In 2014, I discovered Wikipedia and its sister projects and started contributing to Wikipedia in French. I started a Wikimedians community in my country, Bénin. The community is now recognized as a user group: https://meta.wikimedia.org/wiki/Wikim%C3%A9diens_du_B%C3%A9nin_User_Group.

I also started a Wikipedia in Fon, the most spoken national language in Bénin, and I am now organizing a volunteer contributor community there. Here is the project in the Incubator: https://incubator.wikimedia.org/wiki/Wp/fon/W%C3%A9m%C3%A1_Nuk%C9%94nt%C9%94n

Here is my userpage on Meta: https://meta.wikimedia.org/wiki/User:Mah3110

I had the opportunity to attend the MediaWiki Hackathon in 2018 and 2019.
In 2018, I worked on a project to develop a web-based keyboard for contributors to write Fon in the Wikimedia Incubator. Here is the project: https://www.mediawiki.org/wiki/Help:Extension:UniversalLanguageSelector/Input_methods/fon-tilde/fr

In 2019, I developed an interactive documentation tool for a GLAM project: https://phabricator.wikimedia.org/T223608. This tool was developed in Node.js and is hosted on Toolforge. It is accessible here: https://tools.wmflabs.org/mortar/

Since 2018, I have been contributing (small tasks) to MediaWiki, both core and extensions, and mainly to the Pywikibot library. Here are some of my MediaWiki contributions: https://gerrit.wikimedia.org/r/#/q/owner:Mh-3110+status:merged,n,z

Here is my GitHub profile: https://github.com/Mahuton

Thanks

Ayushshri121 triaged this task as High priority. · Mar 13 2020, 5:04 PM
Ayushshri121 added a subscriber: Ayushshri121.
This comment was removed by Ayushshri121.
Ayushshri121 added a comment. · Edited · Mar 13 2020, 5:07 PM

Hello Sir @NavinoEvans, I am Ayush Shrivastav, a sophomore undergraduate from PSIT Kanpur, India. I would like to contribute to Wikimedia for GSoC by developing a tool that corrects false depicts claims on Wikimedia Commons, for which I have already worked out and devised an idea.

  • I would like to connect with you on an instant messenger or by email to discuss the effectiveness of my idea. Please let me know your IM handle or email address.
  • My skill set includes HTML/CSS, C/C++, git, Linux, bash, JavaScript and its frameworks, Node.js, databases (MongoDB, SQL, MySQL) and Python (basics).
  • I also have experience from my own projects and from contributing to the open source community.
  • Though this project doesn't have an existing code repository, I have reviewed repositories related to Wikimedia Commons and MediaWiki. I am comfortable with the code, can make significant progress on this project, and am excited to start soon.
  • I would love to contribute to this project, and your suggestions and advice are welcome. Looking forward to an awesome learning experience and eager to start an exciting coding journey with you. @NavinoEvans
zhuyifei1999 raised the priority of this task from High to Needs Triage. · Mar 13 2020, 7:40 PM

@Mh-3110 @Ayushshri121 Welcome, happy to see your interest in GSoC with Wikimedia :) I want to encourage you to go through the application process steps to ensure there isn't anything that you are missing: https://www.mediawiki.org/wiki/Google_Summer_of_Code/Participants#Application_process_steps

@Mh-3110 Thanks very much for your interest in working on this project. Please take a look at the previous comments made by @NavinoEvans and others, which should go a long way towards showing you how to move forward.

Hello @Ayushshri121, thanks so much for your interest, really looking forward to working with you!

For a good starting point, have a look through my comment here: https://phabricator.wikimedia.org/T245758#5965041

That should give an overview of the main areas to look into for the planning stages.

Don't hesitate to ping me or @Eugene233 if you have any questions at all.

@NavinoEvans @Eugene233
Which images will be set up for verification:

  1. Only images to which structured data was added using ISA tool or
  2. All images from respective category (and having some structured data) from Commons

I think verification will be done for images that have depicts statements and other structured data, whether they come from the ISA tool or from a general category. @Sanyam.wikime
Please correct me if I am wrong, @Eugene233 @NavinoEvans.

We are working here with images from Commons categories.

Gabrielchl added a comment. · Edited · Mar 16 2020, 9:33 PM

@Eugene233 would it be fine if I proposed another idea, for example looking for images with structured data from recent edits?
Thinking about what new users would prefer, I don't think most of them would use the tool with a specific category already in mind.

Eugene233 added a comment. · Edited · Mar 16 2020, 9:42 PM

Thanks for sharing your thoughts, @Gabrielchl. I suggest you present this idea in your proposal, stating precisely why you think it would be better, along with its advantages and disadvantages. You could say: using recent edits is better because it gives us access to recently edited images... which may also have the disadvantage that older edits may not appear because... unless we specify the number of previous edits we would like to work on.

Hope this helps
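To compare the two ways of choosing images discussed above (by category versus by recent edits), the corresponding MediaWiki Action API queries could be sketched as below. This is an illustration only: the function names are hypothetical, the sketch builds query parameters without sending them, and it assumes that structured-data changes show up as edits in the File namespace.

```python
COMMONS_API = "https://commons.wikimedia.org/w/api.php"
FILE_NAMESPACE = 6  # namespace number of "File:" pages


def category_query(category, limit=50):
    """Parameters for listing the files in one Commons category."""
    if not category.startswith("Category:"):
        category = f"Category:{category}"
    return {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmtype": "file",  # only files, no subcategories or pages
        "cmlimit": limit,
        "format": "json",
    }


def recent_edits_query(limit=50):
    """Parameters for listing recently edited file pages."""
    return {
        "action": "query",
        "list": "recentchanges",
        "rcnamespace": FILE_NAMESPACE,
        "rctype": "edit",
        "rclimit": limit,  # caps how far back the result reaches
        "format": "json",
    }
```

The `rclimit` value is where the trade-off mentioned above appears: a small limit surfaces only the newest edits, so older images never reach the volunteer unless the limit is raised or the results are paged.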

Hi everyone,
I am Vishwas Singh, an undergraduate in Electronics and Communication Engineering from the Indian Institute of Information Technology, Allahabad, India. I would like to contribute to this project, as its aim excites me.

  • My skill set comprises C/C++, HTML/CSS, JavaScript including the DOM, working with REST and APIs, Node.js, and Python fundamentals.
  • I have contributed to open source before, including web scraping using Python.
  • I would love to connect with the mentors on Slack or IRC.

Here is my GitHub profile: https://github.com/infern018

Zanna234 added a subscriber: Zanna234. · Edited · Mar 18 2020, 9:31 PM

Hello everyone
I am Abubakar, a computer science student from Bayero University, Nigeria.
Wikimedia is an organization that really makes an impact. I am interested in helping to develop a tool to correct false depicts claims manually on Wikimedia Commons, and I want to contribute to it as best I can.
I am skilled with Python, JavaScript, HTML, CSS and SQL. Looking forward to working on this idea as a GSoC summer project.
https://github.com/Abubakarauta

Thank you

@NavinoEvans Hi, I submitted patches at
https://gerrit.wikimedia.org/r/579607/
https://gerrit.wikimedia.org/r/579609/
Can you please review them? Thanks.

Hi @Sanyam.wikime, many thanks for the contributions, and apologies for the delay. Obviously things are a bit up in the air while adjusting to the current crisis, but I should have plenty of time from now on, as I'm all set up to work from home.

I've merged the smaller change now, will check over the other one shortly. Cheers again :)

@NavinoEvans No worries, thanks for reviewing. :)

Hello Everyone,
My name is Bakare Samuel Ayomiku, an undergraduate at Lagos State University currently studying Chemical Engineering. I am very excited about the project, as we will be helping users have a better experience and easily double-check the data that was entered. Building a tool that helps work get done faster is something I would love to be a part of.

My skill set includes:

HTML, CSS, JavaScript, React and Python fundamentals.

I have also contributed to building some web apps.

GitHub: https://github.com/saamlegend

Hello everyone, my name is Mufaddal Hamid. I am currently pursuing a Bachelor's in Computer Applications. I would like to contribute to this project.
My skill set includes:
PHP, MySQL, Python, JavaScript
https://github.com/MufaddalHamid

AdityaJ removed a subscriber: AdityaJ. · May 5 2020, 5:58 PM