GSoC Proposal: Design and Develop a tool to correct false depicts claims manually on Wikimedia Commons
Closed, Declined (Public)

Description

Profile Information

Name: Aditya Jain
IRC nickname on Freenode: AdityaJ
Gmail ID: adi2007jain@gmail.com
Github: https://github.com/Jain-Aditya
Typical working hours: Between 11 am and 6 pm, UTC+5:30 (I can give more time if required)

Synopsis

Short summary describing your project and how it will benefit Wikimedia projects

A significant part of the structured data on Commons has been entered through tools such as ISA, which were developed to collect structured data for images on Commons. Some participants using these tools may enter ‘wrong’ or ‘unclear’ information, which needs to be verified. This project aims to create a tool that enables volunteers to correct false depicts claims manually on Wikimedia Commons. The tool will be a simple web application in which users search for the Wikimedia Commons category they want to work on. The images in that category are then displayed, and the user is asked questions about what they see in each image, with emphasis on the depicts statements that have been added to it. All the user has to do is click ‘YES’ or ‘NO’: ‘YES’ retains the depicts claim and ‘NO’ removes it.
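The YES/NO flow described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `DepictsClaim` class, the `question_for` and `claims_to_remove` helpers, and the sample claim IDs are hypothetical and do not come from any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DepictsClaim:
    claim_id: str  # statement ID; the sample values below are made up
    label: str     # human-readable label of the depicted item


def question_for(claim):
    """Build the question shown to the reviewer for one depicts claim."""
    return f"Does this image depict {claim.label}?"


def claims_to_remove(claims, answers):
    """Return the IDs of claims answered 'NO'.

    Claims answered 'YES', and claims with no answer at all, are retained.
    """
    return [c.claim_id for c in claims if answers.get(c.claim_id) == "NO"]


# Hypothetical review of one image carrying two depicts claims.
claims = [
    DepictsClaim("M1$aaa", "cat"),
    DepictsClaim("M1$bbb", "bicycle"),
]
answers = {"M1$aaa": "YES", "M1$bbb": "NO"}

for claim in claims:
    print(question_for(claim), "->", answers[claim.claim_id])
print("Claims to remove:", claims_to_remove(claims, answers))
```

Treating an unanswered claim as retained is a design assumption here: the tool would then never delete data that the reviewer did not explicitly reject.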

Possible Mentor(s)

@NavinoEvans, @Eugene233

Have you contacted your mentors already?

Yes

Deliverables

After the internship is complete, the new tool will have the following features:

  • A home page with an autocomplete search for categories, since there are thousands of categories on Commons.
  • After clicking through to a category, the user will land on a page displaying all the images in that category.
  • The user will be able to select the image whose information they want to verify.
  • Yes and No buttons will be provided to verify each piece of information (description, caption, depicts, etc.) for the selected image. A final submit button on this page will save the updated information for the image.
  • The user will be able to authenticate through OAuth.

If time permits, the following features will be added to the tool:

  • Restrict who is able to review this information, and track which user has reviewed each image.
  • Provide a search option on the category page to find a particular image.
  • Paginate the display of images under a category.
  • There are a lot of images on Commons with depicts data, so we could explore filtering for the images whose claims are more likely to be wrong (e.g. images uploaded by new contributors).
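As a rough illustration of the last two optional features, the sketch below orders images so that those uploaded by newer contributors come first and then slices the result into pages. The `uploader_edit_count` field, the threshold of 50 edits, and both function names are assumptions made for this example; how "new contributor" is actually measured would need to be decided with the mentors.

```python
def prioritise(images, new_uploader_threshold=50):
    """Put images from low-edit-count uploaders first.

    sorted() is stable, so images within each group keep their input order.
    """
    return sorted(
        images,
        key=lambda img: img["uploader_edit_count"] >= new_uploader_threshold,
    )


def paginate(items, page, per_page=20):
    """Return the items on a 1-indexed page."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    return items[start:start + per_page]


# Hypothetical category listing.
images = [
    {"title": "File:A.jpg", "uploader_edit_count": 900},
    {"title": "File:B.jpg", "uploader_edit_count": 3},
    {"title": "File:C.jpg", "uploader_edit_count": 12},
]

ordered = prioritise(images)
first_page = paginate(ordered, page=1, per_page=2)
print([img["title"] for img in first_page])
```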

UI Mockups

Homepage to select category would look like

Screen Shot 2020-03-31 at 5.38.06 PM.png (471×707 px, 22 KB)

Images of a category will be displayed like

Screenshot 2020-03-26 at 12.40.31 PM.png (1×1 px, 99 KB)

Review page would look like

Screenshot 2020-03-27 at 2.41.31 PM.png (1×1 px, 137 KB)

Implementation Details

I will use Python with Django for the backend and HTML/JavaScript for the frontend. The following MediaWiki APIs will be used in this tool (the available API signatures and responses will be researched during the community bonding period):

  • A read API that provides the list of all categories on Commons (for the home page).
  • A read API that provides the list of all images, along with their metadata, in a category (for the category page).
  • An update API that can be used to update the metadata for an image (used after the user completes the review of an image).
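One possible mapping of these three needs onto the MediaWiki Action API is sketched below. The proposal leaves the final choice to the community bonding period, so the modules used here (`list=allcategories`, `list=categorymembers`, `wbgetclaims`, `wbremoveclaims`) are an assumption about what would fit, and all function names are hypothetical. The code only builds request parameters and parses a sample response; it makes no network calls.

```python
COMMONS_API = "https://commons.wikimedia.org/w/api.php"
DEPICTS_PROPERTY = "P180"  # the "depicts" property


def category_search_params(prefix, limit=10):
    """Parameters for category autocomplete on the home page."""
    return {
        "action": "query",
        "list": "allcategories",
        "acprefix": prefix,
        "aclimit": limit,
        "format": "json",
    }


def category_files_params(category, limit=50):
    """Parameters for listing the files in one category."""
    return {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmtype": "file",
        "cmlimit": limit,
        "format": "json",
    }


def file_entity_ids(response):
    """Map a categorymembers response to structured-data entity IDs.

    On Commons, the entity ID of a file is 'M' followed by its page ID.
    """
    members = response.get("query", {}).get("categorymembers", [])
    return [f"M{member['pageid']}" for member in members]


def depicts_claims_params(entity_id):
    """Parameters for reading the depicts statements of one file."""
    return {
        "action": "wbgetclaims",
        "entity": entity_id,
        "property": DEPICTS_PROPERTY,
        "format": "json",
    }


def remove_claims_params(claim_ids, csrf_token):
    """Parameters for removing rejected statements.

    This is a write request: it must be sent as a POST with a CSRF token.
    """
    return {
        "action": "wbremoveclaims",
        "claim": "|".join(claim_ids),
        "token": csrf_token,
        "format": "json",
    }


# Sample response in the shape returned by list=categorymembers (values made up).
sample = {"query": {"categorymembers": [
    {"pageid": 123, "ns": 6, "title": "File:Example.jpg"},
]}}
print(file_entity_ids(sample))
```

Each dictionary could be passed to an HTTP client, for example `requests.get(COMMONS_API, params=category_search_params("Cats"))` for the read calls. Long listings would additionally need to follow the API's continuation values, which this sketch leaves out.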

Timelines

I have kept buffer time for every development phase. If things go smoothly, I will use the remaining time to implement the additional features proposed under the 'Deliverables' section.

Week 1 & Week 2

  • Community bonding period
  • Discuss design and refine mockups
  • Explore the available MediaWiki APIs that can be used for the approach described under the ‘Implementation Details’ section.

Week 3 & Week 4

  • Set up the repository on Gerrit
  • Implement the home page to display the categories
  • Host the tool on Toolforge

Week 4 & Week 5

  • Implement category page
  • Start working on frontend of review page

Week 6 & Week 7

  • Finish off with review page
  • Take feedback from users by announcing the tool on mailing list and communicating with the target users

Week 8 & Week 9

  • Integration with OAuth

Week 10 & Week 11

  • I will use this time to file tasks for the features users suggest in their feedback and to implement those that are feasible within the given time.

Week 12 & Week 13

  • Writing test cases (though I will also write tests as and when each feature is completed).
  • Documentation and bug fixes

Participation

  • The project code will reside on Gerrit.
  • I will use Phabricator for sharing status and discussing ideas. I will be active on Zulip and Gmail during my working hours.
  • I will write blog posts to share my experience and progress on this project. I plan to publish one after every evaluation.

About Me

Tell us about a few:

Your education

I am a final year Computer Science undergraduate student at Bundelkhand Institute of Engineering and Technology, Jhansi (India).

How did you hear about this program?

I first heard about this program in a campus session regarding Open Source development. Also, I was a GSoC intern with Wikimedia last year.

Will you have any other time commitments, such as school work, another job, planned vacation, etc, during the duration of the program?

Since this is my last year at university, I will be completely free from the end of my exams in mid-May until I join my company, which is scheduled for July. After that, I will still be able to give 3-4 hours on weekdays and 7-8 hours on weekends.

We advise all candidates eligible for Google Summer of Code and Outreachy to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

I am only applying for Google Summer of Code, with the Wikimedia organisation.

What does making this project happen mean to you?

I am passionate about problem solving and software development. I am also passionate about the Wikimedia movement. Wikimedia is what it is today because of its strong community and the quality of its volunteers' contributions. Building this tool will help Wikimedia sources become more reliable, and that motivates me even more to develop it.

Past Experience

I have been coding for a couple of years now. I have a deep understanding of data structures, algorithms and software development. I have mostly worked with Python, C++, Django, Flask, MySQL, Postgres and JavaScript.
The following are some of my personal projects:

  • Discussion-Forum: A Django application that uses a MySQL database. It allows the admin to create categories in which users can start discussions, and other users can post comments on those discussions. The link to the repo is: https://github.com/Jain-Aditya/DiscussionForum
  • Book-Review app: A Flask application that allows logged-in users to search for books by author name or ISBN. It uses a Postgres database hosted on Heroku and fetches book ratings and reviews from the API provided by Goodreads. The link to the repo is: https://github.com/Jain-Aditya/Book-Review

I was a Google Summer of Code 2019 intern with Wikimedia. I worked on the Hashtags search tool under the mentorship of Sam Walton. The project was based on Python, Django and JavaScript. The work report for the project can be found here. I was also a Google Code-in mentor for Wikimedia in the last edition, where I mentored tasks for the Wikilink tool and the Hashtags tool. I have also been continuously mentoring newcomers who contribute to the Hashtags project.

Valuable contributions to Wikimedia

GSoC 2019 Contributions

T186706 T227320 T227319 T227322 T228501 T228029 T208029

Tasks mentored in GCI 2019

T239332 T239598 T234678 T228040 T228987

Other Contributions

  • Some important issues reported on GitHub are listed here.

Event Timeline

Hi @Eugene233, @NavinoEvans, could you please provide your valuable feedback on my proposal?

I need some clarification on the following points:

  1. Does a user need to be authorised to perform the above verification?
  2. Do we need to keep track of who has verified the image?
  3. Once verified, is re-verification of the same image allowed?

  1. Does a user need to be authorised to perform the above verification?

Technically, no. Anyone, without being logged in, can add or remove depict statements.

  3. Once verified, is re-verification of the same image allowed?

What do you mean by "verification"?

"Verification" means pressing the Yes or No button. I believe we can do this operation multiple times on the same image.

@NavinoEvans @Eugene233 Just a gentle reminder to review my proposal. :)

@AdityaJ IMO your proposal looks good and is clearly written. Have you read through the comments under the proposed project, just to make sure you didn’t miss anything?

Thanks! Yeah, I looked at the comments before starting on the proposal. Still, I will keep an eye out in case something relevant comes up.

@AdityaJ We are sorry to say that we could not allocate a slot for you this time. Please do not consider the rejection to be an assessment of your proposal. We received over 100 quality applications, and we could only accept 14 students. We were not able to give a slot to every applicant who deserved one, and these were some very tough decisions to make. Please know that you are still a valued member of our community and we by no means want to exclude you. Many students who we did not accept in 2019 have become Wikimedia maintainers, contractors and even GSoC students and mentors this year!

If you would like a debrief on why your proposal was not accepted, please let me know as a reply to this comment or on the ‘Feedback on Proposals’ topic of the Zulip stream #gsoc20-outreachy20. I will respond to you within a week or so. :)

Your ideas and contributions to our projects are still welcome! As a next step, you could consider finishing up any pending pull requests or informing us that someone has to take them over. Here is the recommended place for you to get started as a newcomer: https://www.mediawiki.org/wiki/New_Developers.

If you would still be eligible for GSoC next year, we look forward to your participation!