Name: Ayush Shrivastav
IRC Nick: ayush_12 (freenode)
Typical Working Hours: 11 am - 6:30 pm IST (UTC+5:30) on weekdays. I will also work on weekends and overtime as required.
Wikimedia Commons is a repository to which the community contributes by uploading freely usable media files. It recently introduced a tool for adding structured data to those files. However, the structured data that gets added is sometimes incorrect and needs to be verified. This project therefore focuses on building a tool, named “WikiCommons Image Verification Tool”, to help ensure that the structured data provided on the Commons website is correct.
Currently, the idea is to verify the categories and image descriptions (depicts statements).
Mentors: @Eugene233 & @NavinoEvans
- Users sign in using OAuth.
- Users can search WikiCommons categories and fetch the images, with their respective depicts statements, for a chosen category.
- The results can be sorted in two ways: Recent and Random.
- Users review each image description with “YES” or “NO” to mark it as true or false.
- A rating is attached to every review, depending on the user’s account level (for example, like Google Maps Local Guide levels).
- Moderator sign-up.
- Review by a moderator when an anomaly is encountered: Retain or Reject.
- A page to see a user’s past contributions.
- When an anomaly is resolved, users whose votes match the moderator’s action (Retain or Reject) are awarded points, which in turn increase their account level.
- Robust design of the tool.
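The search-and-fetch feature above could be sketched as follows. This is a minimal illustration, assuming the tool talks to the public Commons Action API (`list=categorymembers` for the files in a category and `wbgetentities` with `M<pageid>` ids for their structured data). The function names and the client-side shuffle for the "Random" order are my own assumptions, not a settled design.

```python
import random
from typing import Dict, List, Optional

COMMONS_API = "https://commons.wikimedia.org/w/api.php"  # public Commons API endpoint


def category_files_params(category: str, limit: int = 50) -> Dict[str, str]:
    """Build query parameters that list the newest files in a Commons category."""
    return {
        "action": "query",
        "format": "json",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmtype": "file",        # only files, no subcategories or pages
        "cmsort": "timestamp",   # sort by the time the file joined the category
        "cmdir": "desc",         # newest first, i.e. the "Recent" order
        "cmlimit": str(limit),
    }


def media_info_params(page_ids: List[int]) -> Dict[str, str]:
    """Build query parameters that fetch structured data (including depicts) for files."""
    return {
        "action": "wbgetentities",
        "format": "json",
        # A file's MediaInfo entity id is "M" followed by its page id.
        "ids": "|".join(f"M{pid}" for pid in page_ids),
    }


def order_results(files: List[dict], mode: str, seed: Optional[int] = None) -> List[dict]:
    """Return files in 'recent' (API) order or shuffled for 'random' (assumed client-side)."""
    if mode == "recent":
        return list(files)
    if mode == "random":
        shuffled = list(files)
        random.Random(seed).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown sort mode: {mode}")
```

The parameter dictionaries would be sent to `COMMONS_API` with any HTTP client. Keeping them as plain data makes them easy to unit test without network access.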
Will extend, if time permits
- Extend to additional image information.
- Include hashtags for searching, along with categories.
- Integrate with the ISA tool to keep competitions fair during campaigns.
- Use the Vision API to get machine suggestions for the image data.
The schedule below lists the committed deadlines, but it is very likely that we will achieve more.
4 May- 31 May
Community Bonding Period: Create the repository on Gerrit and the tasks on Phabricator, create the tool on Toolforge, learn about the various APIs, discuss the implementation with the mentors and get the development environment ready.
1 June- 14 June ( 2 weeks )
Set up user sign-in using OAuth, fetch user details and create a past-contributions history page.
15 June- 4 July ( 3 weeks )
Build a user dashboard that retrieves WikiCommons images and descriptions from a category and lets users vote “YES” / “NO”. Add the ability to automatically retain or remove claims based on the average vote % and update the DB accordingly.
5 July- 18 July ( 2 weeks )
Build a moderator dashboard where the moderator gives the final review in case of an anomaly and retains or removes the claim. Also award points to users after a successful review by the moderator.
19 July- 25 July ( 1 week )
Bug fixes if any are encountered, otherwise UI improvements.
26 July- 1 Aug ( 1 week )
Implement an administration dashboard to select and approve moderator sign-ups.
2 Aug- 8 Aug ( 1 week )
Overall Bug Fixes and other improvements.
9 Aug- 22 Aug ( 2 weeks )
Writing documentation and test cases.
23 Aug- 1 Sept. ( 2 weeks )
Try to implement the extended features.
Get the final mentor review and make the final submission.
Note: I will have my semester exams in mid-June, during which I will be able to work less, but I assure you I will compensate for that time beforehand so that the project’s progress is not hampered.
Apart from GSoC, I do not have any other commitments or internships scheduled during this period.
- I will maintain my repo and upload updated code every weekday.
- I will be online on IRC and Zulip during my working hours to collaborate with the mentors.
- I will use Phabricator for managing bugs and subtasks.
- I will be reachable by Gmail during non-working hours when needed.
- I will write weekly reports.
Idea Specification and Implementation:
STEP 1- We build a web interface where users can search the Wikimedia Commons categories.
STEP 2- We use a rating system for the users, in which a rating is assigned to the respective image. This rating depends on the user’s level / experience, similar to the ratings of Google Local Guides.
STEP 3- All the user has to do is click "Yes" or "No" when asked about the things they see in the images (all structured data: depicts statements, Wikidata items), and a rating is attached to the image. For instance, if the uploader has set “trees are swaying” as the description while uploading, we ask the tool users whether the image belongs to the correct category and has the correct description and structured data attached, and their answer is recorded as YES/NO.
STEP 4- If a user votes YES, the YES stack is updated with the user’s rating; if a user votes NO, the NO stack is updated with the rating. An average is calculated for both kinds of votes and a timestamp is attached to it.
STEP 5- Once the vote count reaches the threshold value, we update the database automatically according to the average vote %.
STEP 6- If the average YES and NO vote percentages are equal or within ± 2 % of each other, and the number of votes is ≥ (say) 50k, we flag those images and a cron job triggers an alert to the moderator.
STEP 7- The moderator can then log on and RETAIN or REMOVE only those claims that were flagged for moderation, and the database is updated as per the moderator’s decision.
This process will be reliable and fast, since the moderator can review many images at a time, guided by the users’ average votes.
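Steps 4 to 6 can be sketched roughly as below. This is only an illustration under stated assumptions: each vote is weighted by the voter's rating, "± 2 %" is read as the YES and NO percentages differing by at most 2 percentage points, and the names `ClaimTally`, `decide`, `VOTE_THRESHOLD` and `ANOMALY_MARGIN` are hypothetical. The timestamp from STEP 4 is left out for brevity.

```python
from dataclasses import dataclass

VOTE_THRESHOLD = 50_000  # the "(say) 50k" from STEP 6; a placeholder value
ANOMALY_MARGIN = 2.0     # percentage points; assumed reading of "± 2 %"


@dataclass
class ClaimTally:
    """Rating-weighted YES/NO totals for one claim on one image."""
    yes_weight: float = 0.0
    no_weight: float = 0.0
    votes: int = 0

    def add_vote(self, is_yes: bool, rating: float) -> None:
        """Add one vote, weighted by the voter's rating (STEP 4)."""
        if is_yes:
            self.yes_weight += rating
        else:
            self.no_weight += rating
        self.votes += 1

    def yes_percent(self) -> float:
        """Share of the total weight that voted YES, as a percentage."""
        total = self.yes_weight + self.no_weight
        # With no weight recorded yet, treat the claim as undecided (50/50).
        return 100.0 * self.yes_weight / total if total else 50.0


def decide(tally: ClaimTally,
           threshold: int = VOTE_THRESHOLD,
           margin: float = ANOMALY_MARGIN) -> str:
    """Return 'pending', 'flag', 'retain' or 'remove' for a claim (STEPS 5 and 6)."""
    if tally.votes < threshold:
        return "pending"               # not enough votes to act on yet
    yes = tally.yes_percent()
    no = 100.0 - yes
    if abs(yes - no) <= margin:
        return "flag"                  # anomaly: hand over to a moderator
    return "retain" if yes > no else "remove"
```

For example, with a threshold of 2 votes, one YES from a user rated 3 and one NO from a user rated 1 gives a YES share of 75 %, so `decide` returns `"retain"`.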
Q) What is the REQUIREMENT OF A MODERATOR?
Ans- If there is an anomaly, meaning the average YES and NO vote percentages are equal or within ± 2 % of each other and the number of votes is ≥ the threshold, we flag those images and the moderator reviews only these images. Hence we achieve more accurate results with less moderator effort.
Q) If a user has already rated an image, WILL THAT IMAGE REPEAT?
Ans- No, every image will be shown only once. I have also written a pseudo code snippet to ensure that a user is not shown the same image more than once (if they have already voted on that image).
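Once the moderator rules on a flagged image, the users whose votes agree with that ruling can be credited. The sketch below shows one way this could work. The reward of 10 points per matching vote and the rule of one level per 100 points are placeholder assumptions, and `award_points` and `level_for` are hypothetical names.

```python
from typing import Dict

POINTS_PER_MATCH = 10   # hypothetical reward for agreeing with the moderator
POINTS_PER_LEVEL = 100  # hypothetical number of points needed per account level


def award_points(votes: Dict[str, bool],
                 moderator_retains: bool,
                 points: Dict[str, int],
                 reward: int = POINTS_PER_MATCH) -> Dict[str, int]:
    """Return updated point totals after a moderator decision.

    `votes` maps a username to True (voted YES) or False (voted NO).
    A YES vote matches a RETAIN decision; a NO vote matches a REMOVE decision.
    """
    updated = dict(points)  # do not mutate the caller's mapping
    for user, voted_yes in votes.items():
        if voted_yes == moderator_retains:
            updated[user] = updated.get(user, 0) + reward
    return updated


def level_for(total_points: int) -> int:
    """Map a point total to an account level, starting at level 1."""
    return total_points // POINTS_PER_LEVEL + 1
```

With these placeholder values, a user at 95 points who voted YES on a claim the moderator retains moves to 105 points, and so from level 1 to level 2.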
Pseudo code is attached below-
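A minimal sketch of this check in Python, assuming votes are tracked as a per-user set of image titles. The names `unseen_images` and `record_vote` and the in-memory `voted` mapping are illustrative only. In the real tool the same check would more likely be a database query against the votes table.

```python
from typing import Dict, Iterable, List, Set


def record_vote(user: str, image: str, voted: Dict[str, Set[str]]) -> None:
    """Remember that `user` has already voted on `image`."""
    voted.setdefault(user, set()).add(image)


def unseen_images(candidates: Iterable[str],
                  user: str,
                  voted: Dict[str, Set[str]]) -> List[str]:
    """Return the candidate images the user has not voted on, in their original order."""
    seen = voted.get(user, set())
    return [image for image in candidates if image not in seen]
```

After `record_vote("u", "File:A.jpg", voted)`, calling `unseen_images(["File:A.jpg", "File:B.jpg"], "u", voted)` returns only `["File:B.jpg"]`, while any other user still gets both images.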
Workflow of the Tool:
Use Case Diagrams:
1- User Screen
2- Moderator’s Screen (if an anomaly arises)
The working graphical prototype for the user dashboard can also be found by clicking here.
I am currently in the second year of my Bachelor of Technology degree at Pranveer Singh Institute of Technology, Kanpur, India, majoring in Computer Science Engineering.
I have been tinkering with code since my high school days. I love working on web apps and tools, and I have a keen interest in backend development. I am a patient learner and like to work in collaboration.
This is my first participation in GSoC. During the summer, GSoC and this Project will be my first priority since I won't have any other commitments during this period.
I have been studying Wikimedia for some time now and have tried to contribute to some open issues. I have been thrilled to see how Wikimedia and its products work at such a scale, and I think contributing to Wikimedia through this project would have a positive impact on society. I am also excited and look forward to working with some amazing people from whom I can learn a lot.
- I have experience with Node.js and use MySQL for databases.
- I am also comfortable with Python and have a basic knowledge of Flask (which I shall improve during this internship).
- I have built two major and two minor personal projects related to web development and am excited to learn more.
- I am also an Open Source evangelist.
- I have also organised a college-level workshop to introduce first-year students to Git and version control systems.
MicroTasks and Current Progress:
- With Wikimedia, I have been working on issues T232038 and T105637 and have also made a PR for this work.
- I also submitted a patch on Gerrit, https://gerrit.wikimedia.org/r/#/c/labs/tools/Isa/+/588656/, updating the README instructions for setting up the ISA Tool. I am currently trying to solve another ISA issue: removing hidden Commons categories from showing in the ISA tool.
- As a part of the project, I have also started implementing OAuth for login, which is in progress.
- Repo link: GitHub. The same repository will be updated soon with new commits that I currently have locally.
Why me?
- I have studied the required APIs and other prerequisites thoroughly and realised that this project lies within my doable range (neither too easy nor too tough). I have also realised that I will learn a lot during this project.
- Most importantly, I would be happy to make a tool which will affect the community in a positive way and will be used by millions of users.
- I would keep contributing to and working on this tool even after this GSoC program, as a responsible maintainer and contributor, taking responsibility for further bugs and trying to implement new features.
Looking forward to an awesome learning experience.