**## Profile Information**
**Name**: Angel Sharma
**Github**: [[https://github.com/fillingtothemomo | fillingtothemomo]]
**Gmail**: rockingpenny4@gmail.com
**Phabricator**: [[https://phabricator.wikimedia.org/p/Rockingpenny4| Rockingpenny4]]
**Gerrit**: rockingpenny4
**Location**: Mathura, India
**Time Zone**: IST (UTC+5:30)
**Working hours**: 3:00 PM to 3:00 AM (IST)
**## Synopsis**
PageTriage is a MediaWiki extension that allows patrollers on the English Wikipedia to track, categorize and deal with problematic new pages. One of its features is the VueJS-based New pages feed, which allows patrollers to filter for the pages they might want to patrol based on certain criteria. However, these filters are limited, and there has been interest in the community in introducing newer filters and, in general, improving the ability to search for specific content on the New pages feed.
As part of this project, the filtering and searching capabilities of the New pages feed should be enhanced. In particular, the project will add AI-based topic prediction (leveraging the ORES API), the ability to search for a specific keyword in an article, filtering by how many pageviews an article gets, and searching by how similar a particular page is to other deleted pages.
**Possible Mentor(s)**
@Soda, , @TheresNoTime
**Have you contacted your mentors already?**
Yes
**## Deliverables**
**T218132 Add ORES topic prediction to the NewPagesFeed and allow filtering by the same:**
ORES now supports topic prediction (articletopic). Topic prediction is different than class prediction and potential issue prediction. Topic prediction means predicting if an article is about chemistry, politics and government, sports, Central Asia, etc.
New page patrollers would find this useful if they want to filter the feed by one or two topics that they are interested in or have specialized knowledge in. Currently other tools are used to do this, such as https://en.wikipedia.org/wiki/User:SDZeroBot/NPP_sorting , but they are not integrated into PageTriage.
Integrate articletopic into PageTriage, in both the Special:NewPagesFeed filters menu, and the Page Curation toolbar "Page info" flyout using PHP.
Article topic: (some illustrations)
{F43168031}
Links I referred to: https://phabricator.wikimedia.org/T245906 , https://www.mediawiki.org/wiki/ORES#Topic_routing , https://www.mediawiki.org/wiki/ORES/Articletopic , https://phabricator.wikimedia.org/T240517
On the frontend, this could be integrated as a multi-select box: because the topic categories are pre-defined, letting users pick several topics at once will be more helpful.
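To make the topic filter concrete, here is a minimal sketch of how predicted topics could be read from a score response and matched against the topics a patroller selected. The helper names `topicsForRevision` and `matchesTopicFilter`, the 0.5 threshold, the response shape and the sample topic labels are all assumptions for illustration; the real integration would go through PageTriage's PHP backend and may look different.

```javascript
// Assumption: the response mirrors an ORES-style scores payload for the
// articletopic model: response[wiki].scores[revId].articletopic.score
// Hypothetical helper: returns topics whose probability meets the threshold,
// most likely topic first.
function topicsForRevision(response, wiki, revId, threshold = 0.5) {
  const score = response?.[wiki]?.scores?.[String(revId)]?.articletopic?.score;
  if (!score || !score.probability) {
    return [];
  }
  return Object.entries(score.probability)
    .filter(([, probability]) => probability >= threshold)
    .sort((a, b) => b[1] - a[1])
    .map(([topic]) => topic);
}

// Hypothetical helper: a page passes when no topic is selected, or when it
// has at least one of the selected topics.
function matchesTopicFilter(pageTopics, selectedTopics) {
  if (selectedTopics.length === 0) {
    return true;
  }
  return selectedTopics.some((topic) => pageTopics.includes(topic));
}

// Illustrative data only, not a real API result.
const sampleResponse = {
  enwiki: {
    scores: {
      12345: {
        articletopic: {
          score: {
            prediction: ['STEM.Chemistry'],
            probability: {
              'STEM.Chemistry': 0.91,
              'Culture.Sports': 0.02,
              'Geography.Regions.Asia.Central Asia': 0.55
            }
          }
        }
      }
    }
  }
};

const topics = topicsForRevision(sampleResponse, 'enwiki', 12345);
console.log(topics);
console.log(matchesTopicFilter(topics, ['STEM.Chemistry']));
```

Treating an empty selection as "show everything" keeps the feed unchanged for patrollers who do not use the new filter.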
----
**T207761 Keyword Search for New Pages Feed:**
• Create a new field in the "That" section of the search filters for the New Pages Feed, which is the last option (i.e. bottom of list)
• The text should read "Has the following keyword(s)"
• If a user inputs one or multiple keywords into the field and clicks "Set Filter," the search results in the New Pages Feed should only display results that have matching keywords in the article text.
Implement the feature along the lines of the keyword search available at https://tools.wmflabs.org/nppbrowser/ .
**Keyword search example:**
{F43167731}
**Mock UI**
{F43169922}
Integrate into the search filters by adding the new field to !!FilterRadios.vue!!:
{F43167664}
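The matching behaviour described above could be sketched as follows. This is only an illustration: `parseKeywords` and `matchesKeywords` are hypothetical names, the match is a plain case-insensitive substring check, and whether a page must contain all keywords or any of them is an open question, so the sketch exposes it as a `requireAll` flag.

```javascript
// Hypothetical helper: split the raw filter input on whitespace or commas
// and normalise to lower case.
function parseKeywords(input) {
  return input
    .split(/[\s,]+/)
    .map((keyword) => keyword.trim().toLowerCase())
    .filter((keyword) => keyword.length > 0);
}

// Hypothetical helper: case-insensitive substring match against the article
// text. requireAll = true means every keyword must be present (assumption).
function matchesKeywords(articleText, keywords, requireAll = true) {
  if (keywords.length === 0) {
    return true;
  }
  const haystack = articleText.toLowerCase();
  const found = (keyword) => haystack.includes(keyword);
  return requireAll ? keywords.every(found) : keywords.some(found);
}

// Illustrative pages only.
const pages = [
  { title: 'A', text: 'A football club in the national league.' },
  { title: 'B', text: 'A chemistry professor.' }
];

const keywords = parseKeywords('Football, league');
const matchingTitles = pages
  .filter((page) => matchesKeywords(page.text, keywords))
  .map((page) => page.title);
console.log(matchingTitles);
```

In the real feed the matching would most likely happen on the server, since the feed does not load full article text into the browser.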
----
**T207238 Special:NewPageFeed - add option to filter by pageviews:**
Add functionality to sort the NewPageFeed by pageview count, so that Reviewers can prioritise high impact articles.
Links I used to get a better overview: https://phabricator.wikimedia.org/T225169 , https://phabricator.wikimedia.org/T230567 .
Proposed approach (as described in the T225169 investigation): display the pageview count in the article record, without allowing for sorting or filtering. The count could be a metric such as the average per day, the median per day, or the total in the last 30 days. Note that the results displayed will be from 24 hours earlier than the display time, and we will want to query from a maximum of 30 days ago (for the sake of general efficiency and manageability of this feature).
To give a general sense of the popularity of each article, we can store the (ceil) log-base10 of the pageviews as a page_tag. That way there are a limited number of distinct values in the tag, and the reviewer still has a general sense of the popularity of the article. A maintenance script would periodically fetch the pageview data and store it in the PageTriage table, with optimized SQL queries ensuring efficient data retrieval.
The exact approach can be discussed during the GSoC period and refined with the mentors' reviews.
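As a small worked example of the logarithmic tag, the sketch below sums up to 30 daily counts and reduces the total to the ceiling of its base-10 logarithm. The function names and the convention that 0 or 1 views map to bucket 0 are assumptions; the loop over powers of ten is used instead of `Math.log10` simply to avoid floating-point rounding at exact powers of ten.

```javascript
// Hypothetical helper: total views over (at most) the last 30 daily counts.
function totalViews(dailyViews) {
  return dailyViews.slice(-30).reduce((sum, views) => sum + views, 0);
}

// Hypothetical helper: ceil(log10(views)) computed with integer arithmetic.
// Assumption: 0 or 1 views (and invalid input) map to bucket 0.
function pageviewBucket(views) {
  if (!Number.isFinite(views) || views <= 1) {
    return 0;
  }
  let bucket = 0;
  let power = 1;
  // Find the smallest power of ten that is >= views.
  while (power < views) {
    power *= 10;
    bucket += 1;
  }
  return bucket;
}

// Illustrative daily counts only.
const total = totalViews([120, 80, 300]);
console.log(total, pageviewBucket(total));
```

With this scheme an article with 11 to 100 views gets tag value 2, one with 101 to 1000 views gets 3, and so on, so only a handful of distinct values ever need to be stored.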
----
**T327955 See and filter with percent similarity to top deleted revision:**
CSD G4 requires that the new article be substantially similar to the old article. However patrollers that aren't admins cannot see deleted revisions.
PageTriage already detects if an article has been "previously deleted". Explore the idea of expanding this detection to include:
- Detection of a previous AFD, by checking for the existence of an AFD page.
- If a previous AFD is detected and the page has been deleted before, an API should be added to PageTriage that pulls the top deleted revision, compares it to the current top revision, and provides a % wikicode match.
- This should either be run with a button, or run automatically.
- May or may not want to make this a pagetriage_page_tag (article metadata).
Approach:
- Create a dedicated API (it would only calculate this when needed).
- Add to the frontend: we could add a button to calculate this, auto-calculate it when visiting the article, or auto-calculate it for everything and add it as a red article tag in Special:NewPagesFeed (pagetriage_page_tag / metadata are involved in the latter).
- Add similar support for "previous AFD". There is an afd_status page tag, but it only tracks current deletion tagging. There is a recreated tag, but it tracks all kinds of previous deletion, not just AFD.
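The comparison method is still to be researched, so the following is only one possible sketch of a "% wikicode match": it splits both revisions into whitespace-separated tokens, finds the longest common subsequence, and reports `2 * LCS / (tokens in both revisions)` as a percentage. The function names and the choice of this metric are assumptions, not a settled design.

```javascript
// Hypothetical helper: split wikitext into whitespace-separated tokens.
function tokenize(wikitext) {
  return wikitext.split(/\s+/).filter((token) => token.length > 0);
}

// Length of the longest common subsequence of two token arrays,
// using a rolling row to keep memory proportional to one revision.
function lcsLength(a, b) {
  let previous = new Array(b.length + 1).fill(0);
  for (let i = 1; i <= a.length; i++) {
    const current = new Array(b.length + 1).fill(0);
    for (let j = 1; j <= b.length; j++) {
      current[j] = a[i - 1] === b[j - 1]
        ? previous[j - 1] + 1
        : Math.max(previous[j], current[j - 1]);
    }
    previous = current;
  }
  return previous[b.length];
}

// Hypothetical helper: similarity as a rounded percentage.
// Assumption: two empty revisions count as 100% similar.
function percentSimilarity(deletedText, currentText) {
  const a = tokenize(deletedText);
  const b = tokenize(currentText);
  if (a.length === 0 && b.length === 0) {
    return 100;
  }
  return Math.round((200 * lcsLength(a, b)) / (a.length + b.length));
}

// Illustrative revisions only.
const similarity = percentSimilarity(
  'Foo is a band from Bar',
  'Foo is a rock band from Bar'
);
console.log(similarity + '%');
```

This comparison is quadratic in the number of tokens, which is one reason to compute it on demand through a dedicated API rather than for every page in the feed.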
----
> **## Timeline**
>
**Pre-GSOC**
Work on open issues on Wikimedia Phabricator and improve my skills and understanding of the MediaWiki codebase, while continuing to explore the project and gather more information about the features to be implemented during the GSoC project. I have already contributed to various extensions and projects such as InlineComments, PageTriage, MobileFrontend, AdminLinks and WikiEduDashboard, and learnt a lot from each PR made.
**Community Period**
| May 02, 2024 - May 27, 2024
- Get acquainted with mentors and the Wikimedia community.
- Familiarize myself with the existing codebase and architecture of the PageTriage extension, and discuss potential ideas and approaches for solving the identified issues.
- Dive deeper into understanding the ORES service and its integration possibilities with PHP and Vue.js.
- Engage in discussions with mentors and community members to refine project goals.
**Coding Period**
| May 27, 2024 - June 10, 2024
- Look into the initial features to be implemented and start work on integrating ORES article topics into the PageTriage filters.
- Research the ORES documentation and understand its topic prediction API, which will be exposed to the Vue.js frontend through PHP.
- Write bi-weekly report.
| June 10, 2024 - June 24, 2024
- Implement the backend integration of ORES for topic prediction.
- Begin frontend development for displaying topic filters in the New Pages Feed and Page Curation toolbar.
- Write bi-weekly report.
| June 24, 2024 - July 08, 2024
- Finalize frontend implementation and ensure proper interaction with the backend ORES service.
- Conduct initial testing and resolve any issues encountered whilst simultaneously updating the documentation.
- Start researching how to implement keyword search similar to NPP Browser, and go through its documentation.
- Prepare for mid-evaluation and resolve bugs, if any.
- Write bi-weekly report.
**Mid-Evaluation**
| July 08, 2024 - July 22, 2024
- Work on feedback received from the evaluation, research approaches for implementing a pageview count, finalise an approach with the help of mentors and start working on it.
- Start integration of keyword search on the backend.
- Write bi-weekly report
| July 22, 2024 - August 12, 2024
- Work on frontend UI implementation
- Integrate backend and frontend of keyword search feature and resolve bugs while testing , if any.
- Update the documentation in a timely manner and write bi-weekly report.
| August 12, 2024 - August 26, 2024
- Start working on implementation of page view counts using the proposed approach.
- Understand the database schema and write efficient SQL queries against the pagetriage_page_tags table.
- Implement the maintenance script that calls the API and stores pageviews.
- Write bi-weekly report
| August 26, 2024 - September 9, 2024
- Integrate the pageviews backend logic for retrieving and displaying view counts with the frontend UI, including sorting.
- Finalise the code after discussing with mentors for seamless integration.
- Write bi-weekly report.
| September 9, 2024 - September 23, 2024
- Start working on the "See and filter with percent similarity to top deleted revision" feature.
- Research methods for comparing revisions and detecting % similarity to top deleted revisions.
- Write bi-weekly report.
| September 23, 2024 - October 7, 2024
- Develop the backend API for pulling and comparing deleted revisions.
- Implement frontend components for displaying % similarity and AFD detection status in the New Pages Feed.
- Write bi-weekly report
| October 7, 2024 - October 21, 2024
- Conduct testing to ensure accuracy and reliability of similarity comparison.
- Finalize implementation, including any necessary optimizations or adjustments based on testing results.
- Prepare documentation for the new features and ensure code quality meets project standards.
- Write bi-weekly report
| October 21, 2024 - November 4, 2024 (Buffer-period)
- Use this period to catch up on any backlog or address any unforeseen challenges encountered during the coding phase.
- Address any pending issues, bugs, or feature requests identified during testing and ensure all features are working as expected.
- Finalise documentation and prepare for final evaluation by organizing code repositories, submitting final reports, and collecting feedback from mentors.
- Write final blog report.
**Post-GSOC**
I am learning a lot by contributing to Wikimedia. Even after the GSoC period ends, I plan on contributing to this organization by adding to my past projects and working on open issues because of the familiarity of the technical stack and the new challenges that I am continually offered in the process.
Also, I would like to complete the future goals mentioned in my proposal. Having picked up many development skills, my primary focus would be to help the project and the community grow. I would also be interested in helping other people in getting started with their open-source journey and guide them in this fun process.
**Participation**
I am active on Email, Zulip, Discord and Slack. I will use Phabricator and Gerrit for issue discussions and code reviews. I plan on regularly meeting with my mentor to discuss my progress and get feedback on my work. I can dedicate 45+ hours a week as I have no other commitments.
> **## About Me**
>
**Education**
College: Indian Institute of Technology (IIT), Roorkee
Year of Study: 2nd year
Field of Study: Mathematics and Computing (Bachelor of Science)
**Skills**
I am a member of IMG - [[https://iitr.ac.in/Campus%20Life/Student%20Groups/Information%20Management%20Group.html| Information Management Group]] of my college. We are responsible for handling the entire college's data, the [[https://iitr.ac.in/| Institute official website]], Channeli (a one-stop application for all student and faculty information, ranging from placement stats and the noticeboard to lost and found and complaints and grievances), and various other projects. Hence, I have a lot of experience working on production-level apps used by thousands of people, and working with an amazing and collaborative team that makes a huge impact.
- JavaScript, HTML, CSS, Tailwind: Used vanilla JS in projects such as a comic-book display website and basic games like Flappy Bird and Space Invaders, and CSS for styling.
- Django, PHP: Making backends for various applications
- React JS, Vue JS: Used for frontend development in full stack projects
- Flutter, Java: Used for app development in Android Studio
- Docker
**How did you hear about this program?**
After getting into college, I learned about Google Summer of Code from my seniors, some of whom had been selected for it. After talking with them, I looked at the program with greater interest.
**Will you have any other time commitments, such as school work, another job, planned vacation, etc., during the program?**
No. My current semester ends in the last week of April, and I will then have 2.5 months of holidays during which I can give this project my full focus and commit 40+ hours a week, as I have no other commitments. After my college starts, I can commit 35+ hours a week as needed.
**We advise all candidates eligible for Google Summer of Code and Outreachy to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?**
I am **100% loyal to Wikimedia Foundation** and only plan on applying to Google Summer of Code with the Wikimedia Foundation.
**What does making this project happen mean to you?**
I have always been excited by the prospect of converting ideas into products with real-world impact and that is exactly what the Wikimedia Foundation does, producing free and open-source applications that impart learning to millions of people over the globe.
I am highly interested in this project, and contributing to Wikimedia since December 2023 has been a really fantastic learning experience with assistance from all mentors; each PR teaches me something new, and each feedback and code review enhances my coding skills and understanding of the project. Getting to work on this project will teach me production-level code structures and massively impact my learning.
> **## Past Experience**
>
**Microtask**: Create a small independent tool/web app that interacts with any Wikimedia API and displays some information about an article. The tool must have a frontend built using VueJS and the Wikimedia Codex UI library. Include a link to the source code in your proposal.
[[ https://github.com/fillingtothemomo/Wiki_ProjectWord | Wiki_ProjectWord ]] uses MediaWiki's opensearch API and the Codex UI library to let a user search for a specific word in an article in the language of their choice, and includes a dark mode.
!!Deployed!! [[https://wikiword.netlify.app/|here]]
**Contributions to Wikimedia**
| Title | Link| Status|
| Add timestamp display to comment replies| https://gerrit.wikimedia.org/r/c/mediawiki/extensions/InlineComments/+/1010852| Merged
| App timestamp display on comment creation| https://gerrit.wikimedia.org/r/c/mediawiki/extensions/InlineComments/+/1010349| Merged
| ALRow: Add row search class| https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AdminLinks/+/1007973| Merged
| Fixes expand sections visibility on browser resize| https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MobileFrontend/+/1011221|Merged
| Fixes DateControlSection component cut-off| https://gerrit.wikimedia.org/r/c/mediawiki/extensions/PageTriage/+/1011034| Merged
| Fixes date a11y issues| https://github.com/WikiEducationFoundation/WikiEduDashboard/pull/5687| Merged
| Fixes inconsistent highlight issue in navbar| https://github.com/WikiEducationFoundation/WikiEduDashboard/pull/5661| Merged
| Fixes toolbar falling off screen on zooming| https://gerrit.wikimedia.org/r/c/mediawiki/extensions/PageTriage/+/1013680| Open
|refactors milestones to functional component| https://github.com/WikiEducationFoundation/WikiEduDashboard/pull/5601| Open
| renders dates for milestones in home tab| https://github.com/WikiEducationFoundation/WikiEduDashboard/pull/5581| Open
**Past Projects**
- [[https://github.com/sdswoc/DirecM| DirecM]]
Worked on an app development project using Flutter and Arduino with infrared sensors: a wayfinding app for blind and visually impaired people, built under an event organized by a technical club of our college.
- [[ https://github.com/fillingtothemomo/Autumn_assignment| ProTrack ]]
My first major React project. Made a full stack application for managing personal groups and projects using React JS and a Django backend with MySQL, and Tailwind CSS for styling.
- [[https://github.com/amogh-babu-k-a/DRDO-APP-DEVELOPMENT/tree/DRDO_version2| DRDO sensor Malware]]
Worked on developing malware apps for DRDO, India, in a research internship under Dr. Sateesh K. Peddoju.
**Other open-source contributions**
- [[https://github.com/IMGIITRoorkee/omniport-docker| Omniport-Docker]]
Official docker distribution of Omniport - one true portal for every educational institute.
- [[https://github.com/CircuitVerse/CircuitVerse/pulls/fillingtothemomo| Circuitverse]]
CircuitVerse is a free, open-source platform that allows users to construct digital logic circuits online.