
[Outreachy Proposal] - Addressing the Lusophone Technological Wishlist Proposals
Closed, Declined (Public)

Description

Profile Information

Name: Rishan P
Email: Rishan.pgowda@gmail.com
GitHub: rovertrack
Other communication modes: Discord, Zulip, Slack
Location: Bangalore, India
Typical working hours (include your timezone): 6:00 pm to 2:00 am IST (UTC+5:30)

Synopsis

The Lusophone Wikimedia community, made up of editors across Brazil, Portugal, Mozambique, and other Portuguese-speaking countries, runs a yearly technological wishlist in which it identifies issues affecting the editing experience.

This project focuses on completing two of those community wishes, which concern the following tools:

  1. Wikipedia Visual Editor - A rich-text editor built so that editors everywhere can contribute without having to learn wikitext markup.
  2. WikiScore - A web application built by Wikimedia Brasil specifically to run edit-a-thons and editing contests. It tracks who contributed what, calculates a score for each participant, and displays a live leaderboard.

Project

Project Objectives
  • Wish #3: implement a check in the Visual Editor for duplicate references using the reference identifier (ISBN, DOI or URL) and let the user reuse the already cited reference.
  • Wish #8: implement Wikidata support for Wikimedia Brasil's scoring tool WikiScore, allowing the community to run edit-a-thons and editing contests using Wikidata.

The project description states that it aims to implement one of the community wishes in this wishlist, specifically wish #3 or #8.
As I am familiar with the Visual Editor codebase, I am confident that I can deliver both wishes. If the scope is restricted, I can work on just one, but in order to learn more I would like to implement both if permitted.

Possible Mentor(s)

@Ederporto
@Arcstur

Have you contacted your mentors already?

No.

Implementation

Important links
Visual Editor wishlist
Wikidata wishlist

Phase 1 - Real-Time Duplicate Prevention

Problem
The citation dialog in the Visual Editor does not recognize which sources are already cited in the article. When an editor pastes a URL, DOI, or ISBN into the Automatic tab, the Visual Editor hands it directly to the Citoid service via performLookup in ve.ui.CitoidInspector.js and generates a new reference object without ever checking the InternalList, the in-memory store of all existing references. The Reuse tab is completely disconnected from the Automatic and Manual tabs. The main cause is that identifier normalization does not exist.
As a result, 10.1000/xyz and https://doi.org/10.1000/xyz, or the ISBN-10 and ISBN-13 forms of the same book, are all treated as completely different identifiers even though they point to the same source. This produces duplicate references, clutters the article source, and confuses editors.
Solution
We add a local-first interception layer that runs before performLookup sends the request to Citoid. When the editor submits an identifier:

  • A normalization engine cleans the input first, i.e. mw.Uri parses URLs, a regex strips hyphens and spaces from ISBNs, and DOIs are lowercased and prefix-stripped.
  • A lookup table built from the normalized InternalList is queried instantly. If a match is found, Citoid is never called, and an OO.ui.MessageWidget alert appears natively inside the Visual Editor, offering a one-click switch to the Reuse tab with the matching reference already highlighted and ready to use.
  • If there is no match, performLookup() proceeds with the usual flow.
  • As the editor types, autocomplete suggestions of matching existing references appear in real time.
  • A merge tool handles pre-existing duplicates already in the article by updating all citation tags to point to a single canonical reference.
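The normalization and lookup steps can be sketched roughly as follows. This is only an illustration of the logic in Python; the real implementation would be written in JavaScript inside the Visual Editor, and every name here (normalize, build_lookup, find_duplicate, and the helper functions) is hypothetical and not part of any existing API. The ISBN helper checks only the shape of the input, not the check digit.

```python
import re
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit

# A DOI resolver URL or a "doi:" label in front of the identifier.
DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)\s*", re.IGNORECASE)
# Input that looks like a DOI, with or without such a prefix.
DOI_SHAPE = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)?10\.\d{4,9}/\S+$", re.IGNORECASE
)


def normalize_doi(raw: str) -> str:
    """Strip the prefix and lowercase, since DOIs are case-insensitive."""
    return DOI_PREFIX.sub("", raw.strip()).lower()


def normalize_isbn(raw: str) -> Optional[str]:
    """Return the ISBN-13 form without separators, or None if not ISBN-shaped."""
    digits = re.sub(r"[\s-]", "", raw).upper()
    if re.fullmatch(r"\d{13}", digits):
        return digits
    if not re.fullmatch(r"\d{9}[\dX]", digits):
        return None
    # ISBN-10 to ISBN-13: prefix 978, drop the old check digit, recompute it.
    core = "978" + digits[:9]
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(core))
    return core + str((10 - total % 10) % 10)


def normalize_url(raw: str) -> str:
    """Ignore scheme, a leading 'www.', host case, and a trailing slash."""
    parts = urlsplit(raw.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{parts.path.rstrip('/')}{query}"


def normalize(raw: str) -> Tuple[str, str]:
    """Classify an identifier and return a (kind, canonical key) pair."""
    text = raw.strip()
    if DOI_SHAPE.match(text):
        return ("doi", normalize_doi(text))
    isbn = normalize_isbn(text)
    if isbn is not None:
        return ("isbn", isbn)
    return ("url", normalize_url(text))


def build_lookup(existing: Dict[str, str]) -> Dict[Tuple[str, str], str]:
    """Map canonical keys to reference names; the first occurrence wins."""
    lookup: Dict[Tuple[str, str], str] = {}
    for name, identifier in existing.items():
        lookup.setdefault(normalize(identifier), name)
    return lookup


def find_duplicate(lookup: Dict[Tuple[str, str], str], raw: str) -> Optional[str]:
    """Return the name of an existing reference with the same identifier."""
    return lookup.get(normalize(raw))
```

In this sketch the lookup is a plain dictionary, so the duplicate check is a single hash lookup per submitted identifier and no network call is needed when a match exists.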

This results in:

  1. Neither the Automatic nor the Manual tab can create duplicate references.
  2. No Citoid API calls are wasted on identifiers that already exist locally in memory.
  3. The editor is never blocked, i.e. the alert is non-blocking.
  4. A fully native OOUI design, indistinguishable from built-in Visual Editor components.
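The merge step for duplicates that already exist in an article could work along the following lines. This is a hedged Python sketch of the idea only: plan_merge and retarget are hypothetical names, the reference names are made up, and the keys are assumed to have been normalized already.

```python
from typing import Dict, Iterable, List, Tuple


def plan_merge(refs: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each duplicate reference name to its canonical reference.

    `refs` holds (reference name, normalized key) pairs in document order.
    The first reference seen for a key is treated as canonical.
    """
    canonical_by_key: Dict[str, str] = {}
    redirect: Dict[str, str] = {}
    for name, key in refs:
        canonical = canonical_by_key.setdefault(key, name)
        if canonical != name:
            redirect[name] = canonical
    return redirect


def retarget(citations: Iterable[str], redirect: Dict[str, str]) -> List[str]:
    """Point every citation at the canonical reference for its source."""
    return [redirect.get(name, name) for name in citations]


refs = [
    ("smith2020", "doi:10.1000/xyz"),
    (":0", "url:example.org/a"),
    (":1", "doi:10.1000/xyz"),  # same source as smith2020
]
redirect = plan_merge(refs)
merged = retarget(["smith2020", ":1", ":0"], redirect)
```

Keeping the first occurrence as canonical is just one possible policy; preferring a named reference over an auto-generated one would be an equally reasonable choice to discuss with mentors.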

Phase 2 - Wikidata Scoring Support for WikiScore

Problem
When a contest includes Wikidata tasks such as adding structured claims, labels, descriptions, or references to items, WikiScore is not aware of any of it. Organizers are forced to count Wikidata contributions manually, checking each participant's Wikidata history individually and tallying edits by hand after every contest. Raw edit counts are also not meaningful for scoring: adding a fully referenced new item is far more valuable than fixing a typo in a label, but both are treated identically. During large contests the Wikibase API also returns inconsistent response formats and rate-limits requests, neither of which WikiScore handles.
Solution
A new Python module that:

  • Connects to the Wikibase API using usercontribs with full pagination and exponential backoff. This must be non-blocking so that no crashes occur during live contests.
  • Passes each contribution to a SPARQL module (I have worked with SPARQL before) that queries the Wikidata Query Service at query.wikidata.org to verify the type of contribution: Portuguese label added, reference added to a claim, new item created, etc.
  • Sends verified types into a configurable scoring engine where organizers adjust point values per contribution type per contest without touching code.
  • Stores results in a new Django model alongside existing Wikipedia scores without touching any current functionality.
  • Updates the leaderboard template to display the Wikipedia score, Wikidata score, and combined total in a single view, and lists each contribution with its type, points awarded, and a link to the Wikidata item for verification.
  • Runs a background fetch task that keeps scores updating in real time during live contests, without manual page refreshes.
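The paginated fetch with exponential backoff could look roughly like this. It is a minimal sketch, not WikiScore code: TransientError, fetch_with_backoff, and iter_contributions are hypothetical names, and the `fetch` callable stands in for whatever HTTP client the tool uses. The query parameters and the `continue` handling follow the MediaWiki Action API's list=usercontribs module.

```python
import time
from typing import Any, Callable, Dict, Iterator

Params = Dict[str, str]
Fetch = Callable[[Params], Dict[str, Any]]


class TransientError(Exception):
    """Hypothetical marker for retryable failures (e.g. HTTP 429 or maxlag)."""


def fetch_with_backoff(
    fetch: Fetch,
    params: Params,
    retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch, doubling the wait after each transient failure."""
    for attempt in range(retries):
        try:
            return fetch(params)
        except TransientError:
            if attempt == retries - 1:
                raise  # give up after the last allowed attempt
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")


def iter_contributions(fetch: Fetch, user: str, **backoff: Any) -> Iterator[dict]:
    """Yield every contribution of `user`, following API continuation."""
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "uclimit": "max",
        "format": "json",
    }
    cont: Params = {}
    while True:
        data = fetch_with_backoff(fetch, {**params, **cont}, **backoff)
        yield from data.get("query", {}).get("usercontribs", [])
        cont = data.get("continue") or {}
        if not cont:
            return
```

Because time.sleep blocks the calling thread, a loop like this would belong in a background worker rather than in a request handler, which fits the background fetch task described in the list above.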

The Result

  • No manual counting - all Wikidata contributions fetched, verified, and scored automatically.
  • Contribution quality is measured, not just counted - meaningful structured edits score higher.
  • Portuguese-specific contributions recognized and rewarded directly for the Lusophone community.
  • Organizers and participants see one accurate leaderboard covering all contributions in real time.
  • Every score is verifiable by clicking through to the actual Wikidata edit behind any point awarded.
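A configurable scoring engine and combined leaderboard could be sketched as below. The contribution type names and point values are purely illustrative assumptions, not values from WikiScore or from this proposal, and score_contributions and leaderboard are hypothetical function names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Illustrative weights only; organizers would set these per contest.
EXAMPLE_WEIGHTS: Dict[str, float] = {
    "item_created": 5.0,
    "reference_added": 3.0,
    "claim_added": 2.0,
    "label_pt_added": 1.0,
}


def score_contributions(
    contributions: Iterable[Tuple[str, str]], weights: Dict[str, float]
) -> Dict[str, float]:
    """Sum points per user; unknown contribution types score zero."""
    totals: Dict[str, float] = defaultdict(float)
    for user, kind in contributions:
        totals[user] += weights.get(kind, 0.0)
    return dict(totals)


def leaderboard(
    wikipedia: Dict[str, float], wikidata: Dict[str, float]
) -> List[Tuple[str, float, float, float]]:
    """Return (user, Wikipedia, Wikidata, total) rows, highest total first."""
    users = set(wikipedia) | set(wikidata)
    rows = [
        (
            user,
            wikipedia.get(user, 0.0),
            wikidata.get(user, 0.0),
            wikipedia.get(user, 0.0) + wikidata.get(user, 0.0),
        )
        for user in users
    ]
    # Sort by total descending, then by user name for a stable order.
    return sorted(rows, key=lambda row: (-row[3], row[0]))


wikidata_scores = score_contributions(
    [
        ("Ana", "item_created"),
        ("Ana", "reference_added"),
        ("Bia", "label_pt_added"),
        ("Bia", "typo_fix"),
    ],
    EXAMPLE_WEIGHTS,
)
board = leaderboard({"Bia": 10.0}, wikidata_scores)
```

Keeping the weights in a plain mapping means they could be stored per contest (for example in a Django model field) and changed by organizers without a code change.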

Deliverables

Timeline

| Period | Classification | Tasks |
| --- | --- | --- |
| Community Bonding (May 4 - May 17) | Community & Setup | Get familiar with mentors and the Lusophone Wikimedia community. Study ve.ui.CitoidInspector.js. Study the WikiScore Django codebase and the existing Wikipedia scoring flow. Set up a local MediaWiki environment with VisualEditor and Citoid. Set up a local WikiScore Django environment. Finalize the implementation plan with mentors. |
| Week 1-2 (May 18 - May 31) | Wishlist 3 Phase 1: Normalization Engine | Analyze ve.ui.CitoidInspector.js and map the exact interception point inside performLookup. Build normalization functions: a regex for stripping ISBN hyphens and spaces, and conversion of ISBN-10 to ISBN-13. Use mw.Uri for canonical URL comparisons that handle http vs https and www differences. Implement DOI case-folding and prefix stripping. Write unit tests for all functions. |
| Week 3-4 (June 1 - June 14) | Wishlist 3: Interceptor & UI | Modify performLookup to run a local in-memory InternalList scan before any Citoid call. Build the normalized lookup table from the InternalList at dialog-open time. Implement an OO.ui.MessageWidget alert that informs the editor when a duplicate is detected. Add a switch to the Reuse tab with the matched reference highlighted. Implement real-time autocomplete suggestions as the editor types an identifier. |
| Week 5-6 (June 15 - June 28) | Wishlist 3: Merge Tool & Testing | Build the merge tool for duplicates already in the article. Implement cross-type detection by re-checking Citoid metadata against the lookup table after the Citoid API call. Update all citation tags to point to the canonical reference after a merge. Test the normalization engine rigorously against edge cases. Performance testing on articles with 500+ references. Verify that the UI matches the native Visual Editor design. |
| Week 7 (June 29 - July 9) | Wishlist 3: Finalization & Submission | Finalize the Wishlist 3 code. Submit the patch via Gerrit/GitHub for review. Document the normalization logic, interception point, and UI triggers for future maintainers and developers. Make changes as required by review comments. |
| July 10 - July 20 | Break / Busy Period | Declared busy period: only 1-2 days will be taken up by other commitments, and the rest will be spent working on the project. Light reading of the Wikibase API documentation and the WikiScore codebase to prepare for Wishlist 8. |
| Week 8 (July 21 - July 26) | Wishlist 8: Environment & Foundation | Phase 2 begins. Run the existing WikiScore locally and trace the full scoring flow. Study the Wikibase API endpoints usercontribs and recentchanges. Study SPARQL and the Wikidata Query Service at query.wikidata.org. Prepare a detailed integration plan for Wikidata support and review it with mentors before implementation begins. |
| Week 9-10 (July 27 - August 2) | Wishlist 8: Fetch Engine & Scoring | Run Django migrations to extend the schema with a new WikidataContribution model. Build a Python module to connect to the Wikibase API with full pagination and rate limiting. Build a SPARQL module to verify the type of each contribution: new item created, claim added, Portuguese label added, reference added to a claim. Implement a weighted scoring engine with configurable point values per contribution type. Integrate Wikidata scores alongside existing Wikipedia scores in Django without touching existing functionality. |
| Week 11 (August 3 - August 9) | Wishlist 8: Dashboard & Integration | Update the WikiScore leaderboard Django template to display the Wikipedia score, Wikidata score, and combined total in real time. Build a per-participant contribution detail view showing each Wikidata item edited, the contribution type, points awarded, and a direct link to the Wikidata item. Set up a background fetch task with Celery for real-time score updates during live contests. Limit each query to 200-500 edits for API stability. Edge case testing using real data from past Lusophone edit-a-thons. |
| Week 12 (August 10 - August 16) | Final Cleanup | Performance optimization and bug fixes for both Wishlist 3 and Wishlist 8. UI polish on the WikiScore dashboard. Write full technical documentation and developer guidelines for both tools. Write maintainer handover notes covering known limitations and future improvement ideas. Final project report. Thank you for a wonderful Outreachy 2026. |

This roadmap is designed to be flexible. If technical debt or implementation blockers arise, work can continue beyond the 12-week Outreachy period to ensure code quality is never sacrificed for speed. I plan to continue contributing to the Wikimedia Foundation even after the Outreachy program ends, maintaining the implemented features, addressing feedback, and contributing to other areas of the codebase.

About Me

  • How did you hear about this program?

X (twitter)

  • What does making this project happen mean to you?

It means gaining knowledge under the guidance of some of the best mentors, knowledge that will help me improve, give back to open source, and provide value.

Tasks done

Microtask 1: https://github.com/rovertrack/js-manipulate-json
Microtask 2: https://github.com/rovertrack/url-status-code

Contributions to Wikimedia Foundation

Pull request

Visual Editor

| PR | Task | Status |
| --- | --- | --- |
| #1269587 | VE Toolbar: Hide View as right-to-left for non applicable languages | Open |

Codex PHP

| PR | Task | Status |
| --- | --- | --- |
| #1259201 | Accordion: Add separation styles for component | Merged |

The above contribution was made to the Codex project. It is not one of the required microtasks but a separate contribution, added here as part of my contributions to the Wikimedia Foundation.

Thank you for your time and consideration.

Event Timeline

Rishannn updated the task description.
Rishannn added subscribers: Arcstur, Ederporto.
Gopavasanth subscribed.

Thank you for your proposal and the effort you put into it. This year we received over 20 strong applications, and after a highly competitive review, we were unfortunately unable to offer you a slot.

Please don't see this as a failure. Many contributors who weren't selected for Outreachy have gone on to make meaningful, lasting impact in the Wikimedia community, and we genuinely hope you'll stay engaged. You're very welcome to continue contributing outside of Outreachy. Our mentors and org admins are happy to help you get started or keep going:

We hope to see you around in the community.