== Profile Information ==
Name: Ayush Khati
GitHub: https://github.com/AyushkhatiDev
Portfolio: https://portfoliokhati.netlify.app/
Location: India
I am a backend-focused software developer with experience building scalable systems, APIs, and data processing pipelines. I have worked on production systems where performance optimization, caching, and distributed architecture were central concerns. I am most interested in building reliable, data-driven tools that support real-world applications.
As part of this application, I completed both Outreachy microtasks (T418285 and T418286) and improved them based on mentor feedback.
---
== Synopsis ==
The Lusophone Technological Wishlist is a community-driven initiative aimed at identifying improvements that enhance the experience of contributors across Wikimedia projects.
For this internship, I propose to work on Wishlist #8: adding Wikidata support to the Wikiscore tool.
This feature will allow Wikiscore to track and evaluate Wikidata contributions, enabling edit-a-thons and contests to include Wikidata edits. It expands the scope of the tool and supports a broader contributor base.
---
== Selected Wishlist ==
Wishlist #8: Wikidata support for Wikiscore
Wikiscore currently focuses primarily on Wikipedia edits. Wikidata contributions are an increasingly important part of Wikimedia work, so contests and edit-a-thons should be able to score them alongside article edits.
The goal is to:
- Retrieve Wikidata user contributions using relevant APIs
- Parse, normalize, and validate contribution data
- Integrate Wikidata edits into the existing Wikiscore scoring pipeline
- Ensure accurate, consistent, and efficient computation of scores across different contribution types
---
== Technical Approach ==
The implementation will follow MediaWiki development practices and integrate cleanly with the existing Wikiscore architecture, ensuring maintainability and compatibility with Wikimedia tooling.
1. Data Retrieval
- Use the MediaWiki Action API on Wikidata (e.g., the list=usercontribs or list=recentchanges query modules) to fetch user edits
- Handle pagination, rate limits, and API constraints
- Ensure reliability with proper error handling
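
As an illustration of the retrieval step, the following is a minimal sketch of paginated fetching with the Action API's list=usercontribs module. It is written in Python purely for illustration; the real implementation would follow Wikiscore's own codebase and language. The names iter_user_contribs, http_fetch and USER_AGENT are hypothetical, and the fetch function is passed in as a parameter so the pagination logic can be tested without network access.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Iterator

API_URL = "https://www.wikidata.org/w/api.php"
# Hypothetical User-Agent string; a real tool should identify itself and its maintainer.
USER_AGENT = "wikiscore-wikidata-sketch/0.1 (example contact)"


def http_fetch(params: dict) -> dict:
    """Perform one GET request against the Action API and decode the JSON body."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_user_contribs(
    fetch: Callable[[dict], dict],
    username: str,
    start: str,
    end: str,
    limit: int = 500,
) -> Iterator[dict]:
    """Yield every contribution of `username` between two timestamps.

    Follows the API's `continue` object until no further pages remain.
    """
    params = {
        "action": "query",
        "format": "json",
        "list": "usercontribs",
        "ucuser": username,
        "ucstart": start,
        "ucend": end,
        "ucdir": "newer",  # oldest first, so `start` must precede `end`
        "uclimit": str(limit),
        "ucprop": "ids|title|timestamp|comment|flags|tags|sizediff",
    }
    while True:
        data = fetch(params)
        if "error" in data:
            info = data["error"].get("info", "unknown error")
            raise RuntimeError(f"API error: {info}")
        yield from data.get("query", {}).get("usercontribs", [])
        continuation = data.get("continue")
        if not continuation:
            break
        # Merge the continuation tokens into the next request.
        params = {**params, **continuation}
```

A call such as iter_user_contribs(http_fetch, "ExampleUser", "2025-01-01T00:00:00Z", "2025-01-31T23:59:59Z") would stream one contest participant's edits; retry and rate-limit handling are left out of this sketch.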
2. Data Processing
- Parse and normalize contribution data
- Filter relevant edit types
- Handle edge cases such as duplicate entries, reverted edits, minor edits, and incomplete metadata
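
The processing step could look roughly like the sketch below. It assumes contribution records shaped like the usercontribs JSON output (keys such as revid, title, timestamp, comment, tags) and treats the mw-reverted change tag as the signal for a reverted edit. The Edit class and the normalize and clean_edits functions are hypothetical names, not part of Wikiscore.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Edit:
    revid: int
    title: str
    timestamp: datetime
    comment: str
    minor: bool
    reverted: bool
    sizediff: int


def normalize(raw: dict) -> Optional[Edit]:
    """Convert one raw API record into an Edit, or None if required metadata is missing."""
    try:
        timestamp = datetime.strptime(
            raw["timestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        return Edit(
            revid=int(raw["revid"]),
            title=raw["title"],
            timestamp=timestamp,
            # The comment may be absent, for example when it has been hidden.
            comment=raw.get("comment", ""),
            # The flag is an empty string in the default JSON format and a
            # boolean with formatversion=2; both cases are handled here.
            minor=raw.get("minor") not in (None, False),
            reverted="mw-reverted" in raw.get("tags", []),
            sizediff=int(raw.get("sizediff", 0)),
        )
    except (KeyError, ValueError, TypeError):
        return None


def clean_edits(raw_edits: Iterable[dict], drop_reverted: bool = True) -> List[Edit]:
    """Drop incomplete records, duplicate revisions and (optionally) reverted edits."""
    seen = set()
    cleaned = []
    for raw in raw_edits:
        edit = normalize(raw)
        if edit is None or edit.revid in seen:
            continue
        seen.add(edit.revid)
        if drop_reverted and edit.reverted:
            continue
        cleaned.append(edit)
    return cleaned
```

Whether reverted or minor edits are dropped or only scored lower is a policy decision to settle with mentors; the drop_reverted flag only shows where that choice would plug in.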
3. Scoring Logic
- Define scoring rules for Wikidata edits
- Ensure compatibility with existing Wikiscore logic
- Maintain consistency across different contribution types
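
One possible way to express scoring rules is sketched below. It relies on the automatic edit summaries that Wikibase generates (for example "/* wbsetclaim-create:2||1 */"), and maps the action name at the start of the summary to a weight. The weights, the default value and the function names are all hypothetical placeholders; the actual rules would be defined together with mentors and contest organizers.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches the action name at the start of a Wikibase automatic edit summary.
SUMMARY_RE = re.compile(r"^/\*\s*(wb[a-z]+)(?:-[a-z-]+)?:")

# Hypothetical weights, for illustration only.
HYPOTHETICAL_WEIGHTS = {
    "wbsetlabel": 1.0,
    "wbsetdescription": 1.0,
    "wbsetaliases": 0.5,
    "wbcreateclaim": 2.0,
    "wbsetclaim": 2.0,
    "wbsetreference": 3.0,
    "wbsetsitelink": 1.0,
}
DEFAULT_WEIGHT = 0.5  # fallback for unrecognized or hand-written summaries


def edit_action(comment: str) -> Optional[str]:
    """Return the Wikibase action name from an automatic summary, if present."""
    match = SUMMARY_RE.match(comment)
    return match.group(1) if match else None


def score_edit(comment: str, reverted: bool = False) -> float:
    """Score a single edit; reverted edits earn nothing in this sketch."""
    if reverted:
        return 0.0
    return HYPOTHETICAL_WEIGHTS.get(edit_action(comment), DEFAULT_WEIGHT)


def score_user(edits: Iterable[Tuple[str, bool]]) -> float:
    """Sum the scores of (comment, reverted) pairs for one participant."""
    return sum(score_edit(comment, reverted) for comment, reverted in edits)
```

Keeping the weights in a plain table makes it easy for organizers to adjust them per contest without touching the scoring code.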
4. Integration
- Integrate Wikidata data into the existing Wikiscore pipeline
- Ensure modular, testable, and maintainable code
- Use caching (e.g., Redis) to optimize repeated queries
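
The caching idea can be illustrated with a small in-memory time-to-live cache, used here as a stand-in for Redis so the example stays self-contained. The TTLCache class and its get_or_compute method are hypothetical; a Redis-backed version would keep the same interface but store values in the external service.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Minimal in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested deterministically
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `compute` only on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value
```

A key such as (username, contest start, contest end) would let repeated score views reuse one set of fetched contributions until the entry expires.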
5. Performance Considerations
- Optimize API calls using batching strategies
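- Cache API responses and computed scores to avoid redundant requests and recomputation
- Respect API rate limits by throttling requests and backing off on errors (e.g., honoring the maxlag parameter and Retry-After headers)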
- Ensure scalability for large datasets and high user activity
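
Batching could be sketched as follows for entity lookups with the wbgetentities module, which accepts several IDs per request separated by "|". The batch size of 50 assumes the usual per-request limit for non-bot accounts, and the helper names batched_ids and wbgetentities_params are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List


def batched_ids(ids: Iterable[str], size: int = 50) -> Iterator[List[str]]:
    """Split entity IDs into lists of at most `size` items, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for entity_id in ids:
        batch.append(entity_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def wbgetentities_params(ids: List[str], props: str = "labels|claims") -> Dict[str, str]:
    """Build the query parameters for one batched wbgetentities request."""
    return {
        "action": "wbgetentities",
        "format": "json",
        "ids": "|".join(ids),
        "props": props,
    }
```

With this approach, 120 edited items would need three requests instead of 120, which also lowers the chance of hitting rate limits.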
---
== Timeline ==
Week 1 (May 18 – May 24):
Set up development environment, run Wikiscore locally, and review project architecture. Explore Wikidata APIs and identify relevant endpoints for user contributions.
Week 2 (May 25 – May 31):
Analyze data flow in Wikiscore, define integration points for Wikidata, and finalize implementation plan with mentors.
Week 3 (June 1 – June 7):
Implement initial data fetching layer using Wikidata APIs, including basic request handling and response parsing.
Week 4 (June 8 – June 14):
Extend data retrieval to support pagination, error handling, and rate limiting. Validate data consistency.
Week 5 (June 15 – June 21):
Design scoring logic for Wikidata edits and implement core scoring functions.
Week 6 (June 22 – June 28):
Integrate scoring logic with fetched data and ensure compatibility with existing Wikiscore system.
Week 7 (June 29 – July 5):
Integrate Wikidata contribution pipeline into Wikiscore backend and ensure correct data flow.
Week 8 (July 6 – July 12):
Optimize performance using caching (e.g., Redis) and batching API requests.
Week 9 (July 13 – July 19):
Test system with real-world datasets and validate scoring accuracy.
Week 10 (July 20 – July 26):
Handle edge cases such as reverted edits, duplicate entries, and incomplete metadata.
Week 11 (July 27 – August 2):
Write technical documentation and document design decisions.
Week 12 (August 3 – August 9):
Refactor code, improve test coverage, and incorporate mentor feedback.
Week 13 (August 10 – August 17):
Finalize implementation, perform end-to-end testing, and prepare for submission and handover.
---
== Impact ==
This feature will enable organizers to include Wikidata contributions in edit-a-thons and contests, improving participation and recognition.
It will also make Wikiscore more versatile and inclusive, supporting a broader range of Wikimedia contributions and encouraging engagement with structured data.
---
== Why Me ==
My experience in backend development, API design, and data processing aligns well with this project.
I have worked on systems involving:
- REST APIs and large-scale data processing
- Performance optimization using caching and indexing
- Asynchronous workflows and distributed systems
I am used to working with complex data pipelines and keeping them reliable and scalable.
I have also worked on research-driven projects, including Physics-Informed Neural Networks (PINNs) and an AI-powered data extraction system. These projects involved building scalable pipelines, working with structured data, and designing efficient processing systems.
Research Work:
- PINNs Research Paper: https://github.com/AyushkhatiDev/Physics-Informed-Neural-Networks-for-Solving-Partial-Differential-Equations/blob/main/PINN_PDE_Solver_Research_Paper.pdf
- Data Extraction Research Paper: https://github.com/AyushkhatiDev/data-extractor/blob/main/paper/DataExtractor%20copy.pdf
I am comfortable iterating on feedback and delivering incremental improvements, which is essential for contributing effectively to open-source communities such as Wikimedia.
---
== Post-Internship Contribution ==
I plan to continue contributing to Wikimedia by:
- Maintaining and improving Wikidata integration
- Contributing to related tools and features
- Helping onboard new contributors
Thank you for your consideration.