== Profile Information ==
Name: Ayush Khati
GitHub: https://github.com/AyushkhatiDev
Portfolio: https://portfoliokhati.netlify.app/
Location: India
I am a backend-focused software developer with experience in building scalable systems, APIs, and data processing pipelines. I have worked on production-grade systems involving performance optimization, caching, and distributed architectures. My interests lie in building reliable and data-driven tools that support real-world applications.
As part of this application, I completed both Outreachy microtasks (T418285 and T418286) and improved them based on mentor feedback.
---
== Synopsis ==
The Lusophone Technological Wishlist is a community-driven initiative aimed at identifying improvements that enhance the experience of contributors across Wikimedia projects.
For this internship, I propose to work on Wishlist #8: adding Wikidata support to the Wikiscore tool.
This feature will allow Wikiscore to track and evaluate Wikidata contributions, enabling edit-a-thons and contests to include Wikidata edits. It expands the scope of the tool and supports a broader contributor base.
---
== Selected Wishlist ==
Wishlist #8: Wikidata support for Wikiscore
Currently, Wikiscore primarily focuses on Wikipedia edits. Wikidata contributions are an increasingly important part of edit-a-thons and contests, so the scoring system should be able to count them as well.
The goal is to:
- Fetch Wikidata edits for users
- Process and validate contribution data
- Integrate it into the scoring system
- Ensure accurate and efficient computation
---
== Technical Approach ==
The implementation will follow MediaWiki development practices and integrate cleanly with the existing Wikiscore architecture, ensuring maintainability and compatibility with Wikimedia tooling.
1. Data Retrieval
- Use the MediaWiki Action API on Wikidata (e.g., `list=usercontribs` for per-user edits or `list=recentchanges` for time-window queries) to fetch user edits
- Handle pagination, rate limits, and API constraints
- Ensure reliability with proper error handling
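
A minimal sketch of the retrieval step, written in Python purely for illustration (the real implementation would follow the language and structure of the existing Wikiscore codebase). It follows the Action API's `continue` mechanism for `list=usercontribs`. The function names `http_get_json` and `iter_user_contribs`, the User-Agent string, and the choice of `ucprop` fields are my own assumptions, not part of Wikiscore.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

API_URL = "https://www.wikidata.org/w/api.php"


def http_get_json(params: Dict[str, str]) -> dict:
    """Perform one GET request against the Wikidata Action API."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; a real tool should identify itself properly.
    req = urllib.request.Request(
        url, headers={"User-Agent": "wikiscore-sketch/0.1 (illustrative example)"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def iter_user_contribs(
    user: str,
    fetch: Callable[[Dict[str, str]], dict] = http_get_json,
    limit: int = 500,
) -> Iterator[dict]:
    """Yield every contribution of `user`, following API continuation."""
    params = {
        "action": "query",
        "list": "usercontribs",
        "format": "json",
        "ucuser": user,
        "uclimit": str(limit),
        "ucprop": "ids|title|timestamp|comment|sizediff|flags|tags",
        "maxlag": "5",  # ask the API to refuse the request when servers are lagged
    }
    cont: Dict[str, str] = {}
    while True:
        data = fetch({**params, **cont})
        if "error" in data:
            # A fuller version would retry with backoff on 'maxlag' or rate-limit errors.
            raise RuntimeError(f"API error: {data['error'].get('code')}")
        yield from data.get("query", {}).get("usercontribs", [])
        cont = data.get("continue") or {}
        if not cont:
            break
```

Passing `fetch` as a parameter keeps the pagination logic testable without network access, since a fake function returning canned pages can stand in for the HTTP call.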
2. Data Processing
- Parse and normalize contribution data
- Filter relevant edit types
- Handle edge cases such as duplicate entries, reverted edits, minor edits, and incomplete metadata
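
One possible shape for the processing step, as an illustrative Python sketch. It assumes contribution records in the format returned by `list=usercontribs`, and it assumes that reverted edits can be recognized by the `mw-reverted` change tag. The function name `clean_contribs`, the set of required fields, and the decision to drop (rather than flag) reverted edits are hypothetical and would be agreed with mentors.

```python
from typing import Dict, Iterable, List

# Fields a record must carry to be scored (an assumption for this sketch).
REQUIRED_FIELDS = ("revid", "title", "timestamp")


def clean_contribs(contribs: Iterable[dict]) -> List[Dict[str, object]]:
    """Normalize raw contribution records and drop unusable ones."""
    seen_revids = set()
    cleaned = []
    for c in contribs:
        # Skip records with incomplete metadata.
        if any(field not in c for field in REQUIRED_FIELDS):
            continue
        # Skip duplicates, e.g. from overlapping query windows.
        if c["revid"] in seen_revids:
            continue
        # Skip edits that were later reverted.
        if "mw-reverted" in c.get("tags", []):
            continue
        seen_revids.add(c["revid"])
        cleaned.append(
            {
                "revid": c["revid"],
                "title": c["title"],
                "timestamp": c["timestamp"],
                # The default JSON format marks minor edits with an empty-string
                # "minor" key; formatversion=2 uses a boolean. Handle both.
                "minor": c.get("minor", False) is not False,
                "comment": c.get("comment", ""),
            }
        )
    return cleaned
```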
3. Scoring Logic
- Define scoring rules for Wikidata edits
- Ensure compatibility with existing Wikiscore logic
- Maintain consistency across different contribution types
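
To make the scoring idea concrete, here is an illustrative Python sketch that classifies a Wikidata edit by the automatic summary prefix Wikibase writes (for example `/* wbsetclaim-create:2||1 */`) and applies a per-action weight. The weights, the `WEIGHTS` table, and the function names are placeholders I chose for the example; the actual rules would be defined with mentors and aligned with how Wikiscore scores Wikipedia edits.

```python
import re
from typing import Iterable, Optional

# Matches the module name at the start of a Wikibase auto-summary,
# e.g. "wbsetclaim" in "/* wbsetclaim-create:2||1 */".
ACTION_RE = re.compile(r"^/\*\s*([a-z0-9]+)")

# Hypothetical weights, for illustration only.
WEIGHTS = {
    "wbsetlabel": 1.0,
    "wbsetdescription": 1.0,
    "wbsetaliases": 0.5,
    "wbsetclaim": 2.0,
    "wbcreateclaim": 2.0,
    "wbsetsitelink": 1.0,
}
DEFAULT_WEIGHT = 0.0  # unrecognized edit types score nothing in this sketch


def edit_action(comment: str) -> Optional[str]:
    """Return the Wikibase action name from an edit summary, if present."""
    match = ACTION_RE.match(comment)
    return match.group(1) if match else None


def score_edits(edits: Iterable[dict]) -> float:
    """Sum the weights of all edits based on their summary prefix."""
    total = 0.0
    for edit in edits:
        action = edit_action(edit.get("comment", ""))
        total += WEIGHTS.get(action, DEFAULT_WEIGHT)
    return total
```

Keeping the weights in a single table means organizers' scoring preferences could later be made configurable per contest without touching the classification logic.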
4. Integration
- Integrate Wikidata data into the existing Wikiscore pipeline
- Ensure modular, testable, and maintainable code
- Use caching (e.g., Redis) to optimize repeated queries
5. Performance Considerations
- Optimize API calls using batching strategies
- Cache API responses and computed scores to avoid redundant requests and recomputation
- Respect API rate limits, for example by sending the `maxlag` parameter and backing off when a request is throttled
- Ensure scalability for large datasets and high user activity
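
The batching and caching ideas above can be sketched as follows, again in illustrative Python. The in-memory `TTLCache` class only stands in for whatever cache backend is eventually chosen (Redis is one option), and the default batch size of 50 reflects my understanding of the per-request ID limit for `wbgetentities` for regular users, which should be confirmed against the API documentation. All names here are hypothetical.

```python
import time
from typing import Callable, Dict, Hashable, Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int = 50) -> Iterator[List[T]]:
    """Split `items` into consecutive batches of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, object]] = {}

    def get_or_compute(self, key: Hashable, compute: Callable[[], object]) -> object:
        """Return the cached value for `key`, computing it on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value
```

A caller could, for instance, wrap each batch request as `cache.get_or_compute(tuple(batch), lambda: fetch_entities(batch))`, where `fetch_entities` is a hypothetical function that performs the API call for one batch of entity IDs.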
---
== Timeline ==
Week 1–2:
Set up the development environment, analyze the Wikiscore architecture, and explore Wikidata APIs. Finalize implementation approach with mentors.
Week 3–4:
Implement the data fetching layer using Wikidata APIs, including pagination, error handling, and rate limiting.
Week 5–6:
Design and implement scoring logic for Wikidata edits, ensuring compatibility with existing systems.
Week 7–8:
Integrate Wikidata contributions into the Wikiscore pipeline and optimize performance using caching and batching.
Week 9–10:
Test with real-world datasets, validate scoring accuracy, and handle edge cases.
Week 11–12:
Write documentation, perform code cleanup, and refine implementation based on mentor feedback.
Week 13:
Finalize the implementation, run a final round of regression testing, and prepare for submission and handover.
---
== Impact ==
This feature will enable organizers to include Wikidata contributions in edit-a-thons and contests, improving participation and recognition.
It will also make Wikiscore more versatile and inclusive, supporting a broader range of Wikimedia contributions and encouraging engagement with structured data.
---
== Why Me ==
My experience in backend development, API design, and data processing aligns well with this project.
I have worked on systems involving:
- REST APIs and large-scale data processing
- Performance optimization using caching and indexing
- Asynchronous workflows and distributed systems
I am comfortable working with complex data pipelines and ensuring reliability and scalability.
I have also worked on research-driven projects, including Physics-Informed Neural Networks (PINNs) and an AI-powered data extraction system. These projects involved building scalable pipelines, working with structured data, and designing efficient processing systems.
Research Work:
- PINNs Research Paper: https://github.com/AyushkhatiDev/Physics-Informed-Neural-Networks-for-Solving-Partial-Differential-Equations/blob/main/PINN_PDE_Solver_Research_Paper.pdf
- Data Extraction Research Paper: https://github.com/AyushkhatiDev/data-extractor/blob/main/paper/DataExtractor%20copy.pdf
I am comfortable iterating based on feedback and delivering incremental improvements, which is essential for contributing effectively to open-source projects like Wikimedia.
---
== Post-Internship Contribution ==
I plan to continue contributing to Wikimedia by:
- Maintaining and improving Wikidata integration
- Contributing to related tools and features
- Supporting new contributors in onboarding
Thank you for your consideration.