Personal information
Name: Gabina Luz
Timezone: America/Argentina/Buenos_Aires
Location: Rosario, Santa Fe, Argentina
GitHub profile URL: https://github.com/gabina/
Timeline of tasks for the internship period
Introduction
The Dashboard is a Ruby on Rails web application for tracking contributions by groups of editors, providing statistics and details about their contributions. It is commonly used for events that bring new editors to Wikipedia, such as edit-a-thons, Wikipedia editing assignments in schools and universities, and distributed campaigns like #1Lib1Ref. In particular, the Spanish #1Lib1Ref campaign (also known as #1Bib1Ref) is a Wikipedia campaign that invites librarians to participate in the Spanish-language encyclopedia project, specifically by improving articles through added citations.
The organizers of the Spanish #1Lib1Ref campaign use the Wiki Edu Dashboard and consider the "references added" feature especially valuable, as it helps them run the campaign with ethics and care: they can better monitor quality, encourage newcomers, praise champions, and much more.
The Dashboard currently gets statistics on references added by fetching the features data supplied by Wikimedia's article quality machine learning models, and comparing the values of one revision with values for the previous revision. This reference counting method is constrained by the availability of article quality machine learning models, which are not accessible for the majority of Wikipedia language editions, including the Spanish language version.
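The comparison described above can be sketched in a few lines of Ruby. This is a minimal illustration rather than the Dashboard's actual code: the feature key `ref_tags` and the method name `references_added` are assumptions made for the example, and the real features payload may use a different key.

```ruby
# Hypothetical feature key; the real key in the features payload may differ.
REF_COUNT_KEY = 'ref_tags'

# Returns the number of references a revision added relative to its parent.
# A missing parent (for example, a page creation) counts as zero references.
# Returns nil when the revision's own features are unavailable.
def references_added(features, parent_features)
  return nil if features.nil? || !features.key?(REF_COUNT_KEY)

  previous = parent_features.nil? ? 0 : parent_features.fetch(REF_COUNT_KEY, 0)
  features[REF_COUNT_KEY] - previous
end
```

A negative result means the revision removed references. Returning nil for missing features keeps "no data" distinct from "no references added", which matters for wikis where the features are unavailable.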
Main goal of the internship
The main goal of the internship is to develop a performant alternative implementation of counting references added that does not depend on articlequality features data, and works for every language version of Wikipedia. One high priority is to enable reference counting for Spanish Wikipedia, in support of the Spanish #1Lib1Ref campaign.
One promising route would be to co-opt data from another API that works across languages, such as this one: https://misalignment.wmcloud.org/api/v1/quality-revid-features?lang=es&revid=144495297
Project timeline
The internship commences on December 4, 2023, and concludes on March 1, 2024. Therefore, the project timeline aligns with that time frame.
Please note that this timeline is a guide and may be subject to adjustments based on progress and feedback. The descriptions may have different levels of detail, depending on the depth of knowledge about the topic during the planning phase. Some details are intentionally left abstract, to be refined during the code implementation phase.
Week 1: Dec. 4, 2023 - Dec. 11, 2023
Research the current method for calculating added references. Explore the dashboard interface to locate where the "references added" value is displayed. Identify the specific sections of the code responsible for making API calls, handling their responses, and processing the data. Assess how the backend and frontend collaborate to calculate and present the "references added" value, and evaluate the extent of decoupling in the existing code to determine whether changes in the backend alone will suffice. Additionally, review the existing specifications to gain a comprehensive understanding of the required modifications.
Dec. 11, 2023 - Feedback #1
Week 2: Dec. 11, 2023 - Dec. 18, 2023
Having identified the required data for calculating the "references added" feature, investigate the misalignment.wmcloud.org/api/v1/quality-revid-features API to obtain data for the new implementation. This entails reviewing the API documentation and testing its endpoints to assess whether the provided data meets the requirements. It's crucial to confirm that this API is available for all languages. Additionally, the existing and new "references added" values must be compared for specific cases to validate the accuracy of the new results.
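The validation step could look something like the following sketch. It assumes the reference counts from both sources have already been collected into hashes keyed by revision ID; the method name `reference_count_mismatches` is hypothetical. Because the current method depends on the articlequality models, such a comparison is only possible on wikis where both sources are available.

```ruby
# Compares reference counts from the current method with counts from the
# candidate API. Both arguments are hashes of revision ID => reference count.
# Returns the revisions whose counts disagree, with both values for inspection.
# Revisions present in only one of the two hashes are ignored.
def reference_count_mismatches(current_counts, candidate_counts)
  shared_ids = current_counts.keys & candidate_counts.keys
  shared_ids.each_with_object({}) do |rev_id, mismatches|
    current = current_counts[rev_id]
    candidate = candidate_counts[rev_id]
    next if current == candidate

    mismatches[rev_id] = { current: current, candidate: candidate }
  end
end
```

An empty result for a representative sample of revisions would suggest that the two methods agree. Any mismatches can then be inspected by hand to decide which value is correct.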
In case the proposed API doesn't align with the project's requirements, exploration of alternative options may be necessary, potentially leading to an extension of the timeline.
Note: the existing misalignment API was originally designed as a research prototype, so it is intended to be removed eventually. More importantly, it is hosted in a shared space where someone could one day delete it unknowingly. It is therefore not a suitable place for a service that must sustain the Dashboard long-term.
Hosting it on Toolforge would protect against accidental deletion and give Wiki Edu folks easier access. In addition, the logic needed for counting references is relatively simple, while the existing API does many other things that slow it down or could cause errors. Moving to Toolforge is also a good opportunity to simplify the service so that it is easier to maintain, faster, and less likely to fail inexplicably.
Week 3: Dec. 18, 2023 - Dec. 25, 2023
Flexible time for adjustments.
Week 4: Dec. 25, 2023 - Jan. 1, 2024
Evaluate the need for modifications to models to store data from the new API. Handle database migration and spec updates if necessary.
Weeks 5-6: Jan. 1, 2024 - Jan. 15, 2024
Develop a streamlined method for making requests to the new API. Add specifications for the new behavior.
Note: LiftWingApi can serve as a reference for this purpose.
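As a rough sketch of what such a request class might look like, the example below wraps the endpoint mentioned in the main goal. The class name `RevisionFeaturesApi`, its method names, and the injectable `http_get` parameter are all hypothetical. The sketch deliberately returns the parsed JSON as-is, since the exact shape of the response still needs to be confirmed during the research phase.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical client for the quality-revid-features endpoint.
class RevisionFeaturesApi
  BASE_URL = 'https://misalignment.wmcloud.org/api/v1/quality-revid-features'

  # http_get is injectable so that specs can stub the network call.
  def initialize(lang, http_get: ->(uri) { Net::HTTP.get(uri) })
    @lang = lang
    @http_get = http_get
  end

  # Builds the request URI for a single revision.
  def url_for(rev_id)
    uri = URI(BASE_URL)
    uri.query = URI.encode_www_form(lang: @lang, revid: rev_id)
    uri
  end

  # Returns the parsed JSON response, or nil if the request or parsing fails.
  def features_for(rev_id)
    JSON.parse(@http_get.call(url_for(rev_id)))
  rescue JSON::ParserError, SocketError, Timeout::Error, SystemCallError
    nil
  end
end
```

Injecting the HTTP call keeps the specs independent of the network. A real implementation would probably also need logging and retries for transient failures, following whatever conventions the existing API classes use.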
Jan. 15, 2024 - Feedback #2
Weeks 7-8: Jan. 15, 2024 - Jan. 29, 2024
Implement a way to use the newly created class to import and store data in the database. Add specifications for this new piece of code.
Note: RevisionScoreImporter can serve as a reference for this purpose.
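The import step might be shaped roughly as follows. Everything in this sketch is hypothetical: `RevisionRecord` is a plain Struct standing in for the Dashboard's database-backed revision records, and the client is assumed to expose a `reference_count(rev_id)` method that returns an Integer, or nil when no data is available.

```ruby
# Hypothetical stand-in for the Dashboard's revision records.
RevisionRecord = Struct.new(:rev_id, :parent_id, :references_added,
                            keyword_init: true)

# Hypothetical importer: asks a client for reference counts and stores the
# per-revision difference on each record.
class ReferenceCountImporter
  # client must respond to reference_count(rev_id) with an Integer or nil.
  def initialize(client)
    @client = client
  end

  def import(revisions)
    revisions.each do |revision|
      count = @client.reference_count(revision.rev_id)
      next if count.nil? # leave the record untouched when data is missing

      parent_count = parent_count_for(revision)
      next if parent_count.nil?

      revision.references_added = count - parent_count
    end
  end

  private

  # Page creations have no parent, so they start from zero references.
  def parent_count_for(revision)
    return 0 if revision.parent_id.nil? || revision.parent_id.zero?

    @client.reference_count(revision.parent_id)
  end
end
```

In the real application the assignment would be followed by a database write, and the requests would likely be batched. The sketch only shows how missing data can be skipped without overwriting existing values.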
Jan. 31, 2024 - Feedback #3
Weeks 9-10: Jan. 29, 2024 - Feb. 12, 2024
Integrate all changes. Conduct manual end-to-end tests and, ideally, implement automated ones.
Perform code cleanup if necessary.
Weeks 11-12: Feb. 12, 2024 - Feb. 26, 2024
Develop a deployment plan for production.
Address final details, such as the "references added" description displayed in the dashboard.
Feb. 26, 2024 - Feedback #4
Week 13: Feb. 26, 2024 - Mar. 1, 2024
Conduct a final performance review with my mentor.
Perform a self-review and draw conclusions.