[GSOC 2024 Proposal] Lingua Libre v3.0 enhancement and migration : Zhjiang

Description

Profile Information

Name: Zhen (Janet) Jiang
GitHub: https://github.com/zhjiang1103
Resume: https://zhenjiang1103.tiiny.site/
Location: California, United States (PDT)
Typical working hours: 11 AM - 1 AM (UTC−07:00)

Synopsis

Lingua Libre v2.0 has been instrumental in recording vocabularies in over 240 languages, contributing 1.2 million words to Wikimedia sites. However, its current backend, built on Wikibase and Blazegraph, has limitations: slow queries, no API, and data duplication. The primary goal of this project is to enhance Lingua Libre v3.0 by upgrading its backend infrastructure, improving query speed, implementing a robust API, and enhancing the user interface.

This project will benefit Wikimedia projects in several ways:

  1. Increased Data Quality: By improving the query speed and backend infrastructure, Lingua Libre v3.0 will enable contributors to record and upload vocabularies more efficiently. This will lead to a larger and more accurate dataset, enhancing the overall quality of language data available on Wikimedia sites.
  2. Improved Accessibility: The implementation of a robust API will make it easier for developers to access and integrate Lingua Libre v3.0 data into other Wikimedia projects. This will enhance the accessibility of language data and promote its use in various applications and services.
  3. Scalability and Maintainability: The migration to Django and the upgrade of the backend infrastructure will make Lingua Libre v3.0 more scalable and easier to maintain. This will ensure that the platform can continue to support a growing number of contributors and language data over time.
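
As a rough illustration of the data duplication and query speed points above, the sketch below uses Python's built-in sqlite3 module to model recordings in a relational store. Every table and column name here is hypothetical and chosen for illustration only; none of it is taken from the Lingua Libre codebase or from the planned Django models. The idea is that each language and speaker is stored once and referenced by ID, a uniqueness constraint rejects duplicate recordings, and an index supports per-language lookups.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions,
# not the actual Lingua Libre data model.
SCHEMA = """
CREATE TABLE language (
    id INTEGER PRIMARY KEY,
    iso_code TEXT NOT NULL UNIQUE
);
CREATE TABLE speaker (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE recording (
    id INTEGER PRIMARY KEY,
    word TEXT NOT NULL,
    language_id INTEGER NOT NULL REFERENCES language(id),
    speaker_id INTEGER NOT NULL REFERENCES speaker(id),
    UNIQUE (word, language_id, speaker_id)
);
CREATE INDEX recording_by_language ON recording(language_id);
"""


def build_demo_db():
    """Create an in-memory database with one language and one speaker."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO language (id, iso_code) VALUES (1, 'fra')")
    conn.execute("INSERT INTO speaker (id, username) VALUES (1, 'demo_user')")
    conn.commit()
    return conn


def add_recording(conn, word, language_id, speaker_id):
    """Insert a recording; return False if the same one already exists."""
    try:
        conn.execute(
            "INSERT INTO recording (word, language_id, speaker_id) "
            "VALUES (?, ?, ?)",
            (word, language_id, speaker_id),
        )
    except sqlite3.IntegrityError:
        return False
    conn.commit()
    return True


def words_for_language(conn, iso_code):
    """Return the recorded words for a language, sorted alphabetically."""
    rows = conn.execute(
        "SELECT r.word FROM recording AS r "
        "JOIN language AS l ON l.id = r.language_id "
        "WHERE l.iso_code = ? ORDER BY r.word",
        (iso_code,),
    )
    return [word for (word,) in rows]
```

Calling `add_recording` twice with the same word, language, and speaker returns `True` the first time and `False` the second, so the duplicate never reaches the table. A Django implementation would express the same constraints through model fields rather than raw SQL.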

Mentors: @Poslovitch @Yug

Deliverables

Objectives by period:

May 1 - 26 (Community Bonding)
  1. Build relationships with community mentors and contributors.
  2. Review Lingua Libre documents, issues, and MRs.
  3. Engage in discussions with mentors to gain insights and understand user needs.
  4. Familiarize myself with the Django and Vue.js frameworks.
  5. Research different migration practices to improve backend performance and compare their pros and cons.

May 27 - June 9 (Week 1 - 2, Planning Phase)
  1. Investigate the existing Python Django / Vue.js revamp to understand its design and implementation rationale.
  2. Collaborate with mentors to refine the drafted plan and structure for the backend.
  3. Define and design all necessary API endpoints within the Django framework.
  4. Design frontend components and structure based on the existing UI, ensuring consistency and user familiarity.

June 10 - June 23 (Week 3 - 4, Coding Phase I)
  1. Configure the development environment to ensure a seamless workflow.
  2. Create backend-related tasks in Phabricator, detailing each component's purpose and functionality.
  3. Develop and implement backend features and API endpoints in alignment with project requirements.
  4. Write comprehensive unit tests to validate the functionality and integrity of implemented features.
  5. Revise and update documentation to reflect the latest changes and additions in the backend architecture.

June 24 - July 7 (Week 5 - 6, Coding Phase II)
  1. Create frontend-related tasks in Phabricator, detailing each component's purpose and functionality.
  2. Develop and implement frontend components according to project requirements and design specifications.
  3. Write and run comprehensive unit tests for frontend components and rendering to ensure functionality and reliability.

July 8 - July 21 (Week 7 - 8, Coding Phase III / Midterm Evaluation)
  1. Integrate the frontend, server, and database components to ensure seamless communication and functionality.
  2. Enhance the user interface by implementing styling that aligns with the existing UI, making necessary adjustments for improved usability.
  3. Refactor and optimize code for improved performance and maintainability.
  4. Prepare for the midterm evaluation by documenting progress, achievements, and any challenges faced during the project.

July 22 - August 4 (Week 9 - 10, Coding Phase IV)
  1. Develop comprehensive integration tests to ensure the robustness and reliability of the system.
  2. Collaborate closely with mentors to incorporate their feedback, enhancing the code's readability and maintainability.
  3. Evaluate and select an appropriate hosting solution for deploying the application to production, considering factors such as scalability, reliability, and cost-effectiveness.

August 5 - August 18 (Week 11 - 12, Deployment and Documentation Phase)
  1. Address any remaining bugs or issues that arise during the migration process.
  2. Deploy the migrated Lingua Libre platform to the production environment for public access.
  3. Complete the project documentation, covering the architecture, models, REST API implementation, a developer guide, and a user guide for future maintenance and development.

August 19 - August 26 (Final Evaluation Period)
  1. Finalize documentation and code, seeking approval from the mentors.
  2. Prepare for the final evaluation by documenting progress, achievements, and any challenges faced during the project.

August 26 onwards (Post-GSOC)
  1. Conduct thorough testing to identify and fix any remaining bugs or issues in the system.
  2. Engage with users to gather feedback on the system's usability and performance, and make any necessary improvements based on their input.
  3. Identify potential areas for future enhancements and developments, based on user feedback and evolving requirements.
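
The planning phase calls for designing API endpoints, and the first coding phase calls for unit tests that validate them. The sketch below shows one shape a paginated list response could take, written in plain Python so it runs without Django. The `/api/recordings/` path, the `page` and `page_size` parameters, and the response keys are assumptions made for illustration; the actual endpoints are to be defined with the mentors during the planning phase.

```python
from typing import Any, Dict, List, Optional

# Hypothetical endpoint path, used only to build example links.
BASE_PATH = "/api/recordings/"


def page_url(page: int, page_size: int) -> str:
    """Build the link to a given page of results."""
    return f"{BASE_PATH}?page={page}&page_size={page_size}"


def paginate(
    items: List[Dict[str, Any]], page: int = 1, page_size: int = 2
) -> Dict[str, Any]:
    """Return one page of items plus links to the neighbouring pages."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    results = items[start:start + page_size]
    has_next = start + page_size < len(items)
    next_link: Optional[str] = page_url(page + 1, page_size) if has_next else None
    prev_link: Optional[str] = page_url(page - 1, page_size) if page > 1 else None
    return {
        "count": len(items),   # total number of items across all pages
        "next": next_link,     # None on the last page
        "previous": prev_link, # None on the first page
        "results": results,
    }
```

Keeping the pagination logic in a pure function like this makes it straightforward to unit test independently of the web framework, for example by asserting that the last page reports `None` for `next`.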

Participation

I will use Phabricator to monitor tasks and track progress, and Discord and email to communicate with mentors. I plan to be active on Discord daily to stay informed and seek assistance when needed. Additionally, I will use the Lingua Libre GitLab repository to share code, create merge requests (MRs), and manage issues.

About Me

  • Education: Currently, I am pursuing a Master’s degree in Computer Science at the Georgia Institute of Technology, having previously completed my Bachelor’s degree in Mathematics at UCLA.
  • How did you hear about this program? I learned about this program through the Women Who Code community.
  • Will you have any other time commitments, such as school work, another job, planned vacation, etc, during the duration of the program? I have no other commitments and intend to dedicate full-time hours to the program throughout the three-month summer period.
  • We advise all candidates eligible for Google Summer of Code and Outreachy to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)? I have only applied to GSOC.
  • What does making this project happen mean to you? As I transition from being an educator to a developer, this project represents a crucial opportunity for me to apply my coding skills in a meaningful way. By contributing to a platform that provides free access to knowledge, I aim to make a tangible impact. Furthermore, with my multicultural background and proficiency in three languages, I am deeply passionate about supporting a project that records the world's languages, thus promoting and preserving language diversity.

Past Experience

I've developed full-stack web applications using a variety of technologies including JavaScript, React, Angular, Node.js, Express, HTML, and CSS. I also have experience creating interactive user interfaces and using OpenAI to provide personalized recommendations. Additionally, I have designed and implemented relational databases in PostgreSQL to support these applications.

Please add links to any feature or bug fix you have written for a Wikimedia project during the application phase.
https://github.com/WikiEducationFoundation/WikiEduDashboard/pull/5683
https://gitlab.wikimedia.org/repos/wikimedia-france/lingua-libre/lingua-libre/-/merge_requests/10

Describe any relevant projects that you've worked on previously and what knowledge you gained from working on them.
CineNova: During my software developer apprenticeship, I created a movie recommendation app with a 3-tier architecture, handling all aspects from conception to implementation. I used React Hooks to manage API requests for CRUD operations, reducing data processing times by a factor of two. I employed JavaScript control flow constructs to manage diverse scenarios and implemented automated unit and integration tests using Jest and React Testing Library. This experience enhanced my technical skills in React and Node.js, expanded my knowledge of designing and implementing RESTful APIs, deepened my understanding of full-stack web application structures, and taught me techniques for code optimization.

Describe any open source projects you have contributed to as a user and contributor (include links).
Mentor at BridgingTech
Contributor to WikiEduDashboard
User of Gitwit

Event Timeline

Yug renamed this task from GSOC(2024) Proposal for Lingua Libre v3.0 enhancement and migration to [GSOC 2024 Proposal] Lingua Libre v3.0 enhancement and migration : Zhjiang. Mar 31 2024, 11:51 PM