GSoC Proposal: Create a web-based dashboard for monitoring and managing the inventory of Wikimedia databases
Closed, Declined (Public)

Description

Profile Information

Name: Zhirang Guo
IRC nickname on Freenode: imguozr
Web Profile: https://github.com/imguozr/
Location: Richardson, Texas, USA
Typical working hours: 9 am - 6 pm (UTC-05:00)

Synopsis

Parent Task: T246435: Create or improve a tool for monitoring or automating tasks for Wikimedia databases

Short summary describing your project and how it will benefit Wikimedia projects

Wikimedia runs more than 200 MariaDB instances that store content and other metadata for Wikipedia and other free projects. Tendril, the current dashboard used to monitor MariaDB status, has become obsolete in several respects: it is laggy, and it cannot integrate with modern tools such as Grafana or with the backup systems.

This project, Create a web-based dashboard for monitoring and managing the inventory of Wikimedia databases, aims to build a minimum viable product: a new web-based database dashboard that supports monitoring and inventory management and can replace Tendril in the future.

Possible Mentors

@jcrespo, @Marostegui

Have you contacted your mentors already?

Yes. I have contacted @jcrespo through email, Wikimedia’s Phabricator and Gerrit.

Deliverables

After the completion of this internship, the tool will have the following features:

  • It has a reliable storage solution covering replica sets, server groups, server/instance properties, and so on.
  • It can display details of Wikimedia database instances and servers, including name, section, location, server, IP, port, and version, as well as real-time properties such as QPS, latency, and lag.
  • It can detect errors and show error messages in conspicuous colors (such as red) when hosts are down.
  • It provides features to manage the database inventory: creating, editing, and deleting entries.
  • It can link to future features such as Activities, Backups, and Reports.
  • It follows sound security practices, such as restricting access to authorized users.
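
The error-highlighting deliverable can be pictured with a minimal sketch like the one below. Only the "red when a host is down" rule comes from the list above; the `InstanceStatus` class, the yellow warning level, and the `LAG_WARNING_SECONDS` threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: the proposal does not say when lag becomes a warning.
LAG_WARNING_SECONDS = 10.0


@dataclass
class InstanceStatus:
    """Hypothetical snapshot of one database instance's health."""
    name: str
    is_up: bool
    lag_seconds: Optional[float] = None  # None when lag is unknown or not applicable


def status_color(status: InstanceStatus) -> str:
    """Map an instance's health to the color used to display it."""
    if not status.is_up:
        return "red"  # host down: the case named in the deliverables
    if status.lag_seconds is not None and status.lag_seconds > LAG_WARNING_SECONDS:
        return "yellow"  # assumed warning level, not part of the proposal
    return "green"
```

Keeping the mapping in one pure function would let the backend and any frontend tests share the same rule.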

If time permits, the following features will be added:

  • It has user-friendly features such as client-side ordering and filtering, as well as a concise and extensible UI.
  • It provides a list view of the servers by replication chain, showing the related masters and slaves in the detail view.
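
As a rough illustration of the replication-chain list view, the sketch below groups servers under the master they replicate from and flattens one chain into indented lines. The `(name, master)` tuple layout and the function names are hypothetical, and the code assumes the inventory contains no replication cycles.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (instance name, name of the master it replicates from, or None for a chain root)
Server = Tuple[str, Optional[str]]


def build_chains(servers: List[Server]) -> Dict[str, List[str]]:
    """Return a mapping from each master to the sorted list of its direct replicas."""
    replicas: Dict[str, List[str]] = defaultdict(list)
    for name, master in servers:
        if master is not None:
            replicas[master].append(name)
    return {master: sorted(names) for master, names in replicas.items()}


def render_chain(root: str, chains: Dict[str, List[str]], depth: int = 0) -> List[str]:
    """Flatten one replication chain into indented lines, root first."""
    lines = ["  " * depth + root]
    for replica in chains.get(root, []):
        lines.extend(render_chain(replica, chains, depth + 1))
    return lines
```

Calling `render_chain` on a chain root yields one line per server, indented by its distance from the root, which maps directly onto a nested list in the UI.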

UI Mockups

The instance/server detail page would look like this:

[Mockup image: image.png]

Implementation details

First, I need to design the storage solution, based on the existing tool. Inventory data that rarely changes, such as hostnames, IPs, and ports, will be kept in a local database, while Redis or Memcached seems a good choice for real-time data such as QPS and latency. Other tools and procedures will take care of gathering and updating the data. If the current databases are reusable, they will be used in the next steps of development. In addition, the real-time stats will be tested for robustness and reliability on both the backend and the frontend during development.

I plan to use Flask as the backend framework. In the early stage of development, the frontend will be built with Bootstrap because it is easy to use. If time permits, the frontend will be rebuilt with a modern frontend framework such as React.js. Other technologies that better suit the dashboard's extensibility are also under active consideration. I will settle the details with my mentors in the first weeks.
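
The split between slow-changing inventory data and short-lived real-time stats could look roughly like the sketch below. It is only an illustration under stated assumptions: SQLite stands in for the local inventory database, the `MetricsCache` class is an in-memory stand-in for Redis or Memcached, and the table layout and all names are hypothetical.

```python
import sqlite3
import time
from typing import Dict, Optional, Tuple


def create_inventory(conn: sqlite3.Connection) -> None:
    """Create a hypothetical table for the rarely changing inventory data."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS instances (
               name    TEXT PRIMARY KEY,
               host    TEXT NOT NULL,
               ip      TEXT NOT NULL,
               port    INTEGER NOT NULL,
               section TEXT
           )"""
    )


class MetricsCache:
    """In-memory stand-in for Redis/Memcached: values expire after `ttl` seconds."""

    def __init__(self, ttl: float = 30.0) -> None:
        self.ttl = ttl
        # (instance, metric) -> (value, expiry time)
        self._data: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def set(self, instance: str, metric: str, value: float,
            now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._data[(instance, metric)] = (value, now + self.ttl)

    def get(self, instance: str, metric: str,
            now: Optional[float] = None) -> Optional[float]:
        now = time.monotonic() if now is None else now
        entry = self._data.get((instance, metric))
        if entry is None or entry[1] <= now:
            return None  # missing or expired: the dashboard should show "unknown"
        return entry[0]
```

Expiring stale metrics means a collector that stops reporting shows up as missing data rather than as an outdated but healthy-looking value.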

To pass data to the frontend in JSON format, we need web services that expose the following RESTful APIs:

  • GET: fetch the details of an instance/server from the databases.
  • GET: fetch replica pairs from the databases.
  • POST: send requests to the databases to manipulate the database inventory.

All of the above APIs should be fully tested, and they must enforce authorization checks, integrating with the Wikimedia authentication system.
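
The three APIs and the authorization check can be sketched in a framework-agnostic way as below; in Flask, each branch would become its own route function. The URL paths, the sample data, and the token check are all hypothetical placeholders (in particular, `VALID_TOKENS` stands in for the integration with the Wikimedia authentication system), and the POST branch covers only creating or overwriting an entry.

```python
import json
from typing import Dict, Optional, Tuple

# Hypothetical in-memory data; a real service would query the inventory database.
INSTANCES: Dict[str, dict] = {
    "db-example-1": {"section": "s1", "port": 3306, "master": None},
    "db-example-2": {"section": "s1", "port": 3306, "master": "db-example-1"},
}
VALID_TOKENS = {"secret-token"}  # placeholder for the real authentication system

Response = Tuple[int, str]  # (HTTP status code, JSON body)


def handle(method: str, path: str, token: Optional[str],
           body: Optional[str] = None) -> Response:
    """Dispatch one request; every endpoint requires an authorized token."""
    if token not in VALID_TOKENS:
        return 401, json.dumps({"error": "unauthorized"})

    # GET: details of one instance/server
    if method == "GET" and path.startswith("/instances/"):
        name = path[len("/instances/"):]
        if name not in INSTANCES:
            return 404, json.dumps({"error": "not found"})
        return 200, json.dumps({"name": name, **INSTANCES[name]})

    # GET: replica pairs
    if method == "GET" and path == "/replicas":
        pairs = [
            {"master": props["master"], "replica": name}
            for name, props in sorted(INSTANCES.items())
            if props["master"]
        ]
        return 200, json.dumps(pairs)

    # POST: create or overwrite an inventory entry
    if method == "POST" and path == "/instances":
        try:
            data = json.loads(body or "")
        except ValueError:
            return 400, json.dumps({"error": "invalid JSON"})
        if not isinstance(data, dict) or not isinstance(data.get("name"), str):
            return 400, json.dumps({"error": "missing instance name"})
        name = data.pop("name")
        INSTANCES[name] = data
        return 201, json.dumps({"name": name, **data})

    return 404, json.dumps({"error": "not found"})
```

Putting the token check before any routing keeps unauthorized callers from learning which paths or instances exist.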

Timeline with Milestones

May 4 - May 31 (Community Bonding Period)
  • Get familiar with the Wikimedia database infrastructure.
  • Get familiar with Wikimedia development workflow, style guidelines, and related tools.
  • Learn more about Tendril, discuss the requirements further with team members, and refine the mockups.
  • Design the storage solution and determine important technical details, including the tech stack.
June 1 - July 3 (Coding Phase I)
  • Finalize the architecture and algorithm design, and write a design document covering all requirements, in as much detail as possible.
  • Implement an initial prototype with one fully working data-fetching API that is merged or ready to merge, including tests and documentation.
  • Implement a rough front end page for the data-showing part of the dashboard.
July 6 - July 31 (Coding Phase II)
  • Implement, test, and document all the other APIs, and finish the required features listed above.
  • Finish a beta version of the dashboard.
Aug 3 - Aug 24 (Final Coding Phase)
  • Refine the documentation and polish the beta version.
  • Implement additional features if time permits.
  • Finish the final report.

Participation

  • I will follow the current Wikimedia style guidelines and commit my progress to Gerrit every day.
  • I have been and will be in constant touch with my mentors, Jaime Crespo and Manuel Arostegui.
  • I will discuss ideas and share status on Phabricator.
  • I will be available through Phabricator, Gerrit, IRC, and email during working hours.
  • I will write weekly reports regarding project progress.

About Me

Your education

I am a graduate student at the University of Texas at Dallas pursuing my Master of Science degree in Computer Science.

How did you hear about this program?

From a friend who participated before.

Will you have any other time commitments, such as school work, another job, planned vacation, etc, during the duration of the program?

No, I will be fully available during the program, since it falls within my summer break.

We advise all candidates eligible for Google Summer of Code and Outreachy to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?

I will only apply for Google Summer of Code.

What does making this project happen mean to you?

Wikipedia and the other free projects under Wikimedia mean a lot to me. Growing up in China, I experienced two stages of Wikimedia: free access, and then being blocked by the GFW. I was finally able to learn all kinds of knowledge from Wikimedia again after setting up a VPN in high school. What makes Wikimedia what it is today is its freedom and openness, and this spirit keeps motivating me. By contributing to this project, which aims to build a better tool for the development of Wikimedia, I can help Wikimedia with my strengths as a software developer. This is the best way I can express my gratitude.

Past Experience

My background is in Computer Science. I have experience with multiple technologies, including Python, Java, JavaScript, Node.js, Flask, Koa, Vue.js, Redis, MongoDB, and MySQL.

I also like to contribute to open source communities. The following is some of my work:

Contributions to Wikimedia

Event Timeline

I don't really have further feedback on the proposal at the moment, but note we may have more questions for candidates. I also have pending a review of https://gerrit.wikimedia.org/r/c/operations/software/wmfmariadbpy/+/578623, haven't forgotten!

Thank you @jcrespo. I will continue to polish this proposal. If there are any questions, please let me know :D

Guozr.im renamed this task from GSoC Proposal: Create a web-based dashboard for monitoring and inventory management for Wikimedia databases to GSoC Proposal: Create a web-based dashboard for monitoring and managing the inventory of Wikimedia databases. (Mar 19 2020, 8:46 PM)
Guozr.im updated the task description.

Thanks for your contributions, they were very valuable. I have asked a separate student to work on T248661 as I don't want to abuse your help! :-D Feel free to update the (Work in progress) remark on your proposal here about your second ticket and on Google. Also remember to mark it as final on Google before the deadline.

If and only if you want some extra task meanwhile (completely optional), I would just suggest to get familiar, on your own, with our database infrastructure. I don't know if I passed these links to you before, but here are some presentations or documentation with some overviews of our infrastructure, which may be interesting for you for context:

You have already contributed to some of these, and maybe you would like to know more about them!

Thank you @jcrespo for the merge :D I will start to learn about the db infrastructure right now!

Pavithraes added a subscriber: Pavithraes.

@Guozr.im We are sorry to say that we could not allocate a slot for you this time. Please do not consider the rejection to be an assessment of your proposal. We received over 100 quality applications, and we could only accept 14 students. We were not able to give all applicants a slot that would have deserved one, and these were some very tough decisions to make. Please know that you are still a valued member of our community and we by no means want to exclude you. Many students who we did not accept in 2019 have become Wikimedia maintainers, contractors and even GSoC students and mentors this year!

If you would like a de-brief on why your proposal was not accepted, please let me know as a reply to this comment or on the ‘Feeback on Proposals’ topic of the Zulip stream #gsoc20-outreachy20. I will respond to you within a week or so. :)

Your ideas and contributions to our projects are still welcome! As a next step, you could consider finishing up any pending pull requests or inform us that someone has to take them over. Here is the recommended place for you to get started as a newcomer: https://www.mediawiki.org/wiki/New_Developers.

If you would still be eligible for GSoC next year, we look forward to your participation!