Name: Hari Krishna
IRC nickname on Freenode: hkrishna (#wikimedia channel)
Location : United Kingdom
Typical working hours: 9am-5pm, GMT+1 (BST)
Based on task from T274636 (Database backup inventory improvements)
The project aims to build a simple web dashboard/webapp that will display and monitor data and metadata from the various database backups that are produced every day in WMF production environments. In addition, the web dashboard will report on the status of these backups and any errors, and will also show the status of past/ongoing/future backups. This will give WMF database administrators a good overview of the database backup processes and whether they are working properly (or not!).
Subject to future discussion, this metadata will be exposed through APIs and can then be used to build a webapp that displays the data. As the data is exposed through APIs, other applications can also be built upon it.
In order to enable easier maintainability and collaboration among open source developers, we will try and maximize code coverage for testing in this project.
Having worked on the microtasks with other volunteers, I have come to understand that without unit/integration tests, it is difficult to know if existing features are affected by a new patchset, and that can make collaboration between different open source volunteers and developers difficult. As a goal, we will write unit/integration tests for any code that is re-used from the existing database backups repository (in addition to tests for any new code) and as a stretch goal, we will try and maximise code coverage as I feel this would improve developer/volunteer experience.
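As a rough illustration of the kind of unit test meant here, the sketch below tests a small helper with Python's built-in unittest module. The helper name `count_by_status` and the `VALID_STATUSES` tuple are hypothetical and only exist for this example; the status names are the ones listed in the final milestone further down.

```python
import unittest
from collections import Counter

# Hypothetical list of job statuses, taken from the wording of this proposal.
VALID_STATUSES = ("ongoing", "failed", "finished", "scheduled")


def count_by_status(statuses):
    """Count backup jobs per status; every known status appears in the result."""
    unknown = sorted(set(statuses) - set(VALID_STATUSES))
    if unknown:
        raise ValueError(f"unknown backup status: {unknown}")
    counts = Counter(statuses)
    return {status: counts.get(status, 0) for status in VALID_STATUSES}


class CountByStatusTest(unittest.TestCase):
    def test_counts_every_status(self):
        result = count_by_status(["finished", "failed", "finished"])
        expected = {"ongoing": 0, "failed": 1, "finished": 2, "scheduled": 0}
        self.assertEqual(result, expected)

    def test_rejects_unknown_status(self):
        # A typo in a status name should fail loudly instead of being ignored.
        with self.assertRaises(ValueError):
            count_by_status(["finished", "exploded"])
```

Saved as a test module, this could be run with `python -m unittest`, so a new patchset that changes the helper's behaviour would be flagged before review.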
- Technology stack proposed for this project:
Python3/Flask -- Specific python version to be decided later on (depending on WMF production environment and whether we need to support Debian 9 and 10)
MariaDB v10.x for databases
These frameworks were chosen as the existing codebase is built using these technologies and the development team are also familiar with these technologies.
API First approach -- database inventory and backup metrics could be exposed through API end points and other applications could be built upon using data from these endpoints. (example -- external cURL script or data sources for Grafana dashboards used in WMF, or a web dashboard like what we are making currently)
Simple Bootstrap 5 with VanillaJS or jQuery which will consume the data from the APIs above and show the data on the page. These frameworks/languages were chosen as I believe these are very straightforward and easy to maintain for any developer who may not be working on front-end code on a daily basis as only basic HTML/CSS/JS skills are needed to add/modify features in code.
In addition, new contributors can also contribute to our repository easily (and this increases open source volunteer engagement).
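To make the API-first idea more concrete, here is a minimal sketch of the JSON payload that a "list backups" endpoint might return. It uses only the Python standard library so it runs anywhere; in the real project this function would sit behind a Flask route. The `Backup` class, its field names and the `backups_to_json` function are assumptions made for illustration and are not part of the existing codebase.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass
class Backup:
    # Hypothetical record; fields follow the metadata named in this proposal.
    hostname: str
    backup_type: str   # "dump" or "snapshot"
    size_bytes: int
    taken_at: datetime
    status: str        # "ongoing", "failed", "finished" or "scheduled"


def backups_to_json(backups):
    """Serialise backup records into the JSON body an endpoint could return."""
    items = []
    for backup in backups:
        item = asdict(backup)
        # datetime objects are not JSON serialisable, so emit ISO 8601 text.
        item["taken_at"] = backup.taken_at.isoformat()
        items.append(item)
    return json.dumps({"count": len(items), "backups": items})


example = Backup("db-host-a", "dump", 1024, datetime(2021, 5, 17, 3, 0), "finished")
print(backups_to_json([example]))
```

The same JSON body could then be consumed by the dashboard, a cURL script or a Grafana data source without any of them needing to know how the data is stored.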
- Possible Mentor(s): Jaime Crespo @jcrespo and Manuel Aróstegui @Marostegui
- Have you contacted your mentors already? : Yes
- Describe the timeline of your work with deadlines and milestones, broken down week by week. Make sure to include time you are planning to allocate for investigation, coding, testing and documentation
For development, we will follow agile methodology and try and deliver features agreed upon weekly or biweekly, with sprint reviews with mentors every week/biweekly.
I've broken down each week into tasks and deliverables.
For the mid-evaluation milestone in the week starting July 12 (alpha), I will aim to deliver the following:
- A working front-end interface showing some of the data/metadata for backup files -- as a user, you should be able to see the list of backup files from different DB hosts and their associated metadata such as hostname, size, backup type(dump, snapshot) and date taken.
- A working API backend with a limited set of endpoints (HTTP GET) for the data needed to support the above.
- Integration of new code/repository on Github/Gerrit with existing WMF Jenkins CI (eg. tox-docker?) -- as a developer, you should be able to check if the integration tests pass successfully.
- Good tests for any new code we have written so far. (the focus is on writing good tests rather than on coverage)
For the final evaluation milestone in August, I will aim to deliver a completed database backups and inventory monitoring webapp/solution which will help WMF DB Admins to monitor the backup processes. By final evaluation, the following should be done:
- A completed front-end interface showing all of the data/metadata for backup files and objects -- as a user, you should be able to see a list of all the existing backups from various DB hosts, their metadata (eg. size, whether it is a dump or a snapshot), and the statuses of backup jobs (ongoing, failed, finished, scheduled), with a detailed view of when the backup process was last run and on which hosts.
- You can monitor the backups processes within the WMF DB instances, and whether there were errors (such as backup failures, etc)
- A completed API backend with the necessary endpoints for obtaining the backup data above, which can be parameterised (eg. via HTTP POST)
- Good test coverage (try and aim for 60-80%) for any code used in the project
- Good documentation using sphinx
- Any stretch goals that have been agreed beforehand.
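As an illustration of what a parameterised endpoint could do behind the scenes, the sketch below filters backup records with a parameterised SQL query. It uses sqlite3 from the standard library purely as a stand-in for MariaDB so that it is runnable anywhere; the `backups` table, its columns, the host names and the `find_backups` function are all hypothetical, and the placeholder style (`?` here) depends on the database driver chosen.

```python
import sqlite3

# Only these columns may be used as filters, so request parameters can never
# inject arbitrary SQL through a column name.
ALLOWED_FILTERS = ("hostname", "backup_type", "status")


def find_backups(conn, **filters):
    """Return backup rows matching the given column=value filters."""
    clauses, params = [], []
    for column, value in filters.items():
        if column not in ALLOWED_FILTERS:
            raise ValueError(f"unsupported filter: {column}")
        clauses.append(f"{column} = ?")  # the value is bound, not interpolated
        params.append(value)
    sql = "SELECT hostname, backup_type, status, size_bytes FROM backups"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY hostname, backup_type"
    return conn.execute(sql, params).fetchall()


def make_demo_db():
    """Build an in-memory database with a few made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE backups "
        "(hostname TEXT, backup_type TEXT, status TEXT, size_bytes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO backups VALUES (?, ?, ?, ?)",
        [
            ("db-host-a", "dump", "finished", 1000),
            ("db-host-a", "snapshot", "failed", 0),
            ("db-host-b", "dump", "ongoing", 500),
        ],
    )
    return conn


conn = make_demo_db()
print(find_backups(conn, status="failed"))
```

An endpoint would map its request parameters onto the keyword arguments of `find_backups`, so a request asking only for failed backups would translate to `find_backups(conn, status="failed")`.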
The plan below assumes that we will have a programming sprint of 2 weeks, where at the end of every sprint, a sprint review/retrospective will be done with the mentors.
Set-up and Introduction Phase (week 1 - 17th-23rd May) (15 hours)
- Get to meet and know mentors and their work, understand their ways of working and availabilities.
- Understand the backend infrastructure of WMF
- Get to understand the project
- Onboarding into WMF development resources/environments (eg. WMF Cloud, staging)
- Understand bigger picture of the project and how it fits into existing infrastructure
Investigation Phase (week 2 - 24th May - 30th May) (15 hours)
- Understanding problems that need to be solved by gathering requirements from mentors through user stories
- Understanding any constraints for our project and ensuring we have the right resources
- Create rough mock ups for UI based on requirements gathered
- Create overview of system design (mock up)
- Present and gather feedback
Refinement phase (week 3 - 31st May - 6th June) (15 hours)
- Based on feedback received, refine and agree upon stories and features for first evaluation (Alpha)
- Evaluate whether we are using any existing code from the codebase such as wmfmariadb, and evaluate whether we need to include test coverage for dependencies
- Agree upon product delivery goals and testing goals for Alpha (mid evaluation milestone)
- Convert goals into stories, following agile methodology and delivering stories weekly/biweekly and performing sprint review/retrospective (subject to mentor availability - may change)
Alpha development - Sprint 1 (Week 4 - 7th June to 13th June) (15 hours)
- Set up framework for projects, create repositories in Gerrit/GitHub, any Jenkins jobs, etc.
- Develop basic skeleton for back-end Flask/Django, basic front-end skeleton for the dashboard
- Understand how to work with WMF development/staging environments and how to integrate our project with WMF infrastructure.
Alpha development - Sprint 1 (Week 5- 14th June to 20th June) (15 hours)
- Create front-end design (Bootstrap/JS)
- Integrate program with existing codebase for triggering/obtaining backup data/metadata
- Add testing coverage for features developed in the previous week, finish off any left over work.
- Sprint review / Code review, retrospective/feedback session from mentors
- Agree upon goals for next sprint
Alpha development - Sprint 2 (Week 6+7 - 21st June to 4th July) (32 hours)
- Create code for database CRUD operations
- Create APIs for exposing some of the data through back-end
- Integrate front-end (Bootstrap/JS) code with back-end APIs (Python) code
- Create front-end code (Bootstrap/JS) to work with the APIs
- Add functionality to perform Jenkins CI builds for new repository on WMF using existing tox-docker job.
- Sprint review and feedback
Alpha (First Milestone) ready for evaluation - Test, documentation and cleanup - Sprint 3 (Week 8+9 - 5th July to 18th July) (20 hours)
- Begin documentation process using sphinx
- Ensure good testing, clean up code and make it ready for first submission
- Any other last minute fixes
- Obtain feedback from mentors post evaluation
- Agree upon requirements/stories, acceptance criteria for final product
- Agree upon any change of scope and/or fixes for final submission
- Agree upon any stretch goals that we can do towards the end (if we have time)
Final product development - Sprint 4 (Week 10+11) (20th July to 2nd August) (20 hours)
- Finish backlog from pre-evaluation
- Ensure good test coverage for any existing codebase by writing unit/integration tests (or any reused codebase from wmfmariadb)
- Complete API design to expose all of the data
- Complete work on front-end to work with the completed API design
Final development, testing, documentation - Sprint 5 (week 12+13) (3rd August to 17th August) (32 hours)
- Backlog from previous sprint, if any
- Complete Sphinx documentation
- Ensure we have fully completed all stories and requirements to meet acceptance criteria
- Perform code reviews with mentors, review and feedback
- Acceptance testing of requirements with mentors
Final week of programme, preparing final milestone for evaluation - Sprint 6 (week 14+15) (18th August to 30th August) (20 hours)
- Fix any bugs relating to issues observed in pre-production/staging environment
- Prepare program for final submission, ensure .git readme is up to date and documentation is also up to date
- Time reserved for any stretch goals
I am open to additional suggestions for the below, and I am very flexible with my schedule.
Participation during project
- Mentorship from Manuel and Jaime - I will arrange a weekly standup video-call / screen-share with Google Hangouts (1hr or more -- subject to mentor availability) with mentors to discuss progress and tasks and to obtain continuous feedback and guidance. This could be in the form of a sprint retrospective, and could be weekly/bi-weekly.
- Direct communication with mentors through IRC/Zulip for any questions that spring up in the heat of the moment during development (I don't expect any response outside your working hours)
- Weekly reports as recommended by Wikimedia in the form of blog posts or shared Trello/Kanban board to document progress of project
- Project code to be published open-source in either Gerrit and/or Github, subject to discussion with mentors during project.
Participation after end of programme
- Completion of any project backlog (if for some reason we still have some minor backlog remaining)
- Completion of any stretch goals that were agreed beforehand and couldn't be completed due to time constraints. (hopefully this doesn't happen)
- Performing any final additions to the project so that it is ready for deployment into production (could be a stretch goal but this depends)
- Contributing to open source by being a maintainer of the project in the longer term
Tell us about a few:
- Your education (completed or in progress)
Currently, I am a 4th year student doing an undergraduate degree in Computer Science in the UK. Over the course of my degree, I have done trainee programmes/internships for a large employer, with 7+ months of industrial experience as an intern.
- How did you hear about this program?
Through another person who participated in the program many years ago -- I wasn't sure about my skills back then hence I am applying now :)
- Will you have any other time commitments, such as school work, another job, planned vacation, etc, during the duration of the program?
I have University and life commitments on working days, but otherwise I am free most of the time, including weekends and evenings, and I will make time on working days as follows (GMT+1 / BST):
Monday - Friday : 9am-1pm, sometimes 5pm-8pm
Saturday : 9am - 5pm
Sunday : 9am - 5pm (if needed)
Time commitment: 12-16 hours per week
I will try to be contactable during sociable/fixed hours. I aim to spend 12-16 hours a week working on the project and the schedules above are extremely flexible -- this will help me balance between life, university work and GSoC work.
- We advise all candidates eligible for Google Summer of Code and Outreachy to apply for both programs. Are you planning to apply to both programs and, if so, with what organization(s)?
This is the first time I am hearing about Outreachy. I am not sure if I am eligible, but I can try next time as I think the deadline has passed.
- What does making this project happen mean to you?
As a student I've only had the chance to contribute using my skills in various code bases that are often closed-source and not necessarily visible or used by the public.
I always wanted to learn how to contribute to the open source software community using skills I have learned from university and in industry. I've learned to do this through the microtasks (my first ever open source contribution!) and I've been able to learn good coding standards in Python through code reviews and this has greatly helped me. I've had the opportunity to interact and help fellow volunteers as well. I would like to give back and learn more from other developers/volunteers and with this project, I believe I will be able to do just that with my skillset and also get a good mentorship experience by helping build a good DB monitoring application which will benefit WMF DB Admins in the long run. In addition I chose Wikimedia Foundation as I am fond of the work that they do to ensure distribution of free knowledge to the world and educating everyone -- I've always relied on Wikipedia as a kid for knowledge and I am happy to contribute back!
- Describe any relevant projects that you've worked on previously and what knowledge you gained from working on them.
Over the course of my degree, I have done trainee programmes/internships for a large employer, with 7+ months of industrial experience (as of writing), where I had to create and write applications to monitor critical infrastructure using test driven development. I learned that this was very useful, as it makes building upon or modifying code easy and straightforward, especially when working with different developers. I've also learned the importance of site reliability and its metrics.
I have also done projects at University where I acted as an external consultant developing solutions for an external real-world client, using Python, Tornado and MySQL for the back end and simple Bootstrap/JS/jQuery for the front end. This involved creating a webapp with both front-end and back-end components, which would then expose datapoints for Grafana dashboards. That being said, I still need to look up online if I need to center a div :)
Over the course of university, I've had good exposure to web technologies such as Node.js and creating REST APIs with them, and I have spent about 6+ additional months on these technologies over the past years in various projects at University.
I was able to make my first ever open source contribution and here are my contributions (pending review)
- Describe any open source projects you have contributed to as a user and contributor (include links).
In spite of all the closed-source experience, I have not had a chance to contribute to open source until I discovered the GSoC program through a friend who recommended it and I found Wikimedia projects interesting -- the database project interested me and I had a look at the good first tasks.
I'm happy that I was able to make my first ever open source contribution and here are my contributions T277160 T277162
Frankly, this helped me understand the project and also gave me an avenue to improve my coding and communication skills with other volunteers and professionals.
Here are some of the tickets I have worked on, where I was able to contribute good unit tests: T277160, T277162. Working on these tickets also helped me uncover a small bug in the code, for which I raised a ticket in T277754.
I've also had the chance to work with another developer on a ticket I raised, where I was able to help them out: T277754
- You must have written a feature or bugfix for a Wikimedia project during the application phase (see the section about microtasks in the application process steps), please link to it here. We give strong preference to candidates who have done so.
T277160 (not merged yet)
T277162 (not merged yet)
T277754 (raised a ticket and provided support through code reviews)