
Create/Refine abstract system design and front-end mockups for Bernard/WMFDBBackupsDashboard
Open, Needs Triage, Public


For T279552, GSoC 2021

We need to create an abstract mockup of the system design for Bernard/WMFDBBackupsDashboard, and also a mockup of what a potential front-end would look like.


  • @h.krishna will make a system-design mockup for WMFDBBackupsDashboard and present it to @jcrespo and @Marostegui to gather feedback - DONE 27/05
  • @h.krishna will create a front-end design mockup for WMFDBBackupsDashboard and present it to gather feedback - DONE 27/05
  • @h.krishna will make changes according to feedback, upload the designs to this ticket and present them again -- DONE 03/06

Event Timeline

First iteration of system design (as discussed on Thursday 27th May -- highly abstract!)

image.png (882×1 px, 135 KB)

I've created a dbbackups directory in your home dir at the wmcs instance so you can check the production data:

  • 3 sql files, one per table, including the table definition (which should match that at
  • backups.sql also contains the full backup history, with the status of each backup at the moment the copy was generated
  • the backup_files.sql contains only the list of files from an enwiki dump (backup_id = 11768) and an enwiki snapshot (backup_id = 11734). Normally you would encounter the file list for all successful backups.
  • the backup_objects.sql has the definition, but no rows (like production)

If you need help loading that into a database, please ask for it.

Thanks @jcrespo -- I have managed to successfully load it into local MariaDB with phpmyadmin -- very useful to see the pattern of the data

First iteration of front-end mockup and presentation slides for the associated meeting

image.png (918×1 px, 70 KB)

image.png (409×1 px, 31 KB)

Presentation for the 2nd meeting. Some key points:

  • We care about each section of the backups (e.g. S1-S8, X1, etc.) rather than the server it's hosted on. My initial assumption was that backups are organised server by server rather than by "sections". I learned that Wikipedia stores its data in various datacenters around the world -- this data is stored in SQL format (MariaDB) and is replicated across the datacenters. There are different types of data from the different languages of Wikipedia, and these are stored as "sections". The sections are backed up as snapshots (large size - easy to recover) and logical dumps (small size - easy to store for the longer term). Our dashboard will check whether the backup processes behind these backups are working normally
  • Dashboard should give a user-friendly/simple idea of whether the backup processes are being executed normally -- try to avoid making a replica of icinga.
  • Focus is on creating a simple proof-of-concept, start small and scale it up
  • Create something that's easy to maintain in the long run rather than something that has too many features
  • Backups have certain error criteria (size inconsistency, freshness of backups, etc.)
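
To make the error criteria above more concrete, here is a minimal Python sketch of a freshness check and a size-consistency check for one backup. It is only an illustration: the `Backup` fields, the `check_backup` function and both thresholds (8 days, 5% size change) are assumptions made up for this example, not the real table columns or the values used in production.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional


class Backup(NamedTuple):
    """Hypothetical record for one backup run (not the real table schema)."""
    section: str         # e.g. "s1" or "x1"
    kind: str            # "snapshot" or "dump"
    finished: datetime   # when the backup completed
    size_bytes: int      # total size of the backup


def check_backup(latest: Backup,
                 previous: Optional[Backup],
                 now: datetime,
                 max_age: timedelta = timedelta(days=8),      # assumed threshold
                 max_size_change: float = 0.05) -> List[str]:  # assumed threshold
    """Return a list of problems found; an empty list means the backup looks OK."""
    problems = []

    # Freshness: the latest backup must not be older than max_age.
    if now - latest.finished > max_age:
        problems.append("backup is older than the allowed age")

    # Size consistency: compare against the previous backup, if there is one.
    if previous is not None and previous.size_bytes > 0:
        change = abs(latest.size_bytes - previous.size_bytes) / previous.size_bytes
        if change > max_size_change:
            problems.append("size changed by more than the allowed fraction")

    return problems
```

Called with the latest and previous backup of a section, this returns the reasons (if any) why the section should be flagged, which is all the dashboard needs in order to show "error or not".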

Slide deck below

Second iteration of feedback, and design

image.png (688×1 px, 72 KB)

image.png (900×954 px, 63 KB)

Presentation for the 3rd meeting. Key points:

  • Minor changes to style, old style was also good (better to keep it simpler)
  • For the mid evaluation, we will focus on a simple dashboard rather than one with many features (the simplest feature is to show whether or not there was a backup error)
  • Prefer simplicity and ease of maintenance over an app with many features

Project features/goals and some of the must haves

  • Show statuses of backups by section
  • Should be written in Python 3.6+ and be able to run on Debian 10
  • Should deploy on WMCloud
  • Never overpromise, always underpromise and try to deliver (or over-deliver)
  • Dashboard shouldn't take "forever" to load; aim for a loading time of about 1s
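
As a rough sketch of the first goal (statuses of backups by section), the snippet below reduces a list of per-backup results to one status per section and renders it as plain text. The row format `(section, kind, ok)` and the function names are hypothetical, chosen only for this illustration; it uses nothing beyond the Python 3.6 standard library.

```python
from typing import Dict, Iterable, Tuple


def summarize_by_section(rows: Iterable[Tuple[str, str, bool]]) -> Dict[str, str]:
    """Collapse (section, kind, ok) rows into one status per section.

    A section is "OK" only if every backup listed for it is OK.
    """
    summary = {}  # type: Dict[str, str]
    for section, _kind, ok in rows:
        if not ok:
            summary[section] = "ERROR"        # any failure marks the section
        else:
            summary.setdefault(section, "OK")  # never overwrites an earlier ERROR
    return summary


def render(summary: Dict[str, str]) -> str:
    """Render one line per section, sorted by section name."""
    return "\n".join(f"{section:<4} {status}"
                     for section, status in sorted(summary.items()))


if __name__ == "__main__":
    example_rows = [
        ("s1", "snapshot", True),
        ("s1", "dump", False),
        ("x1", "dump", True),
    ]
    print(render(summarize_by_section(example_rows)))
```

Keeping the summary step separate from rendering means the same function could later feed an HTML template instead of plain text.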

I will leave this ticket open so that we can discuss any system design questions, or any other questions related to the above.

Meeting slide deck