This ticket is to document and cover the design of various API endpoints that could be present in the database, along with the SQL query
Basically our program will execute the SQL query with databases v0.4.3 (async databases, uses pymysql in the background), and this data will be passed onto a Pydantic model for verification (we don't really need to do this unless we are inputting the data into a table from our program)
As the table is already created, I tried using SQLAlchemy reflection to grab an ORM of the existing table.
Endpoint 1: To get all the recent backups metadata for each unique section
API endpoint: api/v1/get_recent_backups_data
Proposed SQL query on backups table to get this data:
SELECT id, name, status, source, host, type, section, start_date, end_date, total_size FROM backups b WHERE b.start_date = (SELECT MAX(start_date) FROM backups b2 WHERE b2.section = b.section)
(I'm not that great with SQL query but if you have a better query to do this, please post it :-D )
Expected JSON output example (same for the others as well):
[ { "id": 860, "name": "dump.s2-test.2019-03-06--10-03-06", "status": "finished", "source": "localhost", "host": "es2002.codfw.wmnet", "type": "dump", "section": "s2-test", "start_date": "2019-03-06T10:03:06", "end_date": "2019-03-06T10:03:08", "total_size": 142031 }, { "id": 869, "name": "snapshot.s1-test.2019-03-06--10-54-56", "status": "finished", "source": "localhost", "host": "es2002.codfw.wmnet", "type": "snapshot", "section": "s1-test", "start_date": "2019-03-06T10:54:56", "end_date": "2019-03-06T10:55:10", "total_size": 116361965 }, { "id": 875, "name": "dump.section6.2019-03-12--08-30-18", "status": "failed", "source": "dbstore2001.codfw.wmnet:3316", "host": "es2001.codfw.wmnet", "type": "dump", "section": "section6", "start_date": "2019-03-12T08:30:18", "end_date": "2019-03-12T08:30:18", "total_size": null }, ]
Endpoint 2: To get stale backups for each section
API endpoint: api/v1/get_stale_backups
Show me backups whose last successful backups were more than X hours ago (X is the freshness criteria
Proposed SQL query
TODO -- we may do the logic in the actual python program rather than putting it in an SQL query
Endpoint 3: To get recent backups with large percentage change.
API endpoint: api/v1/get_inconsistent_backups
Show me backups whose last successful backups sizes had a percentage change of X percentage
Proposed SQL query
TODO - we may do the logic in the actual python program rather than putting it in an SQL query