
Create a table in labs with replication lag data
Closed, Resolved | Public

Description

I am creating this bug at the request of springle.

It would be very useful to be able to look up replication lag in a widely accessible table.

The analytics team has nightly reports that compute stats for the day. If replication lag is too high, there might not be any data for that day, so the report will look "artificially" empty.

We could monitor the timestamps of the data, but it would be much more useful to have a table with replication lag data that we could consult before our daily run. If the lag is too high, the run will simply not happen. Our software backfills, so if today's run is skipped we will log the issue, and tomorrow's run can backfill today's data once the issue is resolved.
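The skip-and-backfill behaviour described above could be sketched roughly as follows. This is only an illustration, not the analytics team's actual scheduler: `plan_run`, `MAX_LAG_SECONDS`, and the one-hour threshold are hypothetical names and values, since the task does not state what counts as "too high".

```python
from datetime import date

# Hypothetical threshold; the task does not say how much lag is "too high".
MAX_LAG_SECONDS = 3600


def plan_run(today, lag_seconds, pending_days, max_lag=MAX_LAG_SECONDS):
    """Decide what tonight's run should compute.

    pending_days holds earlier days whose runs were skipped.
    Returns (days_to_compute, days_still_pending).
    """
    wanted = sorted(set(pending_days) | {today})
    if lag_seconds is None or lag_seconds > max_lag:
        # Lag unknown or too high: skip the run and keep every day pending.
        return [], wanted
    # Lag acceptable: compute today plus any previously skipped days (backfill).
    return wanted, []
```

For example, a run on Aug 13 with two hours of lag would compute nothing and leave Aug 13 pending; a run on Aug 14 with no lag would then compute both days.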


Version: unspecified
Severity: enhancement

Details

Reference
bz69463
Related Gerrit Patches:

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:43 AM
bzimport set Reference to bz69463.
Nuria created this task. Aug 13 2014, 10:24 AM

We schedule reports by project, and I imagine replication lag will be reported per host, not per project, so a global measure of how replication is working on the labs cluster will be sufficient for us.

Nuria added a comment. Aug 13 2014, 2:37 PM

Please note that this table needs to exist on the labs side, not on the production side.

Springle set Security to None. Aug 28 2015, 4:20 AM
Springle added a subscriber: jcrespo.
Restricted Application added a subscriber: Aklapper. Aug 28 2015, 4:20 AM
Krenair added a subscriber: Krenair.
QChris removed a subscriber: QChris. Sep 3 2015, 11:12 AM
Krinkle moved this task from Triage to Backlog on the DBA board. Sep 23 2015, 4:27 AM
Krinkle moved this task from Backlog to Triage on the DBA board. Sep 23 2015, 7:07 AM
jcrespo claimed this task. Oct 5 2015, 9:38 AM

This will be possible very soon thanks to T111266.

Change 249105 had a related patch set uploaded (by Jcrespo):
Replicate pt-heartbeat table to labs. Stop replicating msg_resource

https://gerrit.wikimedia.org/r/249105

Change 249105 merged by Jcrespo:
Replicate pt-heartbeat table to labs. Stop replicating msg_resource

https://gerrit.wikimedia.org/r/249105

jcrespo triaged this task as Low priority. Nov 6 2015, 4:54 PM

The table exists already, but for some reason it does not show all shards. I will have to investigate later.

MariaDB LABS localhost heartbeat_p > SELECT * FROM heartbeat;
+-------+----------------------------+------+
| shard | last_updated               | lag  |
+-------+----------------------------+------+
| s6    | 2015-11-24T12:21:13.500950 |    0 |
| s2    | 2015-11-24T12:21:13.501200 |    0 |
| s7    | 2015-11-24T12:21:13.501190 |    0 |
| s3    | 2015-11-24T12:21:13.501110 |    0 |
| s4    | 2015-11-24T12:21:13.501170 |    0 |
| s1    | 2015-11-24T12:21:13.500670 |    0 |
| s5    | 2015-11-24T12:21:13.500780 |    0 |
+-------+----------------------------+------+
7 rows in set (0.00 sec)
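A consumer wanting the single global measure requested earlier in this task could reduce rows like these to the worst per-shard lag, and also notice when a shard is absent. The sketch below works on plain tuples in the column order shown above (`shard`, `last_updated`, `lag`). The helper names and the `EXPECTED_SHARDS` set are assumptions for illustration: the set simply lists the seven shards visible in this output and is not an authoritative list. The unit of `lag` is not stated in this task, so the code does not assume one.

```python
# Assumption: the seven shards visible in the output above; not an official list.
EXPECTED_SHARDS = frozenset({"s1", "s2", "s3", "s4", "s5", "s6", "s7"})


def worst_lag(rows):
    """Return (shard, lag) for the most lagged shard.

    rows are (shard, last_updated, lag) tuples. Rows whose lag is None
    (SQL NULL) are ignored; None is returned if no lag value is known.
    """
    known = [(shard, lag) for shard, _last_updated, lag in rows if lag is not None]
    if not known:
        return None
    return max(known, key=lambda pair: pair[1])


def missing_shards(rows, expected=EXPECTED_SHARDS):
    """Return the expected shards that have no row at all."""
    return set(expected) - {shard for shard, _last_updated, _lag in rows}
```

A caller could treat a non-empty `missing_shards` result the same way as high lag, since the lag of an unlisted shard is unknown.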

Change 255434 had a related patch set uploaded (by Jcrespo):
Add SQL for the creation of heartbeat tables

https://gerrit.wikimedia.org/r/255434

Change 255434 merged by Jcrespo:
Add SQL for the creation of heartbeat tables

https://gerrit.wikimedia.org/r/255434

jcrespo closed this task as Resolved. Nov 25 2015, 7:56 PM

It only took a year, but it was finally done.