This task tracks the upgrade of Prometheus to v2 in beta / deployment-prep and tools, both for testing purposes and for consistency with production.
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | fgiunchedi | T220104 TEC6: Metrics monitoring infrastructure (Q4 2018/19 goal)
Resolved | | fgiunchedi | T187987 100% of Prometheus traffic served by Prometheus v2
Resolved | | fgiunchedi | T215272 Upgrade Prometheus to 2.7 in deployment-prep and tools
Event Timeline
Change 486051 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] prometheus: add feature flag for v2 compat
Change 486051 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: add feature flag for v2 compat
Change 488344 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] prometheus: use v2 rules for beta
Change 488344 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: use v2 rules for beta
Mentioned in SAL (#wikimedia-cloud) [2019-02-06T14:00:44Z] <godog> switch beta-prometheus to deployment-prometheus02 - T215272
Mentioned in SAL (#wikimedia-cloud) [2019-02-06T14:12:23Z] <godog> shut off deployment-prometheus01 - T215272
Mentioned in SAL (#wikimedia-cloud) [2019-02-07T08:41:13Z] <godog> upgrade prometheus-02 to prometheus 2.6 - T215272
Conversion of tools-prometheus-02 worked as expected. I stopped v1, moved the v1 metrics out of the way, installed the v2 package, then started v2. Once v2 was running and collecting metrics into an empty storage, I ran prometheus-storage-migrator on the v1 data. That took about 5h; once complete, the migrated data can be merged back into the fresh v2 storage. Overall about 3h worth of data is missing, from 6am to 9am (when I upgraded to v2).
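For anyone repeating this elsewhere, the two filesystem steps (moving the v1 data aside, and merging migrated data into the live v2 storage) can be sketched roughly as below. This is only an illustration, not the commands that were run on tools-prometheus-02. The function names, directory names and the `.v1` suffix are hypothetical, and it assumes that migrated v2 data consists of block directories that each contain a `meta.json` file. Stopping and starting the services and the prometheus-storage-migrator invocation itself are deliberately left out.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def set_aside_v1(storage: Path) -> Path:
    """Move the existing (v1) data out of the way and leave an empty directory."""
    backup = storage.with_name(storage.name + ".v1")  # hypothetical naming
    if backup.exists():
        raise FileExistsError(f"{backup} already exists")
    storage.rename(backup)
    storage.mkdir()
    return backup


def merge_blocks(migrated: Path, live: Path) -> List[str]:
    """Move migrated block directories into the live v2 storage.

    Assumption: a block is a directory holding a meta.json file.
    Blocks whose name already exists in the live storage are skipped.
    """
    moved = []
    for block in sorted(migrated.iterdir()):
        if not (block / "meta.json").is_file():
            continue  # not a block directory
        target = live / block.name
        if target.exists():
            continue  # never overwrite data collected by the running v2
        shutil.move(str(block), str(target))
        moved.append(block.name)
    return moved


def demo() -> List[str]:
    """Run both steps against throwaway directories with made-up contents."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        storage = root / "metrics"
        storage.mkdir()
        (storage / "v1-data").write_text("old")

        set_aside_v1(storage)

        # Stand-in for a block written by the freshly started v2.
        (storage / "block-new").mkdir()
        (storage / "block-new" / "meta.json").write_text("{}")

        # Stand-in for the migrator output.
        migrated = root / "migrated"
        (migrated / "block-old").mkdir(parents=True)
        (migrated / "block-old" / "meta.json").write_text("{}")

        merge_blocks(migrated, storage)
        return sorted(p.name for p in storage.iterdir())
```

Calling `demo()` exercises both helpers on temporary directories only. Skipping blocks whose name already exists is a conservative choice so that data collected by the running v2 instance is never replaced.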
Mentioned in SAL (#wikimedia-cloud) [2019-02-08T11:08:47Z] <godog> flip tools-prometheus.wmflabs.org to tools-prometheus-02 - T215272
Tools and deployment-prep are running Prometheus 2.7.1, rebuilt from unstable with k8s support, and their storage has been migrated from Prometheus v1.
The former Prometheus instance in beta, deployment-prometheus01, is now off since it is unused, and it can be deleted.