
Configure monitoring / alerting of Postgresql / redis / ... cluster for maps
Closed, Resolved · Public

Description

We have zero alerts on Postgresql for maps. Minimal alerting should be done on:

  • redis running / listening on port
  • postgres running / listening on port
  • lag in postgresql replication
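The three checks listed above could be sketched roughly as follows. This is a minimal illustration, not the Icinga configuration that was eventually deployed: the function names and the lag thresholds are hypothetical, and the ports (6379 for redis, 5432 for postgres) are only the upstream defaults, assumed here rather than confirmed for the maps hosts. On a replica, one common way to obtain the lag in seconds is `SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())`; the query itself is left out so the sketch stays dependency-free.

```python
import socket

# Nagios/Icinga plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL = 0, 1, 2


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False


def classify_lag(lag_seconds, warn=300.0, crit=1800.0):
    """Map a replication lag in seconds to a plugin status code.

    The default thresholds are placeholders, not values agreed on in this task.
    """
    if lag_seconds >= crit:
        return CRITICAL
    if lag_seconds >= warn:
        return WARNING
    return OK
```

For example, `port_is_listening("localhost", 6379)` would cover the redis bullet and `port_is_listening("localhost", 5432)` the postgres one, again assuming default ports.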

Details

Related Gerrit Patches:
operations/puppet : production · Adding Icinga checks for Maps

Event Timeline

Gehel created this task. May 18 2016, 4:07 PM
Restricted Application added a subscriber: Zppix. May 18 2016, 4:07 PM
Gehel renamed this task from Configure monitoring / alerting of Postgresql cluster for maps to Configure monitoring / alerting of Postgresql / redis cluster for maps. May 19 2016, 8:05 PM
Gehel renamed this task from Configure monitoring / alerting of Postgresql / redis cluster for maps to Configure monitoring / alerting of Postgresql / redis / ... cluster for maps. May 26 2016, 2:21 PM
Gehel updated the task description.

The Kartotherian check could be an HTTP check on https://maps.wikimedia.org/osm-intl/0/0/0.png (or the equivalent on localhost).
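A rough sketch of what such an HTTP check might look like, assuming it is enough to verify that the tile URL returns HTTP 200 and that the body starts with the PNG file signature. The function names are hypothetical and this is not the check that was merged; Icinga's stock HTTP check would be the more likely production choice.

```python
import urllib.error
import urllib.request

# Nagios/Icinga plugin exit codes (standard plugin convention).
OK, CRITICAL = 0, 2

# First eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def evaluate_tile_response(status, body):
    """Return (exit_code, message) for an HTTP status and response body."""
    if status != 200:
        return CRITICAL, "unexpected HTTP status %d" % status
    if not body.startswith(PNG_SIGNATURE):
        return CRITICAL, "response body is not a PNG image"
    return OK, "tile served (%d bytes)" % len(body)


def check_tile(url, timeout=10.0):
    """Fetch url and evaluate the response; network errors are CRITICAL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return evaluate_tile_response(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError by urlopen.
        return evaluate_tile_response(err.code, b"")
    except OSError as err:
        # URLError, timeouts and socket errors all derive from OSError.
        return CRITICAL, "request failed: %s" % err
```

Calling `check_tile("https://maps.wikimedia.org/osm-intl/0/0/0.png")` would exercise the public endpoint; a localhost URL could be substituted for the per-host variant mentioned above.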

Gehel claimed this task. May 27 2016, 7:09 PM
Yurik moved this task from Backlog to In progress on the Maps-Sprint board. May 27 2016, 10:11 PM

Change 291023 had a related patch set uploaded (by Gehel):
Adding Icinga checks for Maps

https://gerrit.wikimedia.org/r/291023

Postgresql and redis are only used for tile generation, not for user-facing operations. As such, monitoring them is not as critical as monitoring Kartotherian.

Yurik moved this task from Kartotherian to Maps-data on the Maps board. Jun 26 2016, 8:20 PM
Yurik moved this task from Stalled/Waiting to To-do on the Maps-Sprint board.
Yurik updated the task description. Oct 20 2016, 3:13 AM
Gehel updated the task description. Oct 25 2016, 12:35 PM
Yurik removed a project: Maps. Dec 15 2016, 4:33 AM
Gehel moved this task from To-do to Done on the Maps-Sprint board. May 30 2017, 7:44 PM
Gehel moved this task from Done to Backlog on the Maps-Sprint board.
debt moved this task from Backlog to To-do on the Maps-Sprint board. Jun 6 2017, 7:45 PM
debt added a subscriber: debt.

Moving to prioritized as it's on our list of things that do need doing.

Gehel moved this task from To-do to Backlog on the Maps-Sprint board. Jun 15 2017, 7:19 PM
Gehel moved this task from Backlog to Done on the Maps-Sprint board. Jun 21 2017, 9:47 AM

As part of T167871, the standard monitoring of postgres and redis is now applied, which should be sufficient. We can close this task (and open specific tasks for monitoring improvements as needed).

debt closed this task as Resolved. Jun 22 2017, 7:04 PM

Yay!