Add monitoring and alerting for Gerrit's tcp-proxy
Closed, Resolved (Public)

Description

Gerrit will soon be behind the CDN (T411895), so the tcp-proxy VMs become a production dependency. We should make sure these instances are monitored properly. We need at least:

Details

Related Changes in Gerrit:

Related Objects

Status | Subtype | Assigned | Task
Resolved | | Dzahn |
Resolved | | Jelto |

Event Timeline

Jelto triaged this task as Medium priority. Tue, Feb 3, 1:43 PM

I started collecting some relevant dashboards under Gerrit TCP Proxy. The dashboard combines haproxy metrics and trafficserver metrics to monitor both ssh and https. I don't think the dashboard needs to be super detailed about every metric, because there are more detailed dashboards in the sre-traffic-team folder. Most of them also allow filtering for gerrit.discovery.wmnet.

Change #1236746 had a related patch set uploaded (by Jelto; author: Jelto):

[operations/alerts@master] gerrit: add GerritHaProxy* alerts

https://gerrit.wikimedia.org/r/1236746

Change #1236746 merged by jenkins-bot:

[operations/alerts@master] gerrit: add GerritHaProxy* alerts

https://gerrit.wikimedia.org/r/1236746

Jelto updated the task description.

I'll resolve this task for now: basic monitoring is in place, and the existing blackbox checks should switch to the new CDN backend once the DNS entry is updated. Tweaks or new alerts might still be needed; if so, I'll reopen this task or open a new one.
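
For anyone who wants to verify reachability by hand after the DNS switch, a blackbox-style check can be approximated with a plain TCP connect probe. This is only a minimal sketch and not the actual blackbox exporter configuration: the function names, the target table, and the assumption that HTTPS is served on port 443 and Gerrit SSH on port 29418 are illustrative and do not come from this task.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS resolution failures.
        return False


# Hypothetical targets: hostname and ports are assumptions for illustration.
TARGETS = {
    "https": ("gerrit.wikimedia.org", 443),
    "ssh": ("gerrit.wikimedia.org", 29418),
}


def probe_all(targets: dict) -> dict:
    """Probe every named (host, port) pair and return name -> reachable."""
    return {name: probe_tcp(host, port) for name, (host, port) in targets.items()}
```

Calling `probe_all(TARGETS)` returns a dict mapping each service name to a boolean. A connect probe only shows that something accepts connections on the port; it does not validate TLS certificates or the SSH banner, which a real blackbox check would normally cover.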