
[api-gateway] add alert for uptime
Closed, Resolved (Public)

Description

Created as prompted by @taavi: the Toolforge API gateway (api.svc.tools.eqiad1.wikimedia.cloud) does not appear to be monitored for uptime.

Event Timeline

dcaro renamed this task from Monitor the Toolforge API gateway to [api-gateway] add alert for uptime. Apr 18 2024, 9:16 AM
dcaro triaged this task as High priority.
dcaro moved this task from Backlog to Ready to be worked on on the Toolforge board.

I'd like to do this, if someone could guide me or point me to a similar past task.

You have to register the URL in the pingthing/blackbox monitoring config, the same way as these existing entries but for the new URL:
https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/manifests/toolforge/k8s/haproxy.pp#52

That will add it to the monitored list. Note that it also adds an alert by default, so that should be enough.
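
For anyone unfamiliar with what such an uptime check amounts to, here is a rough Python sketch of the idea. It is only an illustration and not the pingthing/blackbox configuration itself, which lives in the Puppet manifest linked above. The hostname comes from the task description; the request path, the timeout, the "2xx means up" policy and the function names `fetch_status` and `is_up` are all assumptions made for this example.

```python
import ssl
import urllib.error
import urllib.request
from typing import Callable

# Hostname from the task description; the "/" path is an assumption.
API_GATEWAY_URL = "https://api.svc.tools.eqiad1.wikimedia.cloud/"


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, verifying the TLS certificate."""
    context = ssl.create_default_context()
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code.
        return err.code


def is_up(url: str, fetch: Callable[[str], int] = fetch_status) -> bool:
    """Assumed policy: any 2xx answer is 'up'; network or TLS errors are 'down'."""
    try:
        status = fetch(url)
    except (OSError, ValueError):
        # URLError, timeouts and ssl.SSLError are all OSError subclasses.
        return False
    return 200 <= status < 300


if __name__ == "__main__":
    print("up" if is_up(API_GATEWAY_URL) else "down")
```

The `fetch` parameter exists only so the up/down decision can be exercised without network access; a real monitoring setup would also record latency and alert on repeated failures rather than on a single probe.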

This task is related to (not the same as, not overridden by, just related to) T367389: [k8s,infra,alerting] improve HAproxy and k8s apiserver interaction

Change #1093339 had a related patch set uploaded (by David Caro; author: David Caro):

[operations/puppet@production] toolforge:haproxy: add api gateway health check

https://gerrit.wikimedia.org/r/1093339

dcaro changed the task status from Open to In Progress. Wed, Nov 20, 2:02 PM
dcaro moved this task from Next Up to In Review on the Toolforge (Toolforge iteration 16) board.

group_203_bot_4866fc124f4b41659f667468a6115cf3 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/616

api-gateway: bump to 0.0.56-20241120144516-f10abf2a

Change #1093339 merged by David Caro:

[operations/puppet@production] toolforge:haproxy: add api gateway health check

https://gerrit.wikimedia.org/r/1093339

dcaro moved this task from In Progress to Done on the Toolforge (Toolforge iteration 16) board.
dcaro reopened this task as In Progress. Wed, Nov 20, 4:54 PM
dcaro moved this task from Done to In Review on the Toolforge (Toolforge iteration 16) board.
dcaro moved this task from In Review to In Progress on the Toolforge (Toolforge iteration 16) board.

Change #1093384 had a related patch set uploaded (by David Caro; author: David Caro):

[operations/puppet@production] toolforge:haproxy: monitor the https port, not the internal one

https://gerrit.wikimedia.org/r/1093384

Change #1093384 merged by David Caro:

[operations/puppet@production] toolforge:haproxy: monitor the https port, not the internal one

https://gerrit.wikimedia.org/r/1093384

Change #1093395 had a related patch set uploaded (by David Caro; author: David Caro):

[operations/puppet@production] toolforge:haproxy: use the external name and force tls

https://gerrit.wikimedia.org/r/1093395

Change #1093395 merged by David Caro:

[operations/puppet@production] toolforge:haproxy: use the external name and ip and force tls

https://gerrit.wikimedia.org/r/1093395

dcaro moved this task from In Progress to Done on the Toolforge (Toolforge iteration 16) board.