Steps:
* Create a script that makes an HTTP request to Harbor to check that it's up.
* Make that script export the result to a node-exporter file (`/var/lib/prometheus/node.d/node_harbor.prom`).
* Add the Puppet code to wrap the script in a systemd timer and run it periodically on the k8s worker nodes. For an example of both the Puppet code and the script, see this Puppet module: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/prometheus/manifests/node_cloudvirt_libvirt_stats.pp
* Then create the alert:
At the time of writing this task, the alert can only be created directly in the DB. More info here:
https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Monitoring#Monitoring_for_Cloud_VPS
Essentially, we have a Cloud VPS project, metricsinfra, with a Prometheus (Alertmanager) setup. Specifically, there are two hosts:
metricsinfra-controller-1.metricsinfra.eqiad1.wikimedia.cloud
metricsinfra-controller-2.metricsinfra.eqiad1.wikimedia.cloud
These hosts generate the Prometheus alerts from a DB that is hosted in Trove.
You have to log in to that DB (you can find the credentials and host in the controller hosts' config, /etc/prometheus-manager/config.yaml).
There you will find the `prometheusconfig` database with the `alerts` table, which you have to update with the alerts that you want to add. An example row:
```
*************************** 1. row ***************************
id: 1
project_id: 12
name: GridQueueProblem
expr: sge_queueproblems{project="toolsbeta",state=~".*(e|E).*"}
duration: 30m
severity: warn
annotations: {"summary": "Grid queue {{ $labels.queue }}@{{ $labels.host }} is in state {{ $labels.state }}", "runbook": "https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Runbooks/ToolsGridQueueProblem"}
```
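As a rough sketch of what adding a row shaped like the one above could look like, the helper below builds a parameterized `INSERT` for the `alerts` table. It makes several assumptions: that `id` is auto-generated, that the DB speaks the MySQL `%s` parameter style, and that the alert name (`HarborDown`), the metric (`node_harbor_up`), the duration, and the `project_id` value are placeholders to be replaced with real values. The `service` annotation is the one this task asks for.

```python
import json


def build_alert_insert(project_id, name, expr, duration, severity, annotations):
    """Return (sql, params) for inserting one row into the `alerts` table.

    Column names are taken from the example row; `id` is left out on the
    assumption that the DB generates it.
    """
    sql = (
        "INSERT INTO alerts "
        "(project_id, name, expr, duration, severity, annotations) "
        "VALUES (%s, %s, %s, %s, %s, %s)"
    )
    # The example row stores annotations as a JSON object serialized to text.
    params = (project_id, name, expr, duration, severity, json.dumps(annotations))
    return sql, params


# Hypothetical values: look up the real project_id in the DB before using this.
sql, params = build_alert_insert(
    project_id=0,  # placeholder, not a real project id
    name="HarborDown",  # hypothetical alert name
    expr="node_harbor_up == 0",  # hypothetical metric written by the check script
    duration="5m",  # hypothetical duration
    severity="warn",
    annotations={
        "summary": "Harbor is not answering HTTP checks from {{ $labels.instance }}",
        "service": "toolforge,build_service",
    },
)
print(sql)
print(params)
```

The returned pair can be passed to the `cursor.execute(sql, params)` call of whichever MySQL client library is available, or used as a reference when typing the statement by hand in the DB shell.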
The `expr` column is the Prometheus expression that you want to monitor. You can explore, check, and test expressions here:
https://prometheus.wmflabs.org/
Another place where you can find the expression to use is:
https://grafana-rw.wmcloud.org/d/TJuKfnt4z/kubernetes-namespace?orgId=1&var-cluster=prometheus-toolsbeta&var-namespace=image-build&forceLogin&search=open
By inspecting the graphs and their datasources there, you can see which Prometheus instance and which expression give you the data you want.
As for the alert itself, it should also have an annotation called 'service' with the value 'toolforge,build_service'.
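Going back to the first two steps, a minimal sketch of the check script could look like the following. The Harbor URL and the metric name `node_harbor_up` are placeholders chosen for illustration, not existing values; only the output path comes from this task. The file is written to a temporary name and then renamed, so node-exporter never reads a half-written file.

```python
import os
import tempfile
import urllib.request

# Assumptions: placeholder URL and hypothetical metric name.
HARBOR_URL = "https://harbor.example.org/"
PROM_FILE = "/var/lib/prometheus/node.d/node_harbor.prom"
METRIC = "node_harbor_up"


def check_harbor(url, timeout=10.0):
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


def render_metric(up):
    """Render the result in the Prometheus text exposition format."""
    return (
        f"# HELP {METRIC} Whether Harbor answered the HTTP check (1) or not (0).\n"
        f"# TYPE {METRIC} gauge\n"
        f"{METRIC} {1 if up else 0}\n"
    )


def write_prom_file(path, content):
    """Write content to path atomically (temp file in the same dir + rename)."""
    directory = os.path.dirname(path)
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        tmp.write(content)
        tmp_name = tmp.name
    # Temp files are created 0600; make the result readable by node-exporter.
    os.chmod(tmp_name, 0o644)
    os.replace(tmp_name, path)


def main():
    write_prom_file(PROM_FILE, render_metric(check_harbor(HARBOR_URL)))
```

Calling `main()` under an `if __name__ == "__main__":` guard turns this into the script that the systemd timer would run on each worker node.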