Two pages were sent during the upgrade:
- search.svc.codfw.wmnet/LVS HTTP IPv4 is CRITICAL
- search.svc.codfw.wmnet/ElasticSearch health check for shards is CRITICAL
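The second one is probably a consequence of the first one, because the shard check failed on a read timeout while fetching the cluster health rather than on the shard count itself: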
ElasticSearch health check for shards on search.svc.codfw.wmnet is CRITICAL: CRITICAL - elasticsearch http://10.2.1.30:9200/_cluster/health error while fetching: HTTPConnectionPool(host=10.2.1.30, port=9200): Read timed out. (read timeout=4)
Looking at the cluster, the number of active shards was never at a critical level: the cluster went yellow and lost a few shards, but not enough to trigger the health check alert had the check been working properly.
These alerts were not expected during this kind of operation.
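
One way to keep a timeout from paging as a shard problem is for the check to report a failed fetch separately from a low shard count. The sketch below is illustrative only and is not the check that produced the alert above. The function names, the 90% threshold, and the mapping of a failed fetch to UNKNOWN are assumptions; `status` and `active_shards_percent_as_number` are fields returned by the Elasticsearch `_cluster/health` API.

```python
import json
import urllib.request

# Nagios/Icinga-style exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def fetch_health(url, timeout=4.0):
    """Fetch and decode the JSON body of a _cluster/health endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def classify(health, critical_below_pct=90.0):
    """Map a cluster health document to a check state.

    critical_below_pct is a hypothetical threshold, not the production value.
    """
    pct = float(health["active_shards_percent_as_number"])
    if pct < critical_below_pct:
        return CRITICAL, f"only {pct:.1f}% of shards are active"
    if health["status"] != "green":
        return WARNING, f"cluster is {health['status']}, {pct:.1f}% of shards active"
    return OK, f"cluster is green, {pct:.1f}% of shards active"


def check(url, fetch=fetch_health, critical_below_pct=90.0):
    """Run the check; a failed fetch is UNKNOWN, not a shard-level CRITICAL."""
    try:
        health = fetch(url)
        return classify(health, critical_below_pct)
    except (OSError, ValueError, KeyError) as exc:
        # OSError covers URLError and socket timeouts; ValueError covers bad JSON;
        # KeyError covers a response missing the expected fields.
        return UNKNOWN, f"could not evaluate cluster health: {exc!r}"
```

With this split, a read timeout like the one in the alert would surface as UNKNOWN, and the LVS HTTP alert alone would signal that the service endpoint was unreachable. Whether UNKNOWN should page is a separate policy decision.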