Investigate the phabricator-prod-1001 alert
Closed, Resolved
Public


There is a permanent "Project devtools instance phabricator-prod-1001 is down" alert in Alertmanager [1]. The phabricator-prod-1001 instance is shown as active in the OpenStack browser [2]. Can we silence the alert, shut down the host, or maybe both?


Event Timeline

@brennen now that we have phorge-1001 in devtools, can phabricator-prod-1001 be decommissioned?

Jelto claimed this task.
Jelto subscribed.

The metric is flapping.

Also, the services confd.service and networking.service are failing. There was an IPv6 address configured in /etc/network/interfaces. I commented out this IPv6 address and moved the "source /etc/network/interfaces.d/*" line to the bottom of the file (the sourced files contain some generic cloud-init settings, which should be applied after our custom configuration). After a restart, the instance is "up" again.
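For anyone who needs to repeat this kind of change on another instance, here is a rough Python sketch of the same two edits: comment out any IPv6 stanza and move the source line to the end of the file. It is not what was run on phabricator-prod-1001 (that change was made by hand). The function name rewrite_interfaces, the SAMPLE contents, the interface name eth0, and the documentation-prefix address 2001:db8::1/64 are all made up for illustration, and the real file on the host may look different.

```python
# Hypothetical helper: the real change on the host was a manual edit.
# Keywords that begin a new stanza in an ifupdown interfaces file.
STANZA_STARTS = ("iface", "auto", "mapping", "source", "allow-")


def rewrite_interfaces(text: str) -> str:
    """Comment out inet6 stanzas and move `source` lines to the bottom."""
    body, sources = [], []
    in_inet6 = False
    for line in text.splitlines():
        words = line.split()
        if words and words[0].startswith(STANZA_STARTS):
            # A new stanza starts here; check whether it is an inet6 iface.
            in_inet6 = (
                words[0] == "iface" and len(words) >= 3 and words[2] == "inet6"
            )
        if words and words[0] == "source":
            # Hold back `source` lines so they can be appended at the end.
            sources.append(line.strip())
        elif in_inet6 and words and not line.lstrip().startswith("#"):
            # Comment out every non-blank line of the inet6 stanza.
            body.append("#" + line)
        else:
            body.append(line)
    return "\n".join(body + sources) + "\n"


# Assumed example input; not the actual file from phabricator-prod-1001.
SAMPLE = """\
source /etc/network/interfaces.d/*
auto eth0
iface eth0 inet dhcp
iface eth0 inet6 static
    address 2001:db8::1/64
"""

if __name__ == "__main__":
    print(rewrite_interfaces(SAMPLE), end="")
```

The sketch only rewrites text. Writing the result back to /etc/network/interfaces and restarting networking would still be separate manual steps, and it is worth keeping a backup of the original file first.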

I'll close this task.

@brennen now that we have phorge-1001 in devtools, can phabricator-prod-1001 be decommissioned?

I added this to the topic for the next Collab-RelEng sync meeting.

@brennen now that we have phorge-1001 in devtools, can phabricator-prod-1001 be decommissioned?

IIUC yes; see also T334519#8787677 is pointed at phabricator-prod-1001, where we did an in-place upgrade. I believe the phorge-1001 instance was a from-scratch install that dzahn did early in the process of exploring the migration.

I'm not sure why the alert is flapping here, since is up at the moment. (Edit: Failed to fully read Jelto's comment above.)

An alert for this has been firing again for some time. Noticed today in metrics review.

is pointed at phabricator-prod-1001, where we did an in-place upgrade. I believe the phorge-1001 instance was a from-scratch install that dzahn did early in the process of exploring the migration.

Yes, phorge-1001 was the first "proof of concept" install of phorge, using puppet but a separate class / role from the phab production role.