
integration-slave-precise-1001 needs to be rebuilt (doesn't have DNS)
Closed, Resolved, Public

Description

integration-slave-precise-1001 was acting a bit weird so I tried ssh'ing into it, and couldn't. @Andrew took a look and found that the instance was missing DNS. He said trying to add a DNS record back onto the instance would be messy, so we should rebuild it instead.

I already marked the instance as offline in Jenkins.

Event Timeline

hashar subscribed.
[22:37:03] <andrewbogott>	 hm...
[22:37:22] <andrewbogott>	 it doesn’t have dns.  Which...
[22:37:50] <andrewbogott>	 well, a few weeks ago I cleaned up some leaked dns entries and accidentally deleted at least one entry for a working instance.  That one acted like this one...
[22:37:57] <andrewbogott>	 can we rebuild it?
[22:38:08] <andrewbogott>	 retrofitting a dns record back on to it will be a bit messy

And indeed that instance lacks a DNS entry:

$ host integration-slave-precise-1001.eqiad.wmflabs
Host integration-slave-precise-1001.eqiad.wmflabs not found: 3(NXDOMAIN)
# salt -v '*slave-precise-1001*' cmd.run 'hostname; hostname --fqdn'
Executing job with jid 20160304083821097268
-------------------------------------------

integration-slave-precise-1001.integration.eqiad.wmflabs:
    integration-slave-precise-1001
    hostname: Name or service not known
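
The same lookup can be scripted when several instances need checking. This is a minimal sketch using Python's standard `socket` module, not something that was run for this task. The function name `has_dns_entry` and its `resolver` parameter are hypothetical and exist only for illustration.

```python
import socket


def has_dns_entry(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves, False if name resolution fails.

    `resolver` is a hypothetical injection point so the check can be
    exercised without a live DNS server; it defaults to the real lookup.
    """
    try:
        resolver(hostname, None)
    except socket.gaierror:
        # getaddrinfo raises gaierror when the name cannot be resolved,
        # which is the case for an NXDOMAIN answer like the one above.
        return False
    return True
```

Calling `has_dns_entry("integration-slave-precise-1001.eqiad.wmflabs")` from a host that uses the labs resolver would be expected to return False for an instance in this state, assuming the resolver answers NXDOMAIN as `host` reported.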

Let's rebuild it!

Actually, I am going to just remove the instance. We barely have any jobs still tied to Precise besides the Zend 5.3 jobs, and precise-1001 only provided 2 executors, so there is no need to replace it.

From https://integration.wikimedia.org/ci/label/UbuntuPrecise/load-statistics

Capture d’écran 2016-03-04 à 09.40.59.png (521×756 px, 45 KB)

Mentioned in SAL [2016-03-04T08:42:54Z] <hashar> CI deleting integration-slave-precise-1001 (2 executors). It is not in labs DNS which causes bunch of issues, no need for the capacity anymore. T128802