Update Grid Engine and grid_configurator.py for the new domain names
Closed, Declined (Public)

Description

The new <project>.eqiad1.wikimedia.cloud domain breaks Grid Engine, because Grid Engine depends heavily on DNS naming for host permissions.

This happens on two levels. First, the value of hostname -f is sent to the master as the client ID, where it is compared against the reverse DNS resolution of the requester's IP, so it must match the PTR record. Fixing that requires updating the grid servers' /etc/hosts files.
Second, once the hostname is accepted as matching DNS, the grid checks its own config for that hostname. Right now that will not match, and grid_configurator.py has eqiad.wmflabs hardcoded.
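
The first-level check described above can be approximated from a node with a short Python sketch. This is only an illustration of the forward/reverse comparison, not Grid Engine's actual code: the function names are hypothetical, and the resolver functions are passed in as parameters so the logic can be exercised without touching real DNS.

```python
import socket


def normalize(name):
    """Lower-case a hostname and drop any trailing dot so names compare cleanly."""
    return name.rstrip(".").lower()


def fqdn_matches_ptr(fqdn, resolve=socket.gethostbyname, reverse=None):
    """Return True if the reverse lookup of fqdn's IP gives back the same name.

    This mirrors the idea of the master's check (client-reported name vs.
    PTR record); it is an assumption-laden sketch, not the real implementation.
    """
    if reverse is None:
        # socket.gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        reverse = lambda ip: socket.gethostbyaddr(ip)[0]
    ip = resolve(fqdn)          # forward lookup: name -> IP
    ptr_name = reverse(ip)      # reverse lookup: IP -> name
    return normalize(ptr_name) == normalize(fqdn)
```

On a grid node, fqdn_matches_ptr(socket.getfqdn()) would give a rough indication of whether the name the node reports agrees with what reverse resolution returns; socket.getfqdn() is only an approximation of hostname -f.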

So far, I have updated the admin hosts to include the new FQDNs of the master and shadow on tools and toolsbeta.

Event Timeline

Bstorm triaged this task as Medium priority. Feb 18 2020, 11:37 PM
Bstorm created this task.

Change 574885 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] sonofgridengine: accomodate the new domain name

https://gerrit.wikimedia.org/r/574885

Change 575637 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] sonofgridengine: prepare for new domain name

https://gerrit.wikimedia.org/r/575637

Change 575637 merged by Bstorm:
[operations/puppet@production] sonofgridengine: prepare for new domain name

https://gerrit.wikimedia.org/r/575637

That last patch, which just uses host_aliases, should really be enough.
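
For readers unfamiliar with the aliases approach, here is a minimal sketch of how host_aliases entries could be generated. It assumes the Grid Engine host_aliases format of one line per host, with the unique hostname first followed by its aliases, and it assumes the old name stays the unique one with the new FQDN as the alias. The helper name, the host name example-master, and the <host>.<project>.<domain> layout are illustrative assumptions, not taken from the merged patch.

```python
OLD_DOMAIN = "eqiad.wmflabs"            # domain hardcoded in grid_configurator.py
NEW_DOMAIN = "eqiad1.wikimedia.cloud"   # new domain named in the task


def host_alias_line(short_name, project,
                    old_domain=OLD_DOMAIN, new_domain=NEW_DOMAIN):
    """Build one host_aliases line: unique hostname first, then the alias.

    Hypothetical helper; which name should be primary is an assumption.
    """
    old_fqdn = f"{short_name}.{project}.{old_domain}"
    new_fqdn = f"{short_name}.{project}.{new_domain}"
    return f"{old_fqdn} {new_fqdn}"


def host_aliases_file(short_names, project):
    """Join one line per host into the text of a host_aliases file."""
    return "\n".join(host_alias_line(n, project) for n in short_names) + "\n"
```

With aliases in place, the grid can treat both names as the same host, so the existing eqiad.wmflabs entries in the grid config would not need to be rewritten.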

Change 574885 abandoned by Bstorm:
sonofgridengine: accomodate the new domain name

Reason:
Decided to try the approach in Ida1a2c10 per jhedden's suggestion

https://gerrit.wikimedia.org/r/574885

Used aliases instead