(Need by: 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet.
Closed, Resolved (Public)

Description

This task will track the racking, setup, and OS installation of htmldumper1001.eqiad.wmnet.

Hostname / Racking / Installation Details

Hostnames: htmldumper1001
Racking Proposal: Can go anywhere
Networking/Subnet/VLAN/IP: Internal network, 1G is fine
Partitioning/RAID: RAID 10 for the /srv disks, but we can just use francium's partman recipe

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

htmldumper1001:

  • - receive in system on procurement task T242009
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.
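
The resolution rule above (every checkbox step done for every host) can be sketched as a small check. This is only an illustration: the step names paraphrase the checklist, and `remaining_steps` and `can_resolve` are hypothetical helpers, not part of any Wikimedia tooling.

```python
# Hypothetical model of the per-host checklist; names paraphrase the list above.
CHECKLIST_STEPS = [
    "receive in system",
    "rack system and update netbox",
    "bios/drac/serial setup/testing",
    "mgmt dns entries",
    "network port setup",
    "production dns entries",
    "operations/puppet update",
    "OS installation",
    "puppet accept/initial run",
    "host state in netbox set to staged",
]


def remaining_steps(done):
    """Return the checklist steps not yet marked done, in checklist order."""
    done = set(done)
    return [step for step in CHECKLIST_STEPS if step not in done]


def can_resolve(hosts):
    """The task can be resolved only when no host has remaining steps."""
    return all(not remaining_steps(done) for done in hosts.values())


# Example: everything except the final Netbox state change is done.
progress = {"htmldumper1001": CHECKLIST_STEPS[:-1]}
print(remaining_steps(progress["htmldumper1001"]))
print(can_resolve(progress))
```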

Event Timeline

RobH added a parent task: Unknown Object (Task). Feb 18 2020, 10:39 PM
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
wiki_willy renamed this task from "(2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet." to "(Need By 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet.". Feb 19 2020, 2:11 AM
wiki_willy renamed this task from "(Need By 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet." to "(Need by: 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet.". Feb 24 2020, 8:26 PM

Change 578590 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Add mgmt dns for htmldumper1001

https://gerrit.wikimedia.org/r/578590

Change 578590 merged by Cmjohnson:
[operations/dns@master] Add mgmt dns for htmldumper1001

https://gerrit.wikimedia.org/r/578590

Change 578640 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Add production dns for htmldumper1001

https://gerrit.wikimedia.org/r/578640

Change 578640 merged by Cmjohnson:
[operations/dns@master] Add production dns for htmldumper1001

https://gerrit.wikimedia.org/r/578640

Change 578989 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Add htmldumper1001 to dhcpd file and netboot.cfg

https://gerrit.wikimedia.org/r/578989

Change 578990 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Add htmldumper1001 to site.pp

https://gerrit.wikimedia.org/r/578990

Change 578989 merged by Cmjohnson:
[operations/puppet@production] Add htmldumper1001 to dhcpd file and netboot.cfg

https://gerrit.wikimedia.org/r/578989

Change 578990 merged by Cmjohnson:
[operations/puppet@production] Add htmldumper1001 to site.pp

https://gerrit.wikimedia.org/r/578990

Change 578996 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] use new role(insetup) on a few hosts in setup

https://gerrit.wikimedia.org/r/578996

Change 578996 merged by Dzahn:
[operations/puppet@production] use new role(insetup) on a few hosts in setup

https://gerrit.wikimedia.org/r/578996

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111928_cmjohnson_3633_htmldumper1001_eqiad_wmnet.log.

Change 579286 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] update mac address for htmldumper1001

https://gerrit.wikimedia.org/r/579286

Change 579286 merged by Cmjohnson:
[operations/puppet@production] update mac address for htmldumper1001

https://gerrit.wikimedia.org/r/579286

Change 579288 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] updating dns/asset tag names to reflect correct servers htmldumper/fran1001

https://gerrit.wikimedia.org/r/579288

Change 579288 merged by Cmjohnson:
[operations/dns@master] updating dns/asset tag names to reflect correct servers htmldumper/fran1001

https://gerrit.wikimedia.org/r/579288

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121513_cmjohnson_203633_htmldumper1001_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['htmldumper1001.eqiad.wmnet']

Of which those FAILED:

['htmldumper1001.eqiad.wmnet']

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121523_cmjohnson_205338_htmldumper1001_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121611_cmjohnson_213880_htmldumper1001_eqiad_wmnet.log.
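
The log names in the reimage entries above appear to follow a `<timestamp>_<user>_<number>_<host with dots replaced by underscores>.log` pattern. A minimal sketch for pulling those fields apart when comparing attempts follows; the pattern is inferred from the paths in this task only, the meaning of the numeric field is an assumption (it is called `run_id` here, a hypothetical name), and `parse_reimage_log` is not part of wmf-auto-reimage.

```python
import re
from datetime import datetime
from pathlib import PurePosixPath

# Pattern inferred from the log paths in this task; not an official format.
LOG_NAME = re.compile(
    r"^(?P<stamp>\d{12})_(?P<user>[^_]+)_(?P<run_id>\d+)_(?P<host>.+)\.log$"
)


def parse_reimage_log(path):
    """Split a wmf-auto-reimage log path into its apparent components."""
    match = LOG_NAME.match(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"unrecognized log name: {path}")
    return {
        "started": datetime.strptime(match["stamp"], "%Y%m%d%H%M"),
        "user": match["user"],
        # Assumption: the meaning of this number is not stated in the task.
        "run_id": match["run_id"],
        # Hostnames cannot contain underscores, so map them back to dots.
        "host": match["host"].replace("_", "."),
    }


info = parse_reimage_log(
    "/var/log/wmf-auto-reimage/"
    "202003121513_cmjohnson_203633_htmldumper1001_eqiad_wmnet.log"
)
print(info["started"], info["user"], info["host"])
```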

Dzahn subscribed.

There are a lot of Icinga alerts about all the things on htmldumper1001 for some reason:

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?host=htmldumper1001

I suspected it just needed a restart of nagios-nrpe-server, but I also can't SSH to it; I get asked for a password.

I don't think it got puppetized properly: https://netbox.wikimedia.org/extras/reports/puppetdb.PhysicalHosts/ flags it as "missing physical device in PuppetDB: state Staged in Netbox".

I set its state back to "planned".
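
The kind of cross-check that report performs can be sketched roughly as follows. This is not the actual Netbox report code: the set of states that imply a PuppetDB entry, the helper name `missing_from_puppetdb`, and the sample data (including francium's state) are assumptions made for illustration.

```python
# Assumption: hosts in these Netbox states are expected to exist in PuppetDB.
# The real report's list of states may differ.
EXPECT_IN_PUPPETDB = {"staged", "active"}


def missing_from_puppetdb(netbox_states, puppetdb_hosts):
    """Return (host, state) pairs whose Netbox state implies a PuppetDB
    entry that is absent."""
    return sorted(
        (host, state)
        for host, state in netbox_states.items()
        if state in EXPECT_IN_PUPPETDB and host not in puppetdb_hosts
    )


# Illustrative data: htmldumper1001 is Staged in Netbox but never completed
# a Puppet run, so PuppetDB does not know about it.
netbox_states = {"htmldumper1001": "staged", "francium": "active"}
puppetdb_hosts = {"francium"}
print(missing_from_puppetdb(netbox_states, puppetdb_hosts))

# Moving the host back to "planned" clears the alert in this model.
netbox_states["htmldumper1001"] = "planned"
print(missing_from_puppetdb(netbox_states, puppetdb_hosts))
```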

Dzahn added a subscriber: Jclark-ctr.

Hi Chris, can this be fixed remotely? The host is in an odd state: it exists, but we can't SSH to it or use install_console either.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003301929_cmjohnson_251747_htmldumper1001_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['htmldumper1001.eqiad.wmnet']

and were ALL successful.

@Dzahn I reimaged and am now able to log in

cmjohnson@Bolts2 ~ % ssh htmldumper1001.eqiad.wmnet
Linux htmldumper1001 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26) x86_64
Debian GNU/Linux 10 (buster)
htmldumper1001 is a Host being setup for later application of a role (insetup)
The last Puppet run was at Mon Mar 30 21:06:42 UTC 2020 (20 minutes ago).
Last puppet commit: (290d2ed2a5) John Bond - cloude tlsproxy::envoy: remove ~ default
Debian GNU/Linux 10 auto-installed on Mon Mar 30 19:49:54 UTC 2020.
Last login: Mon Mar 30 20:14:26 2020 from 208.80.154.86

> @Dzahn I reimaged and am now able to log in

@Cmjohnson Yes, I can log in now. Thank you! :)

@ArielGlenn htmldumper1001 is now usable. Any idea what kind of role we want on it?

It should get role(dumps::web::htmldumps) like francium. It doesn't hurt to have multiple hosts with this role.

Change 585688 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: add role(dumps::web::htmldumps) to htmldumper1001

https://gerrit.wikimedia.org/r/585688

Change 585688 merged by Dzahn:
[operations/puppet@production] site: add role(dumps::web::htmldumps) to htmldumper1001

https://gerrit.wikimedia.org/r/585688

> It should get role(dumps::web::htmldumps)

Done. The role is applied, and Puppet ran with no failures.

Dzahn reassigned this task from Cmjohnson to ArielGlenn.

Change 659341 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: update comment on htmldumper1001

https://gerrit.wikimedia.org/r/659341

Change 659349 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] dumps: switch htmldumps server from francium to htmldumper1001

https://gerrit.wikimedia.org/r/659349

@ArielGlenn Hi! A task to decommission francium has been created, but I noticed francium still seems to be in the config, so I uploaded patches to actually switch over to htmldumper1001. I have no idea how easy or hard that switch is in practice, though. If moving it into production should have a separate task, I'm happy to make one.

Change 659341 merged by Dzahn:
[operations/puppet@production] site: update comment on htmldumper1001

https://gerrit.wikimedia.org/r/659341

Change 659349 merged by Dzahn:
[operations/puppet@production] dumps: switch htmldumps server from francium to htmldumper1001

https://gerrit.wikimedia.org/r/659349