(Need by: 2020-03-06) rack/setup/install logstash102[6-9].eqiad.wmnet
Closed, ResolvedPublic

Description

This task covers the racking/setup/installation of logstash102[6-9].eqiad.wmnet purchased via T238586.

Racking Details:
Please ensure the 4 new hosts are racked one per row (A, B, C, D). Because they are all 10G network (upgraded) hosts, unlike the existing logstash hosts, they need the row diversity. Please also try to keep them in different racks than the existing logstash hosts, but that is a lower priority than placing one new logstash host in each row. Existing hosts are in racks A3, B1, D1, A4, B4, and C5.
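The racking rule above can be checked mechanically before hardware is moved. The following is a minimal sketch, not part of any existing tooling: the function name `check_racking_plan`, the plan format, and the example rack names are all assumptions made for illustration. Only the list of existing racks comes from this task. It treats "one new host per row" as a hard rule and "avoid racks that already hold a logstash host" as a lower-priority warning.

```python
from collections import Counter

# Racks already occupied by existing logstash hosts, as listed in this task.
EXISTING_RACKS = {"A3", "B1", "D1", "A4", "B4", "C5"}
ROWS = ("A", "B", "C", "D")


def check_racking_plan(plan):
    """Check a {hostname: rack} plan (hypothetical format) against the rule.

    Returns (errors, warnings):
      errors   - rows that do not hold exactly one new host (hard rule)
      warnings - hosts sharing a rack with an existing logstash host
                 (lower-priority rule)
    """
    errors, warnings = [], []
    # The row is taken to be the first character of the rack name, e.g. "A" in "A3".
    per_row = Counter(rack[0] for rack in plan.values())
    for row in ROWS:
        if per_row[row] != 1:
            errors.append(f"row {row} has {per_row[row]} new hosts, expected 1")
    for host, rack in sorted(plan.items()):
        if rack in EXISTING_RACKS:
            warnings.append(f"{host} shares rack {rack} with an existing logstash host")
    return errors, warnings


# Hypothetical plan: these rack names are illustrative, not the actual placement.
example_plan = {
    "logstash1026": "A7",
    "logstash1027": "B4",
    "logstash1028": "C4",
    "logstash1029": "D7",
}
errors, warnings = check_racking_plan(example_plan)
print(errors)
print(warnings)
```

In this illustrative plan every row gets one host, so there are no errors, but the host placed in B4 produces a warning because B4 already holds an existing logstash host.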

logstash1026:

  • - receive in system on procurement task T238586
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - list network info on task if you do not have access to setup switch ports
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation
  • - service implementer changes from 'staged' status to 'active' status in netbox

logstash1027:

  • - receive in system on procurement task T238586
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - list network info on task if you do not have access to setup switch ports
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation
  • - service implementer changes from 'staged' status to 'active' status in netbox

logstash1028:

  • - receive in system on procurement task T238586
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - list network info on task if you do not have access to setup switch ports
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation
  • - service implementer changes from 'staged' status to 'active' status in netbox

logstash1029:

  • - receive in system on procurement task T238586
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - list network info on task if you do not have access to setup switch ports
  • - network port setup (description, enable, vlan)
    • end on-site specific steps
  • - production dns entries added
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation
  • - puppet accept/initial run (with role:spare)
  • - host state in netbox set to staged
  • - handoff for service implementation
  • - service implementer changes from 'staged' status to 'active' status in netbox

Event Timeline

RobH triaged this task as Medium priority. Dec 16 2019, 6:23 PM
RobH created this task.
Restricted Application added a project: SRE. Dec 16 2019, 6:23 PM
RobH added a parent task: Unknown Object (Task). Dec 16 2019, 6:24 PM
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
RobH moved this task from Backlog to Externally blocked on the Wikimedia-Logstash board.
RobH moved this task from Backlog to Acknowledged on the SRE board.
RobH updated the task description. Dec 16 2019, 6:26 PM
wiki_willy renamed this task from rack/setup/install logstash102[6-9].eqiad.wmnet to (No Need By Date) rack/setup/install logstash102[6-9].eqiad.wmnet. Jan 2 2020, 11:37 PM

Hi @wiki_willy, do you know what the ETA is for these hosts?

These are needed for OKR work this quarter and are related to the hosts in T240882 that were racked in January.

Thanks in advance!

@herron - is there a specific date that you need these by? We can adjust our priorities and the need by date of this task to meet that.

Thanks,
Willy

@wiki_willy this week or next would be ideal, if possible.

@herron - I'll add next week as the target date. We're in the middle of a couple of other installs, but we can try to prioritize this one afterwards. Thanks, Willy

wiki_willy renamed this task from (No Need By Date) rack/setup/install logstash102[6-9].eqiad.wmnet to (Need By March 6) rack/setup/install logstash102[6-9].eqiad.wmnet. Feb 24 2020, 4:04 PM
wiki_willy renamed this task from (Need By March 6) rack/setup/install logstash102[6-9].eqiad.wmnet to (Need by: 2020-03-06) rack/setup/install logstash102[6-9].eqiad.wmnet. Feb 24 2020, 8:25 PM
RobH removed a subscriber: RobH. Mar 3 2020, 6:01 PM

Change 576897 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding mgmt dns for logstash102[6-9]

https://gerrit.wikimedia.org/r/576897

Change 576897 abandoned by Cmjohnson:
Adding mgmt dns for logstash102[6-9]

https://gerrit.wikimedia.org/r/576897

Change 576909 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Adding mgmt dns for logstash102[6-9]

https://gerrit.wikimedia.org/r/576909

Change 576909 merged by Cmjohnson:
[operations/dns@master] Adding mgmt dns for logstash102[6-9]

https://gerrit.wikimedia.org/r/576909

Cmjohnson updated the task description. Mar 4 2020, 9:16 PM
Cmjohnson updated the task description. Mar 5 2020, 5:19 PM

@herron do these need to be 10G? I racked them today in 1G racks.

Change 577337 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Add production dns entries for logstash102[6-9]

https://gerrit.wikimedia.org/r/577337

Change 577337 merged by Cmjohnson:
[operations/dns@master] Add production dns entries for logstash102[6-9]

https://gerrit.wikimedia.org/r/577337

Change 577650 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Finishing mgmt entries for logstash1029

https://gerrit.wikimedia.org/r/577650

Change 577650 merged by Cmjohnson:
[operations/dns@master] Finishing mgmt entries for logstash1029

https://gerrit.wikimedia.org/r/577650

Change 577685 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] Add dhcpd/netboot.cfg and site.pp (role::spare) logstash102[6-9]

https://gerrit.wikimedia.org/r/577685

Change 577685 merged by Cmjohnson:
[operations/puppet@production] Add dhcpd/netboot.cfg and site.pp (role::spare) logstash102[6-9]

https://gerrit.wikimedia.org/r/577685

Cmjohnson updated the task description. Mar 6 2020, 11:40 PM

Moved these to 10G racks today, updated all the network ports, and did the operations/puppet updates.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

logstash1027.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003101143_cmjohnson_199547_logstash1027_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

logstash1028.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003101144_cmjohnson_199713_logstash1028_eqiad_wmnet.log.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

logstash1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003101148_cmjohnson_200098_logstash1029_eqiad_wmnet.log.

Cmjohnson updated the task description. Mar 10 2020, 12:28 PM

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

logstash1026.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003101252_cmjohnson_211610_logstash1026_eqiad_wmnet.log.

Cmjohnson reassigned this task from Jclark-ctr to herron. Mar 10 2020, 1:15 PM
Cmjohnson updated the task description.
Cmjohnson removed a project: ops-eqiad.
Cmjohnson added a subscriber: Jclark-ctr.

@herron these servers are all yours. I have already added them to role spare in site.pp. I am removing the ops-eqiad tag and assigning the task to you.

Dzahn added a subscriber: Dzahn. Mar 10 2020, 9:54 PM

logstash1029 seems to be different from the other servers and has problems.

It shows Icinga alerts and can't be reached over ssh.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

logstash1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003102322_cmjohnson_59915_logstash1029_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['logstash1029.eqiad.wmnet']

and were ALL successful.

I did a reimage of logstash1029; everything appears normal now.

Yep, that fixed it. logstash1029 is now all green in Icinga. Thanks.

herron closed this task as Resolved. Mar 11 2020, 12:27 AM

Thanks @Cmjohnson! Will resolve this and track service setup in T247376

Change 579019 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: let new logstash machines use role(insetup)

https://gerrit.wikimedia.org/r/579019

Change 579019 merged by Dzahn:
[operations/puppet@production] site: let new logstash machines use role(insetup)

https://gerrit.wikimedia.org/r/579019