(Need By: TBD) rack/setup/install mc10[37-54].eqiad.wmnet
Closed, ResolvedPublic

Description

This task will track the racking, setup, and OS installation of mc10[37-54].eqiad.wmnet

Hostname / Racking / Installation Details

Hostnames: mc10[37-54].eqiad.wmnet
Racking Proposal: 4x in A7, 4x in B2 or B4, 4x in C2, 4x in D4, and the remaining two can go in the other 10G racks. Please note that room in A7 is dependent on T272085#6805439
Networking/Subnet/VLAN/IP: 1x 10G production connection, 1x 1G mgmt connection
Partitioning/Raid: existing raid1-2dev.cfg applied to mc*, no change needed
OS Distro: buster
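
For reference, the bracket shorthand mc10[37-54] covers 18 hosts, which matches the 4+4+4+4+2 split in the racking proposal. A minimal sketch of that expansion follows; the helper name `expand_host_range` and the rack labels in `planned` are illustrative only, and the helper handles just the single hyphenated range form used in this task (not forms like mc103[78]).

```python
import re


def expand_host_range(pattern: str) -> list[str]:
    """Expand one bracketed numeric range, e.g. 'mc10[37-54].eqiad.wmnet'."""
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        # No hyphenated range found: treat the pattern as a single hostname.
        return [pattern]
    prefix, start, end, suffix = match.groups()
    width = len(start)  # keep any zero padding used in the range bounds
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(int(start), int(end) + 1)
    ]


hosts = expand_host_range("mc10[37-54].eqiad.wmnet")

# Host counts per rack as stated in the racking proposal above.
planned = {"A7": 4, "B2 or B4": 4, "C2": 4, "D4": 4, "other 10G racks": 2}

print(len(hosts), sum(planned.values()))
```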

Per host setup checklist

Each host should have its own setup checklist copied and pasted into the list below.

mc1037:

  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
    • end on-site specific steps
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1038:

  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
    • end on-site specific steps
  • - operations/puppet update - this should include updates to install_server dhcp and netboot, and site.pp role(insetup) or cp systems use role(insetup::nofirm).
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1039:

  • - receive in system on procurement task <enter task # here> & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1040:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1041:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1042:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1043:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1044:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1045:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1046:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1047:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1048:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1049:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1050:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1051:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1052:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1053:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

mc1054:

  • - receive in system on procurement task T272085 & in coupa
  • - rack system with proposed racking plan (see above) & update netbox (include all system info plus location, state of planned)
  • - bios/drac/serial setup/testing
  • - add mgmt dns (asset tag and hostname) and production dns entries in netbox, run cookbook sre.dns.netbox.
  • - network port setup via netbox, run homer to commit
  • - firmware updated - bios to 2.11.2, network updated to 21.80.x, idrac 5.00.00.00
  • - operations/puppet update
  • - OS installation & initial puppet run via wmf-auto-reimage or wmf-auto-reimage-host
  • - host state in netbox set to staged

Once the system(s) above have had all checkbox steps completed, this task can be resolved.

Event Timeline

RobH added a parent task: Unknown Object (Task).
RobH mentioned this in Unknown Object (Task).
RobH moved this task from Backlog to Racking Tasks on the ops-codfw board.
RobH edited projects, added ops-eqiad; removed ops-codfw.
RobH moved this task from Backlog to Racking Tasks on the ops-eqiad board.
RobH added a project: serviceops.
RobH unsubscribed.

Handing over to Chris to finish the 2 we have received

mc1037 r,A7 u41 P5 ID5352
mc1038 r,A7 u42 P19 ID5353

jijiki subscribed.

@Cmjohnson please let me know if I can help to put these 2 servers into production as soon as possible

jijiki raised the priority of this task from Medium to High.Mar 16 2021, 12:24 PM
Cmjohnson added subscribers: RobH, Cmjohnson.

@RobH mc1037 NIC cfg done (enabled PXE on the 10G, disabled on the 1GE), MAC BC:97:E1:E4:4B:30

mc1038 same and MAC is BC:97:E1:E4:27:30

@RobH after you set the 2 up can you assign this task back to @Jclark-ctr to finish the remainder please

Change 673137 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] mc103[78] install params

https://gerrit.wikimedia.org/r/673137

Change 673137 merged by RobH:
[operations/puppet@production] mc103[78] install params

https://gerrit.wikimedia.org/r/673137

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['mc1037.eqiad.wmnet', 'mc1038.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202103172210_robh_1618.log.

Completed auto-reimage of hosts:

['mc1037.eqiad.wmnet', 'mc1038.eqiad.wmnet']

and were ALL successful.

RobH updated the task description. (Show Details)

serviceops please be aware mc1037 and mc1038 are ready for your team to place into service. The rest are now assigned back to @Jclark-ctr for racking.

@RobH thank you! @Jclark-ctr, mc1039-mc1054 can be racked in Q4, unless we have more mc* victims. Thank you!

@jijiki While racking these hosts, I only have 2 available spots in D4. Will any of the ones in this rack be decommissioned soon?

@Cmjohnson I have finished racking these; the remaining ones I can do once I get space in A7 and D4.

name    rack_name  position  cable id  port
mc1041  B4         24        5010      24
mc1042  B4         25        5011      40
mc1043  B4         26        5373      39
mc1044  B4         19        5374      38
mc1045  C2         23        5372      36
mc1046  C2         24        3805      37
mc1047  C2         25        5252      38
mc1048  C2         26        5381      39
mc1049  C4         15        5382      44
mc1050  C4         16        5383      45
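
As a quick cross-check against the racking proposal, the rows above can be tallied per rack. This is only a sketch: the data is copied from the list above, and the variable names (`RACKING`, `per_rack`) are illustrative rather than part of any netbox tooling.

```python
from collections import Counter

# Columns: name, rack_name, position, cable id, port (copied from the list above).
RACKING = """\
mc1041 B4 24 5010 24
mc1042 B4 25 5011 40
mc1043 B4 26 5373 39
mc1044 B4 19 5374 38
mc1045 C2 23 5372 36
mc1046 C2 24 3805 37
mc1047 C2 25 5252 38
mc1048 C2 26 5381 39
mc1049 C4 15 5382 44
mc1050 C4 16 5383 45
"""

rows = [line.split() for line in RACKING.splitlines()]

# Count how many hosts landed in each rack.
per_rack = Counter(rack for _name, rack, *_rest in rows)

for rack, count in sorted(per_rack.items()):
    print(rack, count)
```

This gives four hosts each in B4 and C2, as proposed, with the two "remaining" hosts placed in C4.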

@jijiki While racking these hosts, I only have 2 available spots in D4. Will any of the ones in this rack be decommissioned soon?

D7: mc1033-mc1036 will be decommissioned in this round, but we must coordinate. I will ping you on the task when it is safe to decommission at least 2 of them and make room.

A7: To my knowledge, all mw1269-mw1279 will be decommissioned (after we finish with T279309), so we should hold on until part of this work is done. (@Dzahn can confirm)

Thank you!

Hi all, I just made the (so far missing) decom ticket [[T280203]] for mw1261 through mw1301. From the procurement date and ticket in netbox these are the ones that look to me like they have to go. It currently looks like we have to do installation of new servers and decom of old servers in parallel in small batches.

Cmjohnson updated the task description. (Show Details)

mc1041-50: the netbox and network port updates have been completed; I still need to go on-site and set up idrac

@Jclark-ctr Can you please cable mc1039 and mc1051-mc1054? I could not find mc1039, so please update netbox with its location as well.

@Cmjohnson @Jclark-ctr We would like to start putting those servers in production. Is it possible to update or complete any remaining actions for the racked servers?

@jijiki I am only held up on racking two servers, mc1039 and mc1040. Per the racking proposal, I am waiting for rack A7 to have 2 available spots. I do have 2 available spots in rack A2, so I can get these two servers in and hand them over to @Cmjohnson for configuration.

@jijiki If racking the remaining two in A2 instead of A7 works, they have been racked and can be configured by @Cmjohnson. Otherwise we will be waiting on T280203 for room in A7.

mc1039 A2, U14, port 32, cable ID 11025
mc1040 A2, U15, port 30, cable ID 11026
mc1051 D4, U22, port 13, cable ID 11027
mc1052 D4, U23, port 14, cable ID 11028
mc1053 D4, U31, port 15, cable ID 11029
mc1054 D4, U32, port 17, cable ID 11030

The only remaining step on most of these is the idrac setup. This will happen tomorrow (Friday, 6 Aug).

the on-site specific work has been completed

Change 711189 had a related patch set uploaded (by RobH; author: RobH):

[operations/puppet@production] new mc nodes install params

https://gerrit.wikimedia.org/r/711189

Change 711189 merged by RobH:

[operations/puppet@production] new mc nodes install params

https://gerrit.wikimedia.org/r/711189

Change 711191 had a related patch set uploaded (by RobH; author: RobH):

[operations/puppet@production] fixing entry for new mc host

https://gerrit.wikimedia.org/r/711191

Change 711191 merged by RobH:

[operations/puppet@production] fixing entry for new mc host

https://gerrit.wikimedia.org/r/711191

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['mc1039.eqiad.wmnet', 'mc1040.eqiad.wmnet', 'mc1041.eqiad.wmnet', 'mc1042.eqiad.wmnet', 'mc1043.eqiad.wmnet', 'mc1044.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202108101912_robh_23622.log.

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['mc1041.eqiad.wmnet', 'mc1042.eqiad.wmnet', 'mc1043.eqiad.wmnet', 'mc1044.eqiad.wmnet', 'mc1045.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202108102014_robh_6220.log.

Completed auto-reimage of hosts:

['mc1041.eqiad.wmnet', 'mc1042.eqiad.wmnet', 'mc1043.eqiad.wmnet', 'mc1044.eqiad.wmnet', 'mc1045.eqiad.wmnet']

and were ALL successful.

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['mc1046.eqiad.wmnet', 'mc1047.eqiad.wmnet', 'mc1048.eqiad.wmnet', 'mc1049.eqiad.wmnet', 'mc1050.eqiad.wmnet', 'mc1051.eqiad.wmnet', 'mc1052.eqiad.wmnet', 'mc1053.eqiad.wmnet', 'mc1054.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202108102117_robh_24888.log.

Completed auto-reimage of hosts:

['mc1046.eqiad.wmnet', 'mc1047.eqiad.wmnet', 'mc1048.eqiad.wmnet', 'mc1049.eqiad.wmnet', 'mc1050.eqiad.wmnet', 'mc1051.eqiad.wmnet', 'mc1052.eqiad.wmnet', 'mc1053.eqiad.wmnet', 'mc1054.eqiad.wmnet']

and were ALL successful.
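
To see which hosts have a "Completed auto-reimage" message in the comments shown here, the completion lists can be compared against the full mc10[37-54] range. This is a rough sketch built only from the messages quoted above; the variable names are illustrative, and a missing message does not by itself mean the reimage failed.

```python
# All hosts tracked by this task: mc1037 through mc1054.
expected = {f"mc{n}.eqiad.wmnet" for n in range(1037, 1055)}

# Hosts listed in the "Completed auto-reimage of hosts" messages above.
completed_batches = [
    ["mc1037.eqiad.wmnet", "mc1038.eqiad.wmnet"],
    [f"mc{n}.eqiad.wmnet" for n in range(1041, 1046)],
    [f"mc{n}.eqiad.wmnet" for n in range(1046, 1055)],
]
completed = {host for batch in completed_batches for host in batch}

# Hosts with no completion message in the comments shown here.
unconfirmed = sorted(expected - completed)
print(unconfirmed)
# ['mc1039.eqiad.wmnet', 'mc1040.eqiad.wmnet']
```

Both mc1039 and mc1040 appear in the first launch message from 10 August, so their completion was presumably recorded elsewhere, for example in the log file or the updated task description.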

RobH updated the task description. (Show Details)