rack/setup/install mw2259-mw2290
Closed, Resolved; Public

Description

This task will track the receiving/racking/setup/installation of 32 new mw systems.

Hostnames: mw systems simply continue the sequence. The last mw host was mw2258, so these start at mw2259.
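
As an illustration only, the hostname range for this batch could be enumerated as below. This is a minimal sketch that assumes the numbering is strictly sequential with no gaps; the helper name `mw_hostnames` is hypothetical and not part of any existing tooling.

```python
def mw_hostnames(first, count, prefix="mw"):
    """Return `count` sequential hostnames, starting at prefix + first."""
    return [f"{prefix}{n}" for n in range(first, first + count)]


# The last existing host was mw2258, so the 32 new systems start at mw2259.
hosts = mw_hostnames(2259, 32)
print(hosts[0], hosts[-1], len(hosts))
```

With 32 systems starting at mw2259, the last hostname works out to mw2290, matching the task title.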

Racking Proposal: mw systems in codfw have been racked in the #3 and #4 racks in each row. Presently, there is a bit of space in A3 and A4, a small amount in B3, row C looks full, and LOTS of room in D3 and D4. Please split them as follows:

rack  systems
a3    5
a4    5
b3    2
d3    10
d4    10
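
As a quick sanity check on the split, the per-rack counts can be summed and compared with the 32 systems tracked here. This is a minimal sketch, reading the table as a3: 5, a4: 5, b3: 2, d3: 10, d4: 10; the names `proposal` and `per_row` are illustrative only.

```python
from collections import Counter

EXPECTED = 32  # number of new mw systems in this task

# Proposed number of systems per rack, as read from the table above.
proposal = {"A3": 5, "A4": 5, "B3": 2, "D3": 10, "D4": 10}

total = sum(proposal.values())

# Group the per-rack counts by row (the first character of the rack name).
per_row = Counter()
for rack, count in proposal.items():
    per_row[rack[0]] += count

print("total:", total, "matches expected:", total == EXPECTED)
print("per row:", dict(per_row))
```

Under this reading the split accounts for all 32 systems, with 10 in row A, 2 in row B and 20 in row D.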

mw2259:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2260:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2261:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2262:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2263:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2264:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2265:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2266:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2267:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2268:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2269:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2270:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2271:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2272:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2273:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2274:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2275:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2276:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2277:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2278:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2279:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2280:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2281:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2282:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2283:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2284:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2285:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2286:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2287:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2288:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2289:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

mw2290:

  • - receive in system on procurement task T183404
  • - rack system with proposed racking plan (see above) & update racktables (include all system info plus location)
  • - bios/drac/serial setup/testing
  • - mgmt dns entries added for both asset tag and hostname
  • - network port setup (description, enable, private vlan)
    • end on-site specific steps
  • - production dns entries added (private subnet)
  • - operations/puppet update (install_server at minimum, other files if possible)
  • - OS installation (stretch)
  • - puppet accept/initial run
  • - handoff for service implementation

Related Objects

Status    Assigned
Resolved  Joe
Resolved  Papaul
Resolved  RobH

Event Timeline

RobH triaged this task as Medium priority. Feb 26 2018, 7:23 PM
RobH created this task.

Before these are racked, I'd like someone to review my racking proposal:

Racking Proposal: mw systems in codfw have been racked in the #3 and #4 racks in each row. Presently, there is a bit of space in A3 and A4, a small amount in B3, row C looks full, and LOTS of room in D3 and D4. Please split them as follows:

rack  systems
a3    5
a4    5
b3    2
d3    10
d4    10

@RobH since I have rack space to cover in B3 (9-17), what about not putting anything in A3 and putting 7 hosts in B3? See below:

rack  systems
A4    5
B3    7
D3    10
D4    10

I'm fine with that, it just seemed you had less space in row B. If you can fit the cables/power overhead, that is fine with me! Just update the task description to reflect what you are doing =]

This is the current layout of our mw codfw servers:

             A   B        C   D
appserver    20  25 (20)  37  0
api          12  28 (15)  15  0
videoscaler  1   2 (2)    1   0
jobrunner    12  0        5   0

The figures in parentheses are the hosts in mw[2090-2134] that need to be decommissioned. If we remove them, the layout would look like this:

             A   B   C   D
appserver    20  5   37  0
api          12  13  15  0
videoscaler  1   0   1   0
jobrunner    12  0   5   0

The above chart doesn't count the hosts to be racked in this task.
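
The row B arithmetic can be worked through in a few lines. This is only a sketch, reading the chart as 25, 28 and 2 hosts in row B for appserver, api and videoscaler, of which 20, 15 and 2 are to be decommissioned; the names `row_b`, `remaining` and `freed` are illustrative.

```python
# Row B counts as read from the chart: (current hosts, hosts to decommission).
row_b = {
    "appserver": (25, 20),
    "api": (28, 15),
    "videoscaler": (2, 2),
}

# Hosts left in row B per cluster once the old machines are gone.
remaining = {cluster: current - decom for cluster, (current, decom) in row_b.items()}

# Total number of row B slots that decommissioning would free up.
freed = sum(decom for _, decom in row_b.values())

print(remaining)
print("slots freed in row B:", freed)
```

Under that reading, decommissioning frees 37 slots in row B, which is more than the 32 systems tracked in this task.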

My racking recommendation would be to put the new servers in row B in place of the ones we're decommissioning.

That would maintain a better balance between clusters and racks/rows.

I can work on logically decommissioning said servers today if needed.

+1 FWIW

All the servers were racked and wired before leaving Dallas. I will have to unrack and re-wire them.

Let's say for now I'll just proceed with decommissioning the old machines, then we can reassess the situation.

Change 418908 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Comment out mw2270 for now from scap_proxies

https://gerrit.wikimedia.org/r/418908

Change 418908 merged by Alexandros Kosiaris:
[operations/puppet@production] Comment out mw2270 for now from scap_proxies

https://gerrit.wikimedia.org/r/418908

@Papaul I would move the servers you put in row A to row B after you decommission the old servers in B3, if that works for you.

Else, I'll try to reshuffle things when we reinstall the fleet to stretch later this year.

@Joe OK.
For now I have 5 new servers in A4 and 7 new servers in B3, so after moving all the new servers in A4 to B3, B3 will have a total of 12 new servers.
Does that work for you?

Yes, I think it works well, thanks!

Change 420372 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Add mgmt DNS entries for mw2259-mw2290

https://gerrit.wikimedia.org/r/420372

Change 420372 merged by Dzahn:
[operations/dns@master] DNS: Add mgmt DNS entries for mw2259-mw2290

https://gerrit.wikimedia.org/r/420372

Change 420425 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/dns@master] DNS: Add production DNS entries for mw2259-mw2290

https://gerrit.wikimedia.org/r/420425

@Joe @MoritzMuehlenhoff the last mw server, mw2258, has Jessie installed on it. Are we doing Stretch on the new ones, or do we keep installing Jessie?

Given that we're about to start reimaging the app servers to stretch in the first half of April, and that we won't have a switchover until then, I'd say let's go with stretch right away. Unless Giuseppe disagrees?

Change 420425 merged by Dzahn:
[operations/dns@master] DNS: Add production DNS entries for mw2259-mw2290

https://gerrit.wikimedia.org/r/420425

Change 420804 had a related patch set uploaded (by Papaul; owner: Papaul):
[operations/puppet@production] DHCP: Add MAC address entries for mw2259-mw2290

https://gerrit.wikimedia.org/r/420804

Change 420804 merged by Dzahn:
[operations/puppet@production] DHCP: Add MAC address entries for mw2259-mw2290

https://gerrit.wikimedia.org/r/420804

Change 420905 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] site: add mw2259 thru mw2290 with role spare

https://gerrit.wikimedia.org/r/420905

Change 420905 merged by Dzahn:
[operations/puppet@production] site: add mw2259-mw2290 with role spare, per row

https://gerrit.wikimedia.org/r/420905

mw2259 through mw2290 have all been installed by Papaul, and I then added them to site with role::spare, already sectioned by row; see the change above.

All puppet certs have been signed and the hosts have had their first runs, with a single exception: mw2267 had an issue and isn't fully done yet.

Dzahn updated the task description.
Papaul updated the task description.

This is the current layout of our mw codfw servers:

             A   B        C   D
appserver    20  25 (20)  37  0
api          12  28 (15)  15  0
videoscaler  1   2 (2)    1   0
jobrunner    12  0        5   0

The figures in parentheses are the hosts in mw[2090-2134] that need to be decommissioned. If we remove them, the layout would look like this:

             A   B   C   D
appserver    20  5   37  0
api          12  13  15  0
videoscaler  1   0   1   0
jobrunner    12  0   5   0

We are adding 20 servers in row D, and 12 in row B.

My proposal would be:

  • add 8 servers in row D to the API cluster, and 2 in row B (so the total number of API servers in codfw would be roughly equal to that in eqiad)
  • add 5 servers in row D to the jobrunners, and 5 in row B, so that they're more evenly balanced
  • do not add videoscalers if we want to unify the two clusters
  • add the remainder (7 servers in row D, 7 servers in row B) to the appservers

Then we can think about redistributing the row C machines more evenly.

The final distribution would become

             A   B   C   D
appserver    20  12  37  7
api          12  15  15  8
videoscaler  1   0   1   0
jobrunner    12  5   5   5
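
The proposal can be replayed as simple arithmetic on the post-decommission layout. This is a sketch under stated assumptions: each flattened chart line is read as one count per row A to D, the per-cluster additions are taken from the bullet list, and the helper names `apply_additions` and `allocated_per_row` are hypothetical.

```python
ROWS = ("A", "B", "C", "D")

# Layout after decommissioning, as read from the chart (assumed parsing).
layout = {
    "appserver":   {"A": 20, "B": 5,  "C": 37, "D": 0},
    "api":         {"A": 12, "B": 13, "C": 15, "D": 0},
    "videoscaler": {"A": 1,  "B": 0,  "C": 1,  "D": 0},
    "jobrunner":   {"A": 12, "B": 0,  "C": 5,  "D": 0},
}

# Servers added per cluster and row, taken from the bullet list.
additions = {
    "api":       {"B": 2, "D": 8},
    "jobrunner": {"B": 5, "D": 5},
    "appserver": {"B": 7, "D": 7},
}

# New servers stated as available per row ("20 in row D, and 12 in row B").
available = {"B": 12, "D": 20}


def apply_additions(layout, additions):
    """Return a copy of `layout` with the per-row additions applied."""
    result = {cluster: dict(rows) for cluster, rows in layout.items()}
    for cluster, per_row in additions.items():
        for row, count in per_row.items():
            result[cluster][row] += count
    return result


def allocated_per_row(additions):
    """Sum the additions across clusters, keyed by row."""
    totals = {}
    for per_row in additions.values():
        for row, count in per_row.items():
            totals[row] = totals.get(row, 0) + count
    return totals


final = apply_additions(layout, additions)
allocated = allocated_per_row(additions)

for cluster, rows in final.items():
    print(cluster, [rows[r] for r in ROWS])
for row, count in available.items():
    print(f"row {row}: allocated {allocated[row]} of {count} new servers")
```

Replaying the bullets reproduces the distribution shown here, but it also shows that the bullets allocate 14 servers to row B while only 12 are being added there, which fits the later remark in this thread that some of these figures were wrong.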

At a later time, we might also want to redistribute servers a bit more; specifically, I'd move some of the row C appservers to the api cluster.

Since I'm still not sure we're actually going to merge the videoscalers into the jobrunners soon, I think I'll re-add two videoscalers in row B, in place of 2 appservers.

Change 420990 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] codfw: assign roles to the new appservers

https://gerrit.wikimedia.org/r/420990

Some of the figures above were wrong, so here it is again, this time correctly counted from site.pp:

role         A   B   C   D
appserver    19  8   37  7
api          12  15  15  8
videoscaler  1   2   1   0
jobrunner    5   5   10  5

Apart from the appservers, it seems well balanced out.

Change 420990 merged by Giuseppe Lavagetto:
[operations/puppet@production] codfw: assign roles to the new appservers

https://gerrit.wikimedia.org/r/420990