Reduce the number of appservers we're using in eqiad
Closed, Resolved (Public)

Description

We must decommission more of the older appservers in order to:

  • Free rack space in eqiad
  • Retire hardware that is more than 5 years old
  • Test running with a number of cores and an amount of memory equal to what we have in codfw

Event Timeline

Joe created this task. Feb 8 2016, 5:47 PM
Joe claimed this task.
Joe raised the priority of this task from to Normal.
Joe updated the task description.
Joe added subscribers: jcrespo, gerritbot, Joe and 2 others.
Joe renamed this task from "Reduce the number of appservers we're using in eqiad preparing for decommission" to "Reduce the number of appservers we're using in eqiad". Feb 8 2016, 5:54 PM
Joe set Security to None.
Joe added a subscriber: ori.
Johsthao closed this task as a duplicate of T126250: <spam>. Feb 8 2016, 6:24 PM
matmarex reopened this task as Open. Feb 8 2016, 6:32 PM
Joe added a comment. Feb 9 2016, 2:50 PM

I depooled mw1025-1050 for now, setting all of them to 'inactive'.

I'll wait until tomorrow to merge the patches that make this permanent.

Joe added a comment. Feb 10 2016, 11:24 AM

I just depooled mw1051-69 as well; the cluster still seems unimpressed...

Change 275374 had a related patch set uploaded (by Giuseppe Lavagetto):
appservers: decommission permanently mw1026-69

https://gerrit.wikimedia.org/r/275374

Change 275374 merged by Giuseppe Lavagetto:
appservers: decommission permanently mw1026-69

https://gerrit.wikimedia.org/r/275374

Joe added a comment. Mar 7 2016, 10:01 AM

I have removed every reference to mw1026-1069 from puppet and conftool, and shut down the machines. I'm also opening a separate ticket for decommissioning.

Change 275383 had a related patch set uploaded (by Giuseppe Lavagetto):
scap: remove decommissioned appservers from the scap dsh group

https://gerrit.wikimedia.org/r/275383

Change 275383 merged by Giuseppe Lavagetto:
scap: remove decommissioned appservers from the scap dsh group

https://gerrit.wikimedia.org/r/275383

Joe changed the task status from Open to Stalled. Mar 7 2016, 10:38 AM
Joe added a comment. Mar 7 2016, 10:42 AM

I think we can reduce the pool size further, but it's already smaller than the current pool in codfw.

Change 275756 had a related patch set uploaded (by Giuseppe Lavagetto):
Remove decommissioned appservers

https://gerrit.wikimedia.org/r/275756

Change 275756 abandoned by Giuseppe Lavagetto:
Remove decommissioned appservers

Reason:
Already done in I040c2e27b750ea1906b989b2380a10bbd23f7906

https://gerrit.wikimedia.org/r/275756

Joe changed the task status from Stalled to Open. Apr 1 2016, 6:41 AM

I will adjust the weights in the various clusters and start removing more servers today, up to the point where I no longer feel comfortable removing more.

I want to bring all the mw* clusters to an average utilization of at least around 20%.
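As a rough illustration of the sizing reasoning behind a utilization target like this, here is a minimal sketch. It assumes the total load stays constant and is spread evenly across the pool, which ignores per-server weights; the function name and the example numbers are hypothetical and are not taken from this task.

```python
def pool_size_for_target(current_servers: int, current_util_pct: int, target_util_pct: int) -> int:
    """Largest pool size whose average utilization is still at least the target.

    Assumes (hypothetically) a constant total load shared evenly by all servers.
    Utilization is given in whole percent so the arithmetic stays in integers.
    """
    if current_servers <= 0 or target_util_pct <= 0:
        raise ValueError("server count and target utilization must be positive")
    # Total load expressed in "server-percent" units.
    total_load = current_servers * current_util_pct
    # Rounding down keeps the resulting average at or above the target.
    return total_load // target_util_pct


# Hypothetical example: 100 servers averaging 12% utilization, target 20%.
remaining = pool_size_for_target(100, 12, 20)
removable = 100 - remaining
```

With these made-up inputs, `remaining` is the pool size to shrink to and `removable` is how many servers could be depooled before the average reaches the target.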

Mentioned in SAL [2016-04-01T07:00:27Z] <_joe_> depooling mw1070-89 from the appserver cluster. T126242

Joe added a comment. Apr 18 2016, 9:25 AM

I am waiting until we switch MediaWiki back from codfw before I permanently decommission the last batch of appservers I removed.

Change 285604 had a related patch set uploaded (by Giuseppe Lavagetto):
mediawiki: remove decommissioned appservers

https://gerrit.wikimedia.org/r/285604

Change 285605 had a related patch set uploaded (by Giuseppe Lavagetto):
dhcp: remove entries for decommissioned appservers

https://gerrit.wikimedia.org/r/285605

Mentioned in SAL [2016-04-27T08:40:50Z] <_joe_> stopping puppet on mw10[7-8][0-9] and mw112[1-9]/mw1130 for T126242

Change 285604 merged by Giuseppe Lavagetto:
mediawiki: remove decommissioned appservers

https://gerrit.wikimedia.org/r/285604

Mentioned in SAL [2016-04-27T09:56:30Z] <_joe_> clean puppet certs and facts on mw10[7-8][0-9] and mw112[1-9]/mw1130 for T126242

Mentioned in SAL [2016-04-27T10:01:34Z] <_joe_> shutting down mw10[7-8][0-9] and mw112[1-9]/mw1130 for T126242
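The bracketed host patterns in the SAL entries above, such as mw10[7-8][0-9], can be expanded into explicit host names. The following is a minimal sketch, assuming each bracket holds a single-digit range; the `expand` helper is hypothetical and is not part of any tool mentioned in this task.

```python
import itertools
import re


def expand(pattern: str) -> list[str]:
    """Expand single-digit ranges like [7-8] into every matching host name."""
    # Split into literal text and bracketed ranges; the capture group keeps the brackets.
    parts = re.split(r"(\[\d-\d\])", pattern)
    choices = []
    for part in parts:
        match = re.fullmatch(r"\[(\d)-(\d)\]", part)
        if match:
            low, high = int(match.group(1)), int(match.group(2))
            choices.append([str(digit) for digit in range(low, high + 1)])
        else:
            # Literal text (possibly empty) contributes exactly one choice.
            choices.append([part])
    return ["".join(combo) for combo in itertools.product(*choices)]


# The three patterns used in the log entries above.
hosts = expand("mw10[7-8][0-9]") + expand("mw112[1-9]") + expand("mw1130")
```

Here `hosts` covers mw1070 through mw1089, mw1121 through mw1129, and mw1130.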

I think I can close this task as resolved; the subtasks aren't real blockers, they are more like "related tickets".

Joe closed this task as Resolved. Apr 27 2016, 10:38 AM

Change 285605 merged by Cmjohnson:
dhcp: remove entries for decommissioned appservers

https://gerrit.wikimedia.org/r/285605