
Phase out Nodepool from production
Closed, Resolved · Public

Description

All CI jobs have been migrated off Nodepool (T190097). This task is to decommission Nodepool from the infrastructure.

WMCS

  • delete leftover instances
  • delete images and snapshots
  • purge custom disk images from WMCS infrastructure
  • delete contintcloud WMCS project - T209644
  • disable nodepoolmanager user from LDAP - T217064
  • clean up router/firewall rules between production and wmcs - T215173

Prod / puppet

See also server decom T209642

  • archive operations/debs/nodepool
  • remove nodepool package from apt.wikimedia.org (kept around per operations)
  • clean up puppet
  • drop sudo rules
  • clean up passwords/tokens from private repository - done via T212230
  • remove firewall rules on contint1001/contint2001 (ferm should clean them up though)
  • drop database nodepooldb and user 'nodepool'@'10.64.16.155' on MySQL database (production-m5) - T212230

Misc

  • Delete Grafana boards
  • Purge Graphite metrics - T215172
  • Archive wiki documentation

Jenkins

Event Timeline

Restricted Application added a subscriber: Aklapper.
hashar updated the task description. Nov 13 2018, 2:29 PM

I have migrated the last jobs still using Nodepool. Ready to phase out Nodepool :-]

Change 473202 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Remove Nodepool / diskimage-builder material

https://gerrit.wikimedia.org/r/473202

Change 473202 merged by jenkins-bot:
[integration/config@master] Remove Nodepool / diskimage-builder material

https://gerrit.wikimedia.org/r/473202

Change 473824 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Clean Zuul config

https://gerrit.wikimedia.org/r/473824

Change 473827 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] Clean JJB config

https://gerrit.wikimedia.org/r/473827

Change 473824 merged by jenkins-bot:
[integration/config@master] Clean Zuul config

https://gerrit.wikimedia.org/r/473824

Change 473827 merged by jenkins-bot:
[integration/config@master] Clean JJB config

https://gerrit.wikimedia.org/r/473827

hashar updated the task description. Nov 15 2018, 8:07 PM

Change 473834 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] nodepool: labtestservices2003 is not used for testing

https://gerrit.wikimedia.org/r/473834

hashar updated the task description. Nov 15 2018, 8:57 PM

Change 473846 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] ci: stop monitoring zmq on Jenkins

https://gerrit.wikimedia.org/r/473846

hashar updated the task description. Nov 15 2018, 9:03 PM

Mentioned in SAL (#wikimedia-operations) [2018-11-15T21:05:38Z] <hashar> Stopped nodepool on labnodepool1001.eqiad.wmnet . Service is no more used. T209361 T209642

Mentioned in SAL (#wikimedia-operations) [2018-11-15T21:06:58Z] <hashar> Deleting Nodepool instances on contintcloud T209361

hashar updated the task description. Nov 15 2018, 9:12 PM
greg awarded a token. Nov 16 2018, 5:21 PM
hashar updated the task description. Nov 16 2018, 5:35 PM

Change 473834 merged by Andrew Bogott:
[operations/puppet@production] nodepool: labtestservices2003 is not used for testing

https://gerrit.wikimedia.org/r/473834

Mentioned in SAL (#wikimedia-releng) [2018-11-23T11:09:22Z] <hashar> Jenkins: removing plugins "Single Use Slave" and "Event Publisher (via ZMQ PUB SUB)". Were used for Nodepool | T209361

hashar updated the task description. Nov 23 2018, 11:09 AM
greg triaged this task as Normal priority. Nov 27 2018, 9:13 PM

Change 473846 merged by Andrew Bogott:
[operations/puppet@production] ci: stop monitoring zmq on Jenkins

https://gerrit.wikimedia.org/r/473846

hashar updated the task description. Dec 15 2018, 9:07 AM
hashar updated the task description.
hashar updated the task description. Dec 18 2018, 4:14 PM
hashar updated the task description. Dec 18 2018, 4:18 PM

Change 480546 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] admin: remove CI sudo rule for "nodepool"

https://gerrit.wikimedia.org/r/480546

Change 480659 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] nodepool is gone, no need to assign a cluster

https://gerrit.wikimedia.org/r/480659

Change 480659 merged by Andrew Bogott:
[operations/puppet@production] nodepool is gone, no need to assign a cluster

https://gerrit.wikimedia.org/r/480659

Change 480663 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] cumin: remove nodepool from misc-releng group

https://gerrit.wikimedia.org/r/480663

Change 480663 merged by CRusnov:
[operations/puppet@production] cumin: remove nodepool from misc-releng group

https://gerrit.wikimedia.org/r/480663

Change 480546 merged by Muehlenhoff:
[operations/puppet@production] admin: remove CI sudo rule for "nodepool"

https://gerrit.wikimedia.org/r/480546

hashar updated the task description. Dec 19 2018, 9:49 AM
hashar updated the task description. Dec 19 2018, 10:00 AM

Change 481201 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] contint: remove unused classes

https://gerrit.wikimedia.org/r/481201

Change 481201 merged by Dzahn:
[operations/puppet@production] contint: remove unused classes

https://gerrit.wikimedia.org/r/481201

hashar updated the task description. Feb 4 2019, 3:18 PM
hashar updated the task description. Feb 4 2019, 3:21 PM
hashar updated the task description. Feb 4 2019, 3:25 PM

Mentioned in SAL (#wikimedia-releng) [2019-02-04T15:25:43Z] <hashar> removed Jenkins user "nodepoolmanager" as well as related authorizations | T209361

hashar updated the task description. Feb 4 2019, 3:30 PM
hashar updated the task description. Feb 4 2019, 3:33 PM

I have filed subtasks for the other teams to act on :)

hashar changed the task status from Open to Stalled. Feb 4 2019, 3:35 PM

Change 488019 had a related patch set uploaded (by Hashar; owner: Hashar):
[operations/puppet@production] jenkins: stop purging nodepool agents config history

https://gerrit.wikimedia.org/r/488019

hashar updated the task description. Feb 5 2019, 1:01 PM

Change 488019 merged by Dzahn:
[operations/puppet@production] jenkins: stop purging nodepool agents config history

https://gerrit.wikimedia.org/r/488019

Mentioned in SAL (#wikimedia-operations) [2019-02-05T18:17:39Z] <mutante> contint1001/contint2001 -manually deleting crontab lines unpuppetized in gerrit:488019 (T209361)

hashar updated the task description. Feb 5 2019, 8:03 PM

I caught up with some more cleanup. Last thing to do is to disable the nodepoolmanager user in LDAP. I am not quite sure how to do that though.

hashar updated the task description. Feb 25 2019, 5:52 PM

I caught up with some more cleanup. Last thing to do is to disable the nodepoolmanager user in LDAP. I am not quite sure how to do that though.

Filed as subtask T217064

hashar closed this task as Resolved. Feb 26 2019, 10:45 PM
hashar updated the task description.

All cleanup tasks have been completed.

The only remaining item is disposing of the hardware, which is tracked in T209642.