Let's upgrade parsercache to Bullseye
pc1:
- pc2014
- pc2011
- pc1014 (floating host)
- pc1011
pc2:
- pc2012
- pc1012
pc3:
- pc2013
- pc1013
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T291916 Tracking task for Bullseye migrations in production
Resolved | | Marostegui | T298585 Upgrade WMF database-and-backup-related hosts to bullseye
Resolved | | Marostegui | T299046 Upgrade parsercache infra to Bullseye
I haven't seen anything performance-relevant on pc1011, so I think it is OK to go ahead and migrate our parsercache infra to Bullseye.
Change 753874 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc2012: Disable notifications
Change 753874 merged by Marostegui:
[operations/puppet@production] pc2012: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc2012.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc2012.codfw.wmnet with OS bullseye completed:
Change 753912 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc2013: Disable notifications
Change 753912 merged by Marostegui:
[operations/puppet@production] pc2013: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc2013.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc2013.codfw.wmnet with OS bullseye completed:
Mentioned in SAL (#wikimedia-operations) [2022-01-14T09:11:17Z] <marostegui> Move pc1014 from pc1 to pc2 T299046
Change 753943 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc2011: Disable notifications
Change 753943 merged by Marostegui:
[operations/puppet@production] pc2011: Disable notifications
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc2011.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc2011.codfw.wmnet with OS bullseye completed:
Change 754784 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc1014: Disable notifications
Change 754784 merged by Marostegui:
[operations/puppet@production] pc1014: Disable notifications
Change 754805 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc1014: Move it to pc2
Change 754805 merged by Marostegui:
[operations/puppet@production] pc1014: Move it to pc2
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc1014.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc1014.eqiad.wmnet with OS bullseye executed with errors:
I have finished this reimage manually, but I am going to run it again to see why it failed. It errored out quickly at:
Running Puppet with args --quiet --attempts 30 on 1 hosts: alert1001.wikimedia.org
----- OUTPUT of 'run-puppet-agent...et --attempts 30' -----
================
PASS | | 0% (0/1) [00:09<?, ?hosts/s]
FAIL |███████████████████████████████████████████████████████████████████████████████| 100% (1/1) [00:09<00:00, 9.41s/hosts]
100.0% (1/1) of nodes failed to execute command 'run-puppet-agent...et --attempts 30': alert1001.wikimedia.org
0.0% (0/1) success ratio (< 100.0% threshold) for command: 'run-puppet-agent...et --attempts 30'. Aborting.
0.0% (0/1) success ratio (< 100.0% threshold) of nodes successfully executed all commands. Aborting.
Exception raised while executing cookbook sre.hosts.reimage:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/_menu.py", line 234, in run
    raw_ret = runner.run()
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/reimage.py", line 466, in run
    self.downtime.get_runner(self.downtime.argument_parser().parse_args(downtime_args)).run()
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/downtime.py", line 119, in run
    self.puppet.run(quiet=True, attempts=30)
  File "/usr/lib/python3/dist-packages/spicerack/puppet.py", line 193, in run
    self._remote_hosts.run_sync(Command(command, timeout=timeout), batch_size=batch_size)
  File "/usr/lib/python3/dist-packages/spicerack/remote.py", line 528, in run_sync
    print_progress_bars=print_progress_bars,
  File "/usr/lib/python3/dist-packages/spicerack/remote.py", line 720, in _execute
    raise RemoteExecutionError(ret, "Cumin execution failed")
spicerack.remote.RemoteExecutionError: Cumin execution failed (exit_code=2)
**The reimage failed, see the cookbook logs for the details**
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc1014.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc1014.eqiad.wmnet with OS bullseye executed with errors:
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc1014.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc1014.eqiad.wmnet with OS bullseye completed:
Change 754864 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/mediawiki-config@master] ProductionServices.php: Promote pc1014 to pc2 master
Change 754865 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] mariadb: Promote pc1014 to pc2 master
Change 754865 merged by Marostegui:
[operations/puppet@production] mariadb: Promote pc1014 to pc2 master
Change 754864 merged by jenkins-bot:
[operations/mediawiki-config@master] ProductionServices.php: Promote pc1014 to pc2 master
Mentioned in SAL (#wikimedia-operations) [2022-01-18T08:30:28Z] <marostegui@deploy1002> Synchronized wmf-config/ProductionServices.php: Promote pc1014 to master in pc2 T299046 (duration: 00m 51s)
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc1012.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc1012.eqiad.wmnet with OS bullseye completed:
pc1012 has been reimaged. I have configured it to replicate from pc1014 so that it gets all the keys that were inserted during its reimage.
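For reference, the order of operations used for a section master in this task (floating host moved in, promoted, old master reimaged, replication catch-up, promotion reverted) can be written down as a small Python sketch. It is only an illustration of the sequence seen in the log entries here: `MasterSwap` and `plan_steps` are hypothetical names, nothing below talks to real hosts, and the actual work was done through puppet patches, `ProductionServices.php` changes and the reimage cookbook.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MasterSwap:
    """One reimage of a section master, covered by a floating host.

    Hypothetical helper type for illustration only.
    """
    section: str        # parsercache section, e.g. "pc2"
    master: str         # host being reimaged, e.g. "pc1012"
    floating_host: str  # temporary stand-in, e.g. "pc1014"


def plan_steps(swap: MasterSwap) -> List[str]:
    """Return the ordered, human-readable steps for one master swap."""
    return [
        f"move {swap.floating_host} into {swap.section}",
        f"promote {swap.floating_host} to {swap.section} master",
        f"reimage {swap.master}",
        f"replicate {swap.master} from {swap.floating_host} to catch up on missed keys",
        f"revert promotion: {swap.master} is {swap.section} master again",
    ]


if __name__ == "__main__":
    # Print the plan for the pc2 swap described in this task.
    for step in plan_steps(MasterSwap("pc2", "pc1012", "pc1014")):
        print(step)
```

The same plan applies to another section by changing the three fields, for example `MasterSwap("pc3", "pc1013", "pc1014")`.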
Mentioned in SAL (#wikimedia-operations) [2022-01-18T09:59:39Z] <marostegui@deploy1002> Synchronized wmf-config/ProductionServices.php: Revert: Promote pc1014 to master in pc2 T299046 (duration: 00m 50s)
Mentioned in SAL (#wikimedia-operations) [2022-01-18T10:00:45Z] <marostegui> Move pc1014 to pc3 T299046
Change 754871 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc1014: Move to pc3
Change 754871 merged by Marostegui:
[operations/puppet@production] pc1014: Move to pc3
Change 756873 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/mediawiki-config@master] ProductionServices.php: Promote pc1014 to pc3 master
Change 756874 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc1013: Disable notifications
Change 756873 merged by jenkins-bot:
[operations/mediawiki-config@master] ProductionServices.php: Promote pc1014 to pc3 master
Change 756874 merged by Marostegui:
[operations/puppet@production] pc1013: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-01-25T06:26:50Z] <marostegui@deploy1002> Synchronized wmf-config/ProductionServices.php: Promote pc1014 to master in pc3 T299046 (duration: 00m 49s)
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host pc1013.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host pc1013.eqiad.wmnet with OS bullseye completed:
Mentioned in SAL (#wikimedia-operations) [2022-01-25T08:25:39Z] <marostegui@deploy1002> Synchronized wmf-config/ProductionServices.php: Revert: Promote pc1013 to master in pc3 T299046 (duration: 00m 49s)
Change 756947 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] pc1014: Move it to pc1
Change 756947 merged by Marostegui:
[operations/puppet@production] pc1014: Move it to pc1