Much of our Restbase infrastructure is still running on stretch. We need to migrate to buster as soon as possible, ideally while establishing a process we can reuse for the upcoming upgrades to bullseye.
Initial work:
- Investigate whether we can reuse the data-persistence work to migrate Restbase nodes without wiping their disks
- If not, automate the decommissioning of Restbase nodes while we reimage them (via a cookbook? possibly overkill)
- Ensure that our reimaging process will allow us to rejoin nodes that have been reimaged without issue (a decommission will stop nodes in the cluster talking to this
Process:
Reimage: sudo cookbook sre.hosts.reimage --os buster -t T295375 $HOSTNAME -c
Fix permissions after reimage (check by hand first, as some GIDs vary between hosts): sudo find /srv/ -user envoy -exec chown cassandra:cassandra {} \;
Workarounds for T300177:
- sudo -u deploy-service /usr/bin/scap deploy-local --repo cassandra/twcs
- sudo -u deploy-service /usr/bin/scap deploy-local --repo restbase/deploy
- sudo -u deploy-service /usr/bin/scap deploy-local --repo cassandra/logstash-logback-encoder
Enable Cassandra: sudo touch /etc/cassandra-{a,b,c}/service-enabled && for i in a b c; do sudo service cassandra-${i} start; done
Once the host has rejoined the cluster (check compactions and any instance-data checks failing on the cluster), repool it from the puppetmaster: sudo confctl select name=HOSTNAME set/pooled=yes
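The per-host steps above can be collected into a small dry-run helper. This is only a sketch: `plan_reimage` and `ownership_summary` are hypothetical names, not existing tooling. `plan_reimage` prints the commands from this task for a given host without running anything, and `ownership_summary` (which assumes GNU find for `-printf`) lists the owner:group pairs under a directory to help with the manual ownership check before the chown.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the per-host commands from this task
# in order. It does not execute any of them.
plan_reimage() {
    local host="$1"
    if [ -z "$host" ]; then
        echo "usage: plan_reimage HOSTNAME" >&2
        return 1
    fi

    # Reimage to buster
    echo "sudo cookbook sre.hosts.reimage --os buster -t T295375 ${host} -c"

    # Fix ownership left over from the previous install
    echo "sudo find /srv/ -user envoy -exec chown cassandra:cassandra {} \\;"

    # Workarounds for T300177
    local repo
    for repo in cassandra/twcs restbase/deploy cassandra/logstash-logback-encoder; do
        echo "sudo -u deploy-service /usr/bin/scap deploy-local --repo ${repo}"
    done

    # Enable and start the three Cassandra instances
    echo "sudo touch /etc/cassandra-{a,b,c}/service-enabled"
    local i
    for i in a b c; do
        echo "sudo service cassandra-${i} start"
    done

    # Repool (run on the puppetmaster once the host has rejoined)
    echo "sudo confctl select name=${host} set/pooled=yes"
}

# Hypothetical helper: count owner:group pairs under a directory
# (defaults to /srv/). Requires GNU find for -printf.
ownership_summary() {
    find "${1:-/srv/}" -printf '%u:%g\n' | sort | uniq -c | sort -rn
}
```

For example, `plan_reimage restbase1016.eqiad.wmnet` prints the sequence for that host so it can be reviewed and pasted step by step, and `sudo bash -c "$(declare -f ownership_summary); ownership_summary /srv/"` is one way to run the ownership check with enough privileges to read everything under /srv/.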
Host migration:
Hosts marked with * are affected by T299652 and require a BIOS upgrade before they can be reimaged
- restbase1016.eqiad.wmnet
- restbase1017.eqiad.wmnet
- restbase1018.eqiad.wmnet
- restbase1019.eqiad.wmnet*
- restbase1020.eqiad.wmnet*
- restbase1021.eqiad.wmnet*
- restbase1022.eqiad.wmnet*
- restbase1023.eqiad.wmnet*
- restbase1024.eqiad.wmnet*
- restbase1025.eqiad.wmnet*
- restbase1026.eqiad.wmnet*
- restbase1027.eqiad.wmnet*
- restbase1028.eqiad.wmnet
- restbase1029.eqiad.wmnet
- restbase1030.eqiad.wmnet
- restbase2009.codfw.wmnet
- restbase2010.codfw.wmnet
- restbase2011.codfw.wmnet
- restbase2012.codfw.wmnet
- restbase2013.codfw.wmnet
- restbase2014.codfw.wmnet
- restbase2015.codfw.wmnet
- restbase2016.codfw.wmnet
- restbase2017.codfw.wmnet*
- restbase2018.codfw.wmnet
- restbase2019.codfw.wmnet*
- restbase2020.codfw.wmnet*
- restbase2021.codfw.wmnet
- restbase2022.codfw.wmnet
- restbase2023.codfw.wmnet
- restbase2024.codfw.wmnet
- restbase2025.codfw.wmnet
- restbase2026.codfw.wmnet
To be replaced by new instances
- restbase-dev1004.eqiad.wmnet
- restbase-dev1005.eqiad.wmnet
- restbase-dev1006.eqiad.wmnet
- deployment-restbase03.deployment-prep.eqiad1.wikimedia.cloud
- deployment-restbase04