= Inventory of hosts to be upgraded to bullseye
== Hadoop-test
[x] {T329363}
== Hadoop
[x] {T332570}
[x] {T332572}
[x] {T332573}
[x] {T332578}
== Stats clients
[ ] {T329360}
== Launcher
[ ] {T332580}
== Presto
[x] {T329361}
== Druid
[x] {T332584}
[x] {T332604}
[x] {T332589} **n.b. Refresh druid100[4-6] with druid10[09-11]**
== Kafka
[x] kafka-jumbo - 9 - `cumin 'P{F:lsbdistcodename = buster} and A:kafka-jumbo'` **n.b. Refresh kafka-jumbo100[1-6] with kafka-jumbo10[09-15]** - {T348495}
== Airflow
[x] airflow - 5 - `cumin 'P{F:lsbdistcodename = buster} and A:analytics-airflow'`
== AQS
[x] aqs - 24 - `cumin 'P{F:lsbdistcodename = buster} and A:aqs'` (#data-persistence, see: T347738)
== Zookeeper
[x] {T329362}
== Event schemas
[x] schema - 4 - `cumin 'P{F:lsbdistcodename = buster} and A:schema'`
== Misc
[x] eventlogging - 1 - `eventlog1003.eqiad.wmnet`
[ ] archiva - 1 - `archiva1001.wikimedia.org` **n.b. {T317182} at the same time**
[ ] matomo - 1 - `matomo1001.eqiad.wmnet` {T349397}
[x] web publishing - 1 - `an-web1001.eqiad.wmnet`
[x] ~~hue - 1 - `an-tool1009.eqiad.wmnet`~~ **decommissioned**
[x] yarn - 1 - `an-tool1008.eqiad.wmnet`
----
= Original description below
Recent updates are **written in bold text**
During the migration to Buster we worked on two things that should take much of the pain out of upgrading:
1) Partman partition re-use recipes for Debian installs of most of our hosts. This makes it much easier to reimage/reinstall every node of the cluster without worrying too much about backing up data first.
2) Fixed uid/gid for most of the system users. This lets us avoid odd permission errors/mismatches after a reinstall/reimage.
It is nonetheless a sizeable amount of work :)
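As an illustration of point 2, the following is a minimal sketch of how a host could be checked after a reimage to confirm that system users kept their fixed uid/gid. The function name `find_id_mismatches` and the example mapping are hypothetical; the real expected values would come from wherever the fixed ids are defined.

```python
import pwd
from typing import Callable, Dict, List, Tuple


def find_id_mismatches(
    expected: Dict[str, Tuple[int, int]],
    lookup: Callable = pwd.getpwnam,
) -> List[str]:
    """Return one message per user whose uid/gid differs from the expected pair.

    `expected` maps a user name to its (uid, gid). `lookup` defaults to the
    local passwd database and can be swapped out for testing.
    """
    mismatches = []
    for user, (uid, gid) in sorted(expected.items()):
        try:
            entry = lookup(user)
        except KeyError:
            # pwd.getpwnam raises KeyError when the user does not exist.
            mismatches.append(f"{user}: missing on this host")
            continue
        if (entry.pw_uid, entry.pw_gid) != (uid, gid):
            mismatches.append(
                f"{user}: expected {uid}:{gid}, found {entry.pw_uid}:{entry.pw_gid}"
            )
    return mismatches


if __name__ == "__main__":
    # Placeholder mapping for illustration only; root is 0:0 on any Unix host.
    for line in find_id_mismatches({"root": (0, 0)}):
        print(line)
```

An empty result means every listed user matches, so data on re-used partitions should keep the ownership it had before the reimage.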
Some high level notes:
* Moving the Hadoop test cluster to Bullseye ahead of time may be a good way to see if anything weird comes up.
* A lot of VMs like matomo1002, archiva1002, eventlog1003, an-tool100*, etc. should be easy to migrate. The work is to create a new VM with Bullseye running the same packages, test that everything is fine, and flip the traffic over. **There is a `sre.ganeti.reimage` cookbook, making a reimage in place an even easier option in many cases.**
* Most of our systems like Hadoop, Druid, etc. are not ready for Java 11, so we'll need to stay on Java 8. **We now have full support for Java 8 in bullseye, so we are good to go.**
* Moving Hadoop to Bullseye poses some further questions, since on paper the version of Bigtop that we currently run (1.5) doesn't support Bullseye. **We have now built bigtop 1.5 for bullseye and deployed it, so we are good to go.**
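
The inventory entries above that list a `cumin` query all follow the same pattern: a fact filter on `lsbdistcodename` combined with a host alias. A small sketch that regenerates those commands for a list of aliases is shown below; the helper name `buster_hosts_query` is hypothetical, and the aliases are the ones already used in the inventory.

```python
import shlex


def buster_hosts_query(alias: str, codename: str = "buster") -> str:
    """Build the cumin query used in the inventory for one host alias."""
    return f"P{{F:lsbdistcodename = {codename}}} and A:{alias}"


if __name__ == "__main__":
    # Aliases taken from the inventory section of this task.
    for alias in ["kafka-jumbo", "analytics-airflow", "aqs", "schema"]:
        # shlex.quote wraps the query in single quotes for safe shell use.
        print("cumin", shlex.quote(buster_hosts_query(alias)))
```

Re-running the printed commands is a quick way to see which hosts in each group still report buster.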