== Problem ==
Historically, WMCS has upgraded OpenStack on a loose cadence, intending to follow the stable -1 version. Without a tighter cadence, WMCS has generally lagged one to two or more versions behind stable. At times this has delayed adoption of new features: Trove, and most recently Magnum, each required a newer version of OpenStack before they could be deployed.
== Goals: ==
* Better manage upgrades. OpenStack releases in April and October; we should plan consistent times of the year to upgrade in response.
* Run a newer OpenStack version on average, while still keeping some lag time for stability.
* Make it easier to patch or run newer versions of OpenStack as needed in response to a bug or desired feature.
== Constraints and Risks ==
* A stable system is prioritized over features
* Doing nothing means that k8s clusters run by Magnum will almost always be EOL during operation.
** The reasoning is as follows. Kubernetes supports each release for 18 months. OpenStack adopts a Kubernetes release that is already 9 months old for its stable version. We upgrade to that OpenStack version 6-9 months later, so 15-18 months have elapsed since the upstream Kubernetes release, leaving it at or near EOL by the time we deploy it.
** It doesn't seem possible to upgrade the Magnum k8s version without upgrading OpenStack. This means our OpenStack and Kubernetes versions will be tied together.
** Note: our current k8s version is already EOL.
* Today, WMCS depends on Debian to package OpenStack. In the past, this packaging work has caused delays, and not all point releases have been packaged.
* WMCS currently patches OpenStack, and will continue to do so.
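The Kubernetes EOL timeline described above can be checked with a little arithmetic. The following is a minimal sketch, not a planning tool: the 18-month support window and the 9-month adoption age come from the text, while the function names and the 6-month time in service (one OpenStack release cycle) are assumptions made for illustration.

```python
# All values are in months. The first two figures are taken from the text above.
K8S_SUPPORT_MONTHS = 18           # upstream Kubernetes support window
K8S_AGE_AT_OPENSTACK_RELEASE = 9  # age of the k8s release when OpenStack adopts it


def k8s_support_remaining(upgrade_lag_months: int) -> int:
    """Months of upstream k8s support left on the day we deploy.

    A negative result means the release is already EOL at deployment.
    """
    age_at_deploy = K8S_AGE_AT_OPENSTACK_RELEASE + upgrade_lag_months
    return K8S_SUPPORT_MONTHS - age_at_deploy


def months_run_past_eol(upgrade_lag_months: int, months_in_service: int = 6) -> int:
    """Months spent running an EOL k8s release before the next upgrade.

    months_in_service is an assumption: one OpenStack release cycle.
    """
    supported = max(k8s_support_remaining(upgrade_lag_months), 0)
    return max(0, months_in_service - supported)


if __name__ == "__main__":
    for lag in (1, 3, 6, 9):
        print(
            f"upgrade lag {lag} months: "
            f"{k8s_support_remaining(lag)} months of support left at deploy, "
            f"{months_run_past_eol(lag)} months run past EOL"
        )
```

Under these assumptions, a lag of 6-9 months leaves 0-3 months of support at deployment, whereas a lag of 1-3 months keeps the release supported for the whole time in service.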
== Decision Record ==
https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Decision_record_T316866_Openstack_Upgrade_Cadence
== Proposals: ==
=== Option 1: ===
Do nothing. Accept the status quo, including the ad hoc upgrade cycle for OpenStack releases.
==== Pros ====
* No change required
* Maximum flexibility for planning work
* Ability to defer upgrades and run EOL versions without missing expectations
==== Cons ====
* No set expectations for ourselves or users
* Potential for cumbersome scenarios in which multiple unplanned upgrades are needed to fix an issue or add a feature
* No goals met
=== Option 2: ===
Maintain the n-1 target. Accept running EOL k8s. Schedule twice-yearly upgrade months to set expectations.
==== Pros ====
* Same as option 1, with only minor change and flexibility loss
* Ensures predictability and maintenance of upgrades, rather than relying on ad hoc efforts
==== Cons ====
* Potential for cumbersome scenarios in which multiple unplanned upgrades are needed to fix an issue or add a feature. Even if a delay is accepted when adding a feature (according to the upgrade schedule), critical or security issues may still force unplanned upgrades
* Addresses only the first of the three stated goals
=== Option 3: ===
Create a new n-0.5 target cadence. Upgrade to the stable version 1-3 months after release.
==== Pros ====
* Ensures OpenStack and Kubernetes versions are up to date and supported during the entire time of operation
* Ensures predictability and maintenance of upgrades, rather than relying on ad hoc efforts
* No change to upgrade process required
==== Cons ====
* Patching burden isn't improved
* Maintains dependency on Debian packaging
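To make the n-0.5 cadence concrete, the sketch below derives the calendar months in which upgrades would fall. The April and October release months and the 1-3 month lag come from this proposal; the function name and its parameters are hypothetical and only serve the illustration.

```python
import calendar

# OpenStack release months stated in the goals: April and October.
RELEASE_MONTHS = (4, 10)


def upgrade_window(release_month: int, min_lag: int = 1, max_lag: int = 3) -> list[int]:
    """Return the calendar months (1-12) that fall min_lag..max_lag months
    after release_month, wrapping around the end of the year."""
    return [
        (release_month - 1 + lag) % 12 + 1
        for lag in range(min_lag, max_lag + 1)
    ]


if __name__ == "__main__":
    for month in RELEASE_MONTHS:
        names = ", ".join(calendar.month_name[m] for m in upgrade_window(month))
        print(f"{calendar.month_name[month]} release: upgrade in {names}")
```

With these inputs the two upgrade windows are May through July and November through January, so the second window spans the year-end period.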
=== Option 4: ===
Create a new n-0.5 target cadence. Upgrade to the stable version 1-3 months after release. Use Docker or a similar container tool for deployment.
==== Pros ====
* Everything under option 3
* Lower patching burden
* Improved flexibility to upgrade or respond to issues
* Meets all stated goals
* Better target for automation
==== Cons ====
* Requires changing how we deploy OpenStack; this will require research and design