In general, deployment documentation right now is a mess. Several large pages are redundant with one another and slightly out of sync, navigation is difficult, and important details of policy are hard to find. We should consolidate a number of pages under a more coherent structure, make sure everything actually reflects current practice, and improve the navigation aids. This applies to the procedural train docs as well as to descriptions of how deployments are structured overall and how backports are to be conducted.
Things that need to be tweaked for recent policy changes:
- [[https://wikitech.wikimedia.org/wiki/Deployments/Holding_the_train|Holding the train]]
  - [X] Mention client errors and the 1k limit in a 12-hour period before it's an UBN
  - [ ] Client errors < 100 / hour
  - [ ] Specific error budget - 2 or more times in a version?
  - [ ] Define "new" with regard to errors
- [[https://wikitech.wikimedia.org/wiki/Heterogeneous_deployment/Train_deploys|Heterogeneous deployment/Train deploys]]
  - [X] Mention client error dashboard
  - [ ] Client errors < 100 / hour
  - [ ] Define "new" with regard to errors
cc: @thcipriani, @dancy if there are specifics I'm forgetting here.