Points to cover:
* What happened two weeks ago, when Ext:ORES exacerbated T179156.
* Incident report for T181006.
* Emergency protocols for keeping critical pages such as Special:RecentChanges up even when Ext:ORES fails.
* Ext:ORES and the data flow that makes it fragile.
* How Ext:ORES can fail gracefully for the user while still blowing up the logs to get attention when *appropriate*.
* How to keep track of the latest production rollback SHA-1 even when using tin to deploy to multiple clusters.
* Document a protocol for watching both "client" (MW/Ext:ORES) and server-side errors during deployment.
* Thoughts on how we might canary all the Special pages in each language when making ORES changes that might affect all wikis.
* How to speed up deployment and rollback: it currently takes 43 min to push a new version, and NN min to roll back.
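
As a starting point for the graceful-failure item, here is a minimal sketch of the pattern: the page still renders when the scoring backend fails, and the failure is logged loudly with a traceback. It is written in Python purely for illustration and is not how Ext:ORES (a PHP extension) is implemented. `render_recent_changes`, `fetch_scores`, `ScoreServiceError`, and the logger name are all hypothetical.

```python
import logging

logger = logging.getLogger("ores_sketch")  # hypothetical logger name


class ScoreServiceError(Exception):
    """Stand-in for any failure of the scoring backend (hypothetical)."""


def render_recent_changes(changes, fetch_scores):
    """Return change rows, degrading to unscored rows if scoring fails.

    `fetch_scores` is a hypothetical callable that maps a list of
    revision ids to a dict of {rev_id: score}.
    """
    rev_ids = [change["rev_id"] for change in changes]
    try:
        scores = fetch_scores(rev_ids)
    except Exception:
        # Loud in the logs (ERROR level, with traceback), quiet for the reader.
        logger.exception("score lookup failed for %d revisions", len(rev_ids))
        scores = {}
    # A missing score becomes None, so the page can still be rendered.
    return [{**change, "score": scores.get(change["rev_id"])} for change in changes]
```

Whether every failure should be logged at error level, or only those above some rate, is part of what "when appropriate" would need to settle.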