This task lists the concrete steps and functionalities we need / expect from the (future) RESTBase deployment system.
= Deployment Process =
All of the actions are to be carried out from a deployment host (currently `tin`). Unlike the current Trebuchet-powered deploy, the deployment host does not need to act as a proxy serving code to the target hosts; they may fetch their code directly from git/gerrit.
The deployment process can follow two different (but similar) paths:
- regular code deploy
- deployment involving schema changes
== Regular Code Deploy ==
Most of the time, the code to be deployed consists of logic improvements and feature additions and as such has no impact on the underlying storage (Cassandra). These are the steps needed to complete a successful deploy, executed in sequence on each host:
# depool host
# stop RESTBase
# send / fetch the code
# (re)start RESTBase
# wait for it to bind to its port
# checks / tests
## `curl` some known endpoints
## check the logs and graphite for anomalies after restart
### error- and fatal-level log entries
### Cassandra connection issues
### 5xx request response rates for the given host
# repeat all steps for the next host
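The sequence above can be sketched as a small driver loop. This is only an illustration under stated assumptions: `rolling_deploy`, `wait_for_port` and the `(label, action)` step format are hypothetical names, and the actual depool / stop / fetch / restart / check actions are left to be supplied by the caller, since this task does not define them.

```python
import socket
import time
from typing import Callable, Iterable, List, Tuple

# Hypothetical step format: (label, action taking a hostname).
Step = Tuple[str, Callable[[str], None]]


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the service is not listening yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def rolling_deploy(hosts: Iterable[str], steps: List[Step]) -> List[str]:
    """Run every step on one host before moving on to the next.

    Returns the hosts that completed all steps. An exception raised by any
    step propagates immediately, so no later host is touched.
    """
    done: List[str] = []
    for host in hosts:
        for label, action in steps:
            print(f"[{host}] {label}")
            action(host)
        done.append(host)
    return done
```

A caller would pass the steps in the order listed above, for example a "wait for port" step that raises when `wait_for_port` returns `False`. Stopping on the first failing step keeps a bad deploy confined to a single depooled host.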
=== Abort Mechanism ===
For regular code deploys, aborting is pretty straightforward. The deployment system should keep track of the repository state before the deployment. Should a deploy fail, it simply enforces the previously-known-to-work code tag/branch/hash on all of the hosts sequentially.
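A minimal sketch of this abort behaviour, assuming a hypothetical `deploy(host, ref)` callable that puts `ref` on a host, restarts RESTBase and raises if the post-restart checks fail. The function name and signature are illustrative, not part of any existing tool.

```python
from typing import Callable, List


def deploy_or_revert(
    hosts: List[str],
    new_ref: str,
    known_good_ref: str,
    deploy: Callable[[str, str], None],
) -> bool:
    """Deploy new_ref host by host; on any failure, re-apply known_good_ref.

    known_good_ref is the tag/branch/hash recorded before the deploy began.
    Returns True if every host ended up on new_ref, False if the deploy was
    aborted and the hosts were reverted.
    """
    try:
        for host in hosts:
            deploy(host, new_ref)
    except Exception as exc:
        print(f"deploy of {new_ref} failed: {exc}; reverting to {known_good_ref}")
        # Enforce the previous state on all hosts, one after another.
        for host in hosts:
            deploy(host, known_good_ref)
        return False
    return True
```

The revert loop deliberately covers every host rather than only those already touched, matching the "all of the hosts sequentially" wording above. A failure during the revert itself is left to propagate so that it is not silently ignored.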
== Deploying config changes ==
As config changes can trigger database changes in RESTBase, it is [very important](https://wikitech.wikimedia.org/wiki/Incident_documentation/20150519-RESTBase) that they are deployed in a rolling fashion as well. The configuration templating is handled by puppet, which doesn't directly support rolling deploys. To work around this, we need to perform a rolling deploy manually by disabling puppet on all hosts and then re-enabling it one host at a time. Procedure (note: all of the following commands need to be run as root):
* Disable puppet on all restbase* hosts, to make sure that config changes are applied one host at a time: `puppet agent --disable`
* For each node:
** re-enable / run puppet: `puppet agent --enable; puppet agent -tv`
** re-start & test RESTBase, as above
See also: https://wikitech.wikimedia.org/wiki/RESTBase#Config_changes
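The ordering of the procedure above can be made explicit with a small plan generator. This sketch only builds the list of per-host actions and does not execute anything. The puppet commands are the ones quoted in the procedure; `config_rollout_plan` and the `RESTART_AND_TEST` placeholder are hypothetical names, and the placeholder is not a real command.

```python
from typing import List, Tuple

DISABLE = "puppet agent --disable"
ENABLE_AND_RUN = "puppet agent --enable; puppet agent -tv"
# Placeholder only: stands for the restart and checks of the regular deploy.
RESTART_AND_TEST = "<restart and test RESTBase>"


def config_rollout_plan(hosts: List[str]) -> List[Tuple[str, str]]:
    """Return (host, action) pairs in the order they should be carried out."""
    # Phase 1: disable puppet everywhere so no host picks up the change early.
    plan = [(host, DISABLE) for host in hosts]
    # Phase 2: apply and verify the change on one host at a time.
    for host in hosts:
        plan.append((host, ENABLE_AND_RUN))
        plan.append((host, RESTART_AND_TEST))
    return plan


if __name__ == "__main__":
    for host, action in config_rollout_plan(["restbase1001", "restbase1002"]):
        print(f"{host}: {action}")
```

The host names in the usage example are made up for illustration. The point of the two phases is that every host has puppet disabled before the first host receives the new config.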
=== Abort Mechanism ===
Because schemas are versioned, the back-end storage will refuse to apply a schema with a version lower than the currently-present one. This means that in this instance the abort mechanism involves an additional step: a manual commit from the deployer that bumps the version number of the last stable schema, so that the storage accepts it over the failed one.
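The version rule can be illustrated with a toy model. It assumes integer schema versions and a hypothetical `Schema` type; RESTBase's actual schema representation and comparison logic are not described in this task, so treat this purely as a sketch of the reasoning.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Schema:
    version: int      # assumed to be a monotonically increasing integer
    definition: str   # stand-in for the real schema body


def is_accepted(current: Schema, candidate: Schema) -> bool:
    """Model of the storage rule: a lower version than the current one is refused."""
    return candidate.version >= current.version


def rollback_schema(last_stable: Schema, current: Schema) -> Schema:
    """Re-issue the last stable definition under a version above the current one."""
    return replace(last_stable, version=current.version + 1)


if __name__ == "__main__":
    stable = Schema(version=3, definition="stable")
    broken = Schema(version=4, definition="broken")
    print(is_accepted(broken, stable))           # re-applying the old schema as-is
    fix = rollback_schema(stable, broken)
    print(fix.version, is_accepted(broken, fix))
```

In this model, simply redeploying the old code with its old schema version is refused, which is why the abort needs the extra commit that bumps the version.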