Switch AQS to new cluster
Description
Details
Status | Assigned | Task
---|---|---
Duplicate | mobrovac | T125345 Many error 500 from pageviews API "Error in Cassandra table storage backend"
Resolved | JAllemandou | T124314 Better response times on AQS (Pageview API mostly) {melc}
Resolved | elukey | T144497 Switch AQS to new cluster
Resolved | Nuria | T144521 Coalesce nulls to 0s in output
Resolved | JAllemandou | T145087 Setup regular loading jobs new aqs cluster (per-article, top and unique devices)
Event Timeline
Things to have in mind:
The AQS code differs slightly between the new and old clusters, because the compression scheme is hardcoded as part of the AQS setup.
Communicate the switchover, although no downtime is expected. We need to choose a date.
Cluster switch: we'll add the new nodes behind the AQS LVS load balancer while leaving the current ones running, and then retire the old hosts if nothing major appears during the following days.
Remove old hosts from LVS and monitor.
Once the new cluster has baked for some days, remove the loading job for the old hosts.
Bring the new cluster up to date with the newer AQS code.
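The pooling order in the plan above (add the new nodes first, remove the old ones afterwards) can be sketched as a toy model. This is only an illustration, not the conftool or LVS tooling: the `Pool` class and `switch_cluster` function are hypothetical names, and the hostnames are the ones that appear in the SAL entries on this task.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy model of the hosts receiving live traffic behind a load balancer."""
    pooled: set[str] = field(default_factory=set)

    def add(self, host: str) -> None:
        self.pooled.add(host)

    def remove(self, host: str) -> None:
        # Refuse to depool the last host so the service always has a backend.
        if self.pooled == {host}:
            raise RuntimeError(f"refusing to depool {host}: last pooled host")
        self.pooled.discard(host)


def switch_cluster(pool: Pool, old_hosts: list[str], new_hosts: list[str]) -> list[list[str]]:
    """Pool every new host first, then depool the old ones.

    Returns a snapshot of the pooled hosts after each step.
    """
    history = [sorted(pool.pooled)]
    for host in new_hosts:
        pool.add(host)
        history.append(sorted(pool.pooled))
    for host in old_hosts:
        pool.remove(host)
        history.append(sorted(pool.pooled))
    return history


# Hostnames taken from the SAL entries on this task.
old = ["aqs1001", "aqs1002", "aqs1003"]
new = ["aqs1004", "aqs1005", "aqs1006"]

pool = Pool(set(old))
history = switch_cluster(pool, old, new)
for snapshot in history:
    print(snapshot)
```

Because the new hosts are pooled before any old host is removed, the number of serving hosts never drops below the original three, which is consistent with the expectation that the switchover needs no downtime.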
Change 310831 had a related patch set uploaded (by Elukey):
Add the new aqs nodes to conftool-data
Mentioned in SAL (#wikimedia-operations) [2016-09-19T14:21:01Z] <elukey> adding aqs1004 to live traffic - aqs.svc.eqiad.wmnet - T144497
Mentioned in SAL (#wikimedia-operations) [2016-09-20T16:58:23Z] <elukey> adding aqs1005 to live traffic - aqs.svc.eqiad.wmnet - T144497
Mentioned in SAL (#wikimedia-operations) [2016-09-20T17:01:53Z] <elukey> adding aqs1006 to live traffic - aqs.svc.eqiad.wmnet - T144497
Mentioned in SAL (#wikimedia-operations) [2016-09-21T06:21:06Z] <elukey> removing aqs100[123] from live traffic - aqs.svc.eqiad.wmnet - T144497
Change 314284 had a related patch set uploaded (by Nuria):
Bringing master up to date with aqs-new-cluster branch
Change 314284 abandoned by Nuria:
Bringing master up to date with aqs-new-cluster branch
Change 314722 had a related patch set uploaded (by Nuria):
Updating master with new-aqs-cluster branch
Change 314722 abandoned by Nuria:
Updating master with new-aqs-cluster branch
Reason:
not needed