Switch AQS to new cluster
Closed, Resolved · Public · 8 Story Points

Description

Switch AQS to new cluster

Event Timeline

Nuria created this task. Sep 1 2016, 3:12 PM
Restricted Application added a subscriber: Aklapper. Sep 1 2016, 3:12 PM
elukey added a subscriber: elukey. Sep 1 2016, 3:13 PM
Nuria added a comment. Edited Sep 8 2016, 3:52 PM

Things to keep in mind:

The AQS code differs slightly between the new and old clusters because the compression scheme is hardcoded as part of the AQS setup.

Communicate the switchover, although no downtime is expected. We need to choose a date.

Cluster switch: we'll add the new nodes behind the AQS LVS load balancer while leaving the current ones running, and then deprecate the old hosts if nothing major appears during the following days.

Remove the old hosts from LVS and monitor.

Once the new cluster has baked for a few days, remove the loading job for the old hosts.

Bring the new cluster up to date with the newer AQS code.

Nuria set the point value for this task to 8. Sep 8 2016, 4:09 PM
Milimetric assigned this task to elukey. Sep 15 2016, 3:40 PM
Milimetric triaged this task as Medium priority.

Change 310831 had a related patch set uploaded (by Elukey):
Add the new aqs nodes to conftool-data

https://gerrit.wikimedia.org/r/310831

Change 310831 merged by Elukey:
Add the new aqs nodes to conftool-data

https://gerrit.wikimedia.org/r/310831

Mentioned in SAL (#wikimedia-operations) [2016-09-19T14:21:01Z] <elukey> adding aqs1004 to live traffic - aqs.svc.eqiad.wmnet - T144497

Mentioned in SAL (#wikimedia-operations) [2016-09-20T16:58:23Z] <elukey> adding aqs1005 to live traffic - aqs.svc.eqiad.wmnet - T144497

Mentioned in SAL (#wikimedia-operations) [2016-09-20T17:01:53Z] <elukey> adding aqs1006 to live traffic - aqs.svc.eqiad.wmnet - T144497

Mentioned in SAL (#wikimedia-operations) [2016-09-21T06:21:06Z] <elukey> removing aqs100[123] from live traffic - aqs.svc.eqiad.wmnet - T144497
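
The pooling sequence recorded in the SAL entries above can be replayed as a small model, which may help show why no downtime was expected: the old hosts stay pooled until all the new ones are serving traffic. This is only a sketch. The `Pool` class and `replay_switchover` function are hypothetical names used for illustration and are not the conftool or LVS API. The host names and service name come from the log entries, and the starting pool of aqs1001, aqs1002 and aqs1003 is an assumption inferred from the removal entry.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Pool:
    """Hypothetical model of the hosts receiving live traffic for a service."""

    service: str
    pooled: set[str] = field(default_factory=set)

    def add(self, *hosts: str) -> None:
        """Put hosts into live traffic."""
        self.pooled.update(hosts)

    def remove(self, *hosts: str) -> None:
        """Take hosts out of live traffic, refusing to empty the pool."""
        remaining = self.pooled - set(hosts)
        if not remaining:
            raise ValueError("refusing to depool every host")
        self.pooled = remaining


def replay_switchover() -> list[list[str]]:
    """Replay the four logged steps and return the pool after each one."""
    # Assumed starting state: only the old hosts are pooled.
    pool = Pool("aqs.svc.eqiad.wmnet", {"aqs1001", "aqs1002", "aqs1003"})
    steps = [
        (pool.add, ["aqs1004"]),
        (pool.add, ["aqs1005"]),
        (pool.add, ["aqs1006"]),
        (pool.remove, ["aqs1001", "aqs1002", "aqs1003"]),
    ]
    history: list[list[str]] = []
    for action, hosts in steps:
        action(*hosts)
        history.append(sorted(pool.pooled))
    return history


if __name__ == "__main__":
    for state in replay_switchover():
        print(state)
```

In this model the pool never drops below three hosts, and it ends with only aqs1004, aqs1005 and aqs1006 pooled.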

Nuria added a comment. Edited Sep 27 2016, 4:13 PM

See latency drop

Change 314284 had a related patch set uploaded (by Nuria):
Bringing master up to date with aqs-new-cluster branch

https://gerrit.wikimedia.org/r/314284

Change 314284 abandoned by Nuria:
Bringing master up to date with aqs-new-cluster branch

https://gerrit.wikimedia.org/r/314284

Change 314722 had a related patch set uploaded (by Nuria):
Updating master with new-aqs-cluster branch

https://gerrit.wikimedia.org/r/314722

Change 314722 abandoned by Nuria:
Updating master with new-aqs-cluster branch

Reason:
not needed

https://gerrit.wikimedia.org/r/314722

Nuria moved this task from Ready to Deploy to Done on the Analytics-Kanban board. Oct 11 2016, 6:59 PM
Nuria closed this task as Resolved. Oct 19 2016, 7:22 PM