Set up a testing environment for the AQS Cassandra 3 migration
Closed, Resolved (Public)

Description

T249756 contains useful details about what the Analytics team wants to do for the AQS Cassandra 3 upgrade, namely doing it in place.

This task is about finding a suitable environment in which to test the 2.2 -> 3.11 upgrade procedure, and possibly writing the spicerack/cookbook automation to do it.
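As a rough illustration of what such automation might generate, here is a minimal dry-run sketch that only builds the ordered list of commands for a one-node-at-a-time upgrade. It is not a spicerack cookbook. `nodetool drain` and `nodetool upgradesstables` are standard Cassandra commands; the systemd unit name `cassandra`, the apt version pin, the version string and the host names in the usage note are assumptions made for illustration.

```python
from typing import List


def rolling_upgrade_plan(hosts: List[str], target_version: str) -> List[str]:
    """Return the ordered steps for an in-place upgrade, one node at a time.

    Nothing is executed: the function only builds a human-readable plan.
    """
    plan: List[str] = []
    for host in hosts:
        plan += [
            # Flush memtables and stop accepting writes before shutdown.
            f"{host}: nodetool drain",
            # Assumption: the service runs as a systemd unit named "cassandra".
            f"{host}: systemctl stop cassandra",
            # Assumption: the target version is installable as an apt pin.
            f"{host}: apt-get install -y cassandra={target_version}",
            f"{host}: systemctl start cassandra",
            # Rewrite the on-disk SSTables into the new version's format.
            f"{host}: nodetool upgradesstables",
        ]
    return plan
```

For example, `rolling_upgrade_plan(["aqs1010", "aqs1011"], "3.11.x")` returns ten steps, five per host, and finishes each host before moving on to the next. A real procedure would also need a health check between nodes, which this sketch leaves out.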

Event Timeline

A lot of things have changed. We are getting the 6 new nodes for AQS (due to the hw refresh) sooner, so our idea is the following:

  • install Debian buster on the new nodes, configure partitions, etc., and deploy cassandra 2.2.6 (the same version that we have on the current AQS nodes)
  • stream data from the current cassandra cluster to the new one (using something like sstableloader)
  • upgrade the new cluster in place and verify that everything looks correct
  • make the switch to the new cluster
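
The streaming step could be sketched roughly as follows, assuming sstableloader is the tool used. The helper only builds the command line and does not run it. `sstableloader -d <hosts> <path>` is the standard invocation, with the path ending in `<keyspace>/<table>`; the function name, the data directory and the host names in the usage note are hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def sstableloader_command(table_dir: str, target_hosts: List[str]) -> List[str]:
    """Build (but do not run) an sstableloader command for one table directory.

    sstableloader infers the keyspace and table from the last two path
    components, so the directory must end in <keyspace>/<table>.
    """
    parts = [p for p in PurePosixPath(table_dir).parts if p != "/"]
    if len(parts) < 2:
        raise ValueError("table_dir must end in <keyspace>/<table>")
    if not target_hosts:
        raise ValueError("at least one target host is required")
    # -d takes a comma-separated list of initial hosts in the target cluster.
    return ["sstableloader", "-d", ",".join(target_hosts), table_dir]
```

Calling `sstableloader_command("/srv/cassandra/data/example_keyspace/example_table", ["aqs1010", "aqs1011"])` yields an argument list that can be passed to `subprocess.run` on a node holding the SSTables. Returning a list rather than a shell string avoids quoting problems with unusual paths.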

This means that the "testing" environment will basically also be the final production environment, but it seems the best compromise for achieving all of the following:

  • migrate aqs to debian buster
  • refresh hardware
  • test the in-place upgrade procedure for cassandra
  • avoid migrating data between clusters on different versions (e.g. from a 2.2.6 cluster to a 3.11 cluster)

This is pending the racking tracked in T264336.

Change 670197 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/puppet@production] aqs: make aqs1010 a separate AQS cluster

https://gerrit.wikimedia.org/r/670197

Change 670197 merged by Hnowlan:
[operations/puppet@production] aqs: make aqs1010 a separate AQS cluster

https://gerrit.wikimedia.org/r/670197

Change 671132 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/puppet@production] aqs: add aqs1011 to cassandra 3.11 test cluster

https://gerrit.wikimedia.org/r/671132

Change 672366 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/puppet@production] aqs: move import of ::passwords::aqs

https://gerrit.wikimedia.org/r/672366

Change 672372 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[labs/private@master] passwords::cassandra: add aqs password entry

https://gerrit.wikimedia.org/r/672372

Change 672372 merged by Hnowlan:
[labs/private@master] passwords::cassandra: add aqs password entry

https://gerrit.wikimedia.org/r/672372

Change 672441 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[labs/private@master] aqs: move password to hieradata rather than password module

https://gerrit.wikimedia.org/r/672441

Change 672366 merged by Hnowlan:
[operations/puppet@production] aqs: move import of ::passwords::aqs

https://gerrit.wikimedia.org/r/672366

Change 671132 merged by Hnowlan:
[operations/puppet@production] aqs: add aqs1011 to cassandra 3.11 test cluster, add aqs_next role

https://gerrit.wikimedia.org/r/671132

Change 672441 abandoned by Hnowlan:
[labs/private@master] aqs: move password to hieradata rather than password module

Reason:
jbond did this in another change

https://gerrit.wikimedia.org/r/672441

Change 679295 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[analytics/aqs@master] Add docker-compose environment with cassandra

https://gerrit.wikimedia.org/r/679295

In lieu of having a test environment or a WMCS cluster, we are planning on pursuing a docker-compose environment that will let us do testing and experimentation in a clean dockerised cluster: https://gerrit.wikimedia.org/r/c/analytics/aqs/+/679295

odimitrijevic moved this task from Incoming (new tickets) to Serve on the Data-Engineering board.

@BTullis Is this task still relevant?

BTullis claimed this task.
BTullis triaged this task as Medium priority.
BTullis moved this task from Next Up to Done on the Data-Engineering-Kanban board.

I think that this item can be closed. We used the new cluster as a testing environment, as outlined above.
The new cluster is now in production and we have begun to decommission the old cluster.

Change 679295 abandoned by Hnowlan:

[analytics/aqs@master] Add docker-compose environment with cassandra

Reason:

Not needed

https://gerrit.wikimedia.org/r/679295