
Improve user management for AQS Cassandra
Open, Low · Public · 5 Estimate Story Points

Description

After a chat with @JAllemandou and @Eevans, we agreed that the AQS clusters should migrate to a better user management scheme. We currently use the admin 'cassandra' user for both Restbase reads and writes, which has multiple downsides:

  1. it requires QUORUM during user authentication, which is not great for performance (as opposed to a local ONE for ordinary users);
  2. it does not protect the system.auth table properly.

The migration procedure should be something like:

(a) Set a new application_username and application_password.

  • This will allow the creation of /etc/cassandra/adduser.cql on each node.

(b) cqlsh -u cassandra -f /etc/cassandra/adduser.cql $HOSTNAME (type the password when prompted)

  • This just creates the new user on the cluster.

(c) Change restbase::cassandra_user

  • This reconfigures Restbase to use the new user.

(d) Set a new super_password for the 'cassandra' user.
(e) Change the superuser password in Cassandra to match (d).
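Steps (b) and (e) above could be sketched roughly as follows. This is only an illustration: the real /etc/cassandra/adduser.cql is generated by Puppet and its contents are not shown in this task. The helper names `render_adduser_cql` and `render_rotate_super_cql` are hypothetical, the sketch assumes the older `CREATE USER` / `ALTER USER` CQL syntax, and the granted permissions are a guess, since the task does not list them.

```shell
# Hypothetical helper: print CQL that creates a non-superuser application user.
# Assumption: SELECT and MODIFY are enough; the task does not say which
# permissions the application user actually needs.
# Note: passwords containing a single quote would need escaping.
render_adduser_cql() {
    app_user="$1"
    app_password="$2"
    cat <<EOF
CREATE USER IF NOT EXISTS ${app_user} WITH PASSWORD '${app_password}' NOSUPERUSER;
GRANT SELECT ON ALL KEYSPACES TO ${app_user};
GRANT MODIFY ON ALL KEYSPACES TO ${app_user};
EOF
}

# Hypothetical helper for step (e): print CQL that changes the password of
# the default 'cassandra' superuser.
render_rotate_super_cql() {
    new_super_password="$1"
    cat <<EOF
ALTER USER cassandra WITH PASSWORD '${new_super_password}';
EOF
}
```

For example, `render_adduser_cql aqs 'example-password' > adduser.cql` would produce a file that can be fed to the `cqlsh -u cassandra -f ...` command from step (b).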

We definitely want to do this for aqs100[456], but aqs100[123] will need extra care because they are live.

Event Timeline

elukey created this task.Aug 4 2016, 8:05 AM
Restricted Application added a subscriber: Aklapper.Aug 4 2016, 8:05 AM
Nuria changed the point value for this task from 0 to 5.Aug 4 2016, 5:19 PM
elukey moved this task from Next Up to In Progress on the Analytics-Kanban board.Aug 9 2016, 7:36 AM
elukey added a comment.Aug 9 2016, 9:15 AM

Interesting reading: https://issues.apache.org/jira/browse/CASSANDRA-5310

QUORUM is only used for the default superuser ('cassandra'); for other users ONE is used. You are not supposed to use the 'cassandra' user directly, except to create another superuser and use that one from that point on.

This is probably why increasing the password caching on aqs100[123] gave such a large performance improvement.

Change 303772 had a related patch set uploaded (by Elukey):
Add fake aqs Cassandra user's password

https://gerrit.wikimedia.org/r/303772

Change 303772 merged by Elukey:
Add fake aqs Cassandra user's password

https://gerrit.wikimedia.org/r/303772

Change 303774 had a related patch set uploaded (by Elukey):
Add the configuration needed to prepare a new AQS Cassandra user creation

https://gerrit.wikimedia.org/r/303774

Change 303774 merged by Elukey:
Add the configuration needed to prepare a new AQS Cassandra user creation

https://gerrit.wikimedia.org/r/303774

Change 303783 had a related patch set uploaded (by Elukey):
Include the password::aqs namespace in the AQS role

https://gerrit.wikimedia.org/r/303783

Change 303783 merged by Elukey:
Include the password::aqs namespace in the AQS role

https://gerrit.wikimedia.org/r/303783

Change 303786 had a related patch set uploaded (by Elukey):
Move the include of cassandra/aqs passwords up to solve a priority issue

https://gerrit.wikimedia.org/r/303786

Change 303786 merged by Elukey:
Move the include of cassandra/aqs passwords up to solve a priority issue

https://gerrit.wikimedia.org/r/303786

Added the aqs user to aqs100[456] and verified that it returns data on each instance with the following query:

elukey@aqs1004:~$ cat showdata.cql
select project from "local_group_default_T_pageviews_per_article_flat".data limit 10;
elukey@aqs1004:~$ cqlsh -u aqs -f showdata.cql aqs1004-a.eqiad.wmnet

Change 303792 had a related patch set uploaded (by Elukey):
Change the AQS restbase user from 'cassandra' to 'aqs'

https://gerrit.wikimedia.org/r/303792

Change 303792 merged by Elukey:
Change the AQS restbase user from 'cassandra' to 'aqs'

https://gerrit.wikimedia.org/r/303792

elukey added a comment.Aug 9 2016, 1:15 PM

Switched the new cluster, then installed the new user on the current one (aqs100[123]) and tested:

elukey@aqs1001:~$ cqlsh -u aqs -f showdata.cql aqs1001.eqiad.wmnet
Password:

 project
---------------
 en.wikisource
  ja.wikipedia
  hu.wikipedia
  fr.wikipedia
  fr.wikipedia
  fr.wikipedia
  fr.wikipedia
 bg.wiktionary
  en.wikipedia
  en.wikipedia

(10 rows)
elukey@aqs1001:~$ cqlsh -u aqs -f showdata.cql aqs1002.eqiad.wmnet
Password:

 project
---------------
 en.wikisource
  ja.wikipedia
  hu.wikipedia
  fr.wikipedia
  fr.wikipedia
  fr.wikipedia
  fr.wikipedia
 bg.wiktionary
  en.wikipedia
  en.wikipedia

(10 rows)
elukey@aqs1001:~$ cqlsh -u aqs -f showdata.cql aqs1003.eqiad.wmnet
Password:

 project
---------------
 en.wikisource
  ja.wikipedia
  hu.wikipedia
  fr.wikipedia
  fr.wikipedia
  fr.wikipedia
  fr.wikipedia
 bg.wiktionary
  en.wikipedia
  en.wikipedia

(10 rows)
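The per-host check above could be wrapped in a small loop. This is a minimal sketch, not what was actually run: the function name `check_hosts` and the overridable `CQLSH` variable are assumptions introduced for illustration, while the `cqlsh -u ... -f ...` invocation is the one from the transcript.

```shell
# Command used to talk to Cassandra. Overridable (e.g. CQLSH=echo) for a
# dry run; this knob is an assumption of the sketch.
CQLSH="${CQLSH:-cqlsh}"

# Run the same CQL file as the given user against each listed host,
# stopping at the first failure. cqlsh prompts for the password once per
# host, as in the transcript.
check_hosts() {
    cql_user="$1"
    cql_file="$2"
    shift 2
    for host in "$@"; do
        echo "== ${host}"
        $CQLSH -u "$cql_user" -f "$cql_file" "$host" || return 1
    done
}
```

A call such as `check_hosts aqs showdata.cql aqs1001.eqiad.wmnet aqs1002.eqiad.wmnet aqs1003.eqiad.wmnet` would repeat the three manual runs shown above.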

Change 303798 had a related patch set uploaded (by Elukey):
Switch the AQS restbase use from 'cassandra' to aqs

https://gerrit.wikimedia.org/r/303798

Mentioned in SAL [2016-08-09T15:59:50Z] <elukey> switching restbase/cassandra user on aqs100[123] to aqs (T142073) - https://gerrit.wikimedia.org/r/303798 will be applied to one node at the time with depool/pool

Change 303798 merged by Elukey:
Switch the AQS restbase use from 'cassandra' to 'aqs'

https://gerrit.wikimedia.org/r/303798

elukey added a comment.Aug 9 2016, 4:36 PM

Remaining steps:

  1. establish how to distribute the new user/password credentials to oozie;
  2. move oozie away from the 'cassandra' user, either using the newly created 'aqs' user or creating a new one;
  3. replace the current cassandra admin password with another one.
elukey moved this task from In Progress to Paused on the Analytics-Kanban board.Aug 29 2016, 3:02 PM
Milimetric moved this task from Incoming to Backlog (Later) on the Analytics board.
elukey moved this task from Backlog to Analytics Backlog on the User-Elukey board.Dec 14 2016, 5:43 PM
Nuria moved this task from Wikistats Production to Dashiki on the Analytics board.May 29 2017, 3:57 PM
Nuria moved this task from Dashiki to Backlog (Later) on the Analytics board.Jul 6 2017, 4:48 PM
elukey moved this task from Analytics Backlog to Backlog on the User-Elukey board.Aug 4 2017, 3:10 PM
elukey moved this task from Backlog to Analytics Backlog on the User-Elukey board.Aug 9 2017, 10:35 AM
elukey moved this task from Backlog to Analytics Backlog on the User-Elukey board.Mar 23 2018, 3:55 PM
mforns lowered the priority of this task from Medium to Low.Apr 16 2018, 4:25 PM
elukey added a parent task: Restricted Task.Jul 25 2018, 8:58 AM
Milimetric renamed this task from Improve user management for AQS to Improve user management for AQS Cassandra.Oct 22 2018, 3:43 PM
elukey moved this task from Analytics Backlog to Backlog on the User-Elukey board.Dec 7 2018, 2:53 PM