wikireplicas root access
Open, Medium, Public

Description

This task is to explore how we can provide root access to the wikireplica db servers in a safe manner. When this was discussed previously, it was decided that root on these systems meant root on all production dbs:

Giving root to labsdbs would be equivalent to giving root to all mysql servers for many reasons. No problem with that, but he should be added to the paging system (if he is not already there) and respond to the database alerts.

This was decided at the time to be too much of a risk. However, we would like to explore further what these risks are, and whether we could mitigate them and restrict users to just the wikireplica dbs without leaking access to other production hosts. I think the best way forward would be:

DBA: expand on what the issue is relating to root access and explore options to mitigate this risk, enabling WMCS users to have full root on the wikireplicas
cloud-services-team: define precisely what permissions are required but missing from the wikireplica hosts. This should allow us to better provision access if full root continues to be unviable

Event Timeline

@jbond As far as I know, the only things that need to be run as root on the wikireplica hosts are the scripts that create the views/indexes (which Data Persistence doesn't own). Other than that, nothing else requires root apart from stopping/starting mariadb and its replication, which requires a mysql prompt, and which I guess we could just allow sudo for on those specific hosts.

We as a team don't own this service; we just own (for now, until this is further clarified) some of its responsibilities, which include stopping/starting mariadb, upgrades and such, and those do require root.
If there is a way to give root to WMCS (or whoever ends up owning this service) but NOT giving root to all db* (or cumin) hosts, I'd be okay with that. That being said, I'd like those who have root to respond to pages and/or assume full ownership of this service (which doesn't mean we, DBAs, won't help if there's a need).

@Marostegui thanks for the response

If there is a way to give root to WMCS (or whoever ends up owning this service) but NOT giving root to all db* (or cumin) hosts

Yes, that should definitely be possible. However:

Giving root to labsdbs would be equivalent to giving root to all mysql servers for many reasons.

I was worried about this point, originally brought up by @jcrespo.

I'll leave ownership clarifications to @nskaggs; I believe there is already an ongoing ownership discussion about this.

I'll raise a separate ticket to discuss paging responsibilities; however, I think it's a reasonable request.

I'm not speaking on behalf of Jaime here, but giving my opinion of what I think the problem was at the time:
Back then we didn't have much separation between the old hosts and production.
However, now even the root password is different. And the data that arrives at clouddb* hosts is filtered, so having access to the mysql prompt of the replicas doesn't imply having access to any PII.

Ahh great, I will wait for confirmation from @jcrespo, but I wondered if it was something like that and am glad to hear it's sorted now :)

Yes, that was exactly what I meant back then. Not only that, passwords used to be written to the filesystem in plain text. Since then, most things may have changed: passwords have been removed in favor of other authentication methods (unix_socket), and passwords have been changed.

I am no longer involved in DBA work, so I don't know the details of the current state. I hold the DBAs in high esteem, but they are sometimes overloaded with work, and there have been so many years of assuming that only global roots can access mysql data that I am sure unintended data is still there. Why do I know this? Because I have been there and made those mistakes myself, and some may be directly my fault, so this is not an accusation but the recognition that it is hard to get right. In particular, I would suggest asking for additional input from @Ladsgroup, as our grant checking expert, in case there is something missing regarding realm separation.

Two examples:

  • I believe the replication password is still shared, which means there is access to production hosts through that account (all events, even private ones, are potentially still exposed).
  • Non-public data (such as suppressed edits or bans) is possibly still available to roots, just not sensitive data like passwords and IPs. I asked for an audit from security a long time ago (T103011), but that has been pending for 8 years, so the filtering cannot be considered trustworthy in general.

Knowing these deficiencies, I am not going to be the one who says there are no outstanding issues, but I won't block if someone can take responsibility for that risk.

For example, I've taken a host at random, clouddb1018, I've run:

root@clouddb1018[mysql]> select user, host, password FROM user WHERE user not rlike '[spu][0-9]+' LIMIT 10;

And I recognize a potential (unsure if it is the right one) mediawiki admin password on cloud db, among others (an unsalted, easily reversed password hash). Even if that is fixed, who can ensure that it is not going to happen again? What monitoring is in place? What filtering? Not only for MySQL, but for puppet secrets in general? Please note I am not trying to be hard to work with; I just want to expose deficiencies and long-held assumptions that may not be known, that affect production, and that will be non-trivial (although not impossible) to overcome, especially as some may be the result of emergency work under not the best of conditions!
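
As a rough illustration of what the filter in the query above does, here is a minimal Python sketch, not an existing tool. It assumes the rows of (user, host) have already been exported from the server; the function name `unexpected_accounts` and the sample account names are hypothetical. Because `RLIKE` matches anywhere in the string, the sketch uses `re.search` rather than a full match. Case handling may differ from the server's collation.

```python
import re

# Same pattern as in the query above. RLIKE is unanchored, so any user name
# that contains s, p or u followed by at least one digit is treated as a
# tool/user account and skipped.
TOOL_ACCOUNT = re.compile(r"[spu][0-9]+")


def unexpected_accounts(rows):
    """Return the (user, host) pairs whose user name does not match the pattern."""
    return [(user, host) for user, host in rows if not TOOL_ACCOUNT.search(user)]


# Hypothetical sample rows for illustration only, not real accounts.
sample = [
    ("s51234", "%"),
    ("u2029", "%"),
    ("p50380g1", "%"),
    ("exampleadmin", "10.%"),
    ("root", "localhost"),
]

for user, host in unexpected_accounts(sample):
    print(f"{user}@{host}")
```

A check along these lines could in principle run periodically and alert when an account outside an agreed allowlist appears, which is one possible answer to the monitoring question, under the assumption that someone owns and maintains that allowlist.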

jbond triaged this task as Medium priority. Aug 23 2023, 11:35 AM

Change 923681 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] wmcs: add wmcs-roots to roles where it is missing

https://gerrit.wikimedia.org/r/923681

Our grants are a mess, doubly so in cloud replicas. It's hard to actually remove them because we are not sure whether they are needed or not. We have cleaned up a lot, but way more needs to be done. So, as a first step, I'd really like to remove mw-related grants and users, and then it should be safer to open up root in cloud replicas. On top of that, I'd like to clean up some security stuff on two mysql users as well.

If the need is just to run maintain-views, we can add that to sudo policies of wmcs-roots.

Change 923681 merged by Jbond:

[operations/puppet@production] wmcs: add wmcs-roots to roles where it is missing

https://gerrit.wikimedia.org/r/923681