Page MenuHomePhabricator

Redefine mysql GRANTs for wikiadmin
Closed, ResolvedPublic

Description

Due to T157651: sql.php must not run LoadExtensionSchemaUpdates causing the drop of a table in production T249565: Wikidata's wb_items_per_site table has suddenly disappeared, creating DBQueryErrors on page views, one of the action items is to remove DROP and ALTER grant from the wikiadmin user.
This is what wikiadmin user currently have:

GRANT ALL PRIVILEGES ON `%wik%`.*

We should probably limit it to:

  • SELECT
  • UPDATE
  • INSERT
  • DELETE
  • PROCESS
  • REPLICATION CLIENT
  • CREATE
  • CREATE TEMPORARY TABLES
  • INDEX?

Anything else?

Event Timeline

CREATE is already there but I want to emphasize that we need it for two reasons: 1- Sometimes devs, with coordination with DBAs, create tables in production. I have done it before several times. 2- During the outage, one of the reasons I was able to reduce the impact was that I had the CREATE right to create the table. Does this include creating indexes as well, right?

creating indexes as well, right

No, that would require ALTER rights. Creates are a non issue, that is why we let them be created by owners (e.g new extension or new wiki). ALTERS, however, are a source of both performance and replication issues. Maintenance will have to be adapted to create the indexes at the same time than the tables, not after.

hmm, my problem is that the command to create the table is like this:

CREATE TABLE IF NOT EXISTS /*_*/wb_items_per_site (
  ips_row_id                 BIGINT unsigned     NOT NULL PRIMARY KEY AUTO_INCREMENT, -- row ID
  ips_item_id                INT unsigned        NOT NULL, -- Id of the item
  ips_site_id                VARBINARY(32)       NOT NULL, -- Site identifier (global)
  ips_site_page              VARCHAR(310)        NOT NULL -- Prefixed title of the page
) /*$wgDBTableOptions*/;

CREATE UNIQUE INDEX /*i*/wb_ips_item_site_page ON /*_*/wb_items_per_site (ips_site_id, ips_site_page);
CREATE INDEX /*i*/wb_ips_item_id ON /*_*/wb_items_per_site (ips_item_id);

I create the table first then I add the indexes like that. I can't add the indexes as part of creating the table. Maybe I'm missing something obvious here.

This is the wrong direction to attack the problem from, IMO. ALTER is useful but also dangerous; there is no way to separate functionality into non-overlapping "safe" and "not needed" sets.

The real problem is that a dump script should never run with an admin role in the first place. There should probably be a separate wikireader and wikieditor role (at least the latter exists already IIRC), dump scripts and manual debugging by default should use the first (read only + EXPLAIN), purge etc. scripts and manual debugging when explicitly requested would use the second (read/write but no DDL) and the role for schema changes should only be accessible with a separate explicit flag.

I can see that ALTER might be useful sometimes and might require a longer discussion and some other deep changes (more different roles etc), I don't think we should be keeping DROP for any reason, right?

I can't add the indexes as part of creating the table

Why not? This is literally the definition live on the DB:

CREATE TABLE `wb_items_per_site` (
  `ips_row_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `ips_item_id` int(10) unsigned NOT NULL,
  `ips_site_id` varbinary(32) NOT NULL,
  `ips_site_page` varbinary(310) NOT NULL,
  PRIMARY KEY (`ips_row_id`),
  UNIQUE KEY `wb_ips_item_site_page` (`ips_site_id`,`ips_site_page`),
  KEY `wb_ips_item_id` (`ips_item_id`)
)

If it cannot be done it is mediawiki's model fault, it is clearly possible on db, and more efficient/safer.

This is the wrong direction to attack the problem from

Removing wikiadmin grants, or stopping using that account and using 3 or 4 others without those grants are just a semantic difference :-D.

Anything else?

Does CREATE cover CREATE TEMPORARY TABLES? If not, we should most likely include that one too.

There's also INDEX, see below.

While we don't currently use views in production as far as I know, we might add SHOW VIEW in case we ever do.

creating indexes as well, right

No, that would require ALTER rights. Creates are a non issue, that is why we let them be created by owners (e.g new extension or new wiki). ALTERS, however, are a source of both performance and replication issues. Maintenance will have to be adapted to create the indexes at the same time than the tables, not after.

I note https://mariadb.com/kb/en/grant/#table-privileges lists an "INDEX" privilege that allows CREATE INDEX.

MediaWiki's MySQL/MariaDB schemas generally avoid adding indexes in the CREATE TABLE statement, possibly because the same tables.sql is also munged for SQLite and SQLite supports "PRIMARY KEY" and "UNIQUE" but not plain "KEY".

Anything else?

Does CREATE cover CREATE TEMPORARY TABLES? If not, we should most likely include that one too.

No, CREATE does not include temporary tables for that we need CREATE TEMPORARY TABLES. I will add it to the list, thank you.

There's also INDEX, see below.

While we don't currently use views in production as far as I know, we might add SHOW VIEW in case we ever do.

creating indexes as well, right

No, that would require ALTER rights. Creates are a non issue, that is why we let them be created by owners (e.g new extension or new wiki). ALTERS, however, are a source of both performance and replication issues. Maintenance will have to be adapted to create the indexes at the same time than the tables, not after.

I note https://mariadb.com/kb/en/grant/#table-privileges lists an "INDEX" privilege that allows CREATE INDEX.

Yes, INDEX grant would allow us to keep using the way we currently use via CREATE INDEX but not ALTER, it could be a solution for the specific case of needing to keep creating indexes on a separate way than including them on the table structure.
Note, that CREATE INDEX can still create replication issues, but it is of course a lot narrower than ALTER in terms of potential data loss (dropping a column etc) and available operations in general.

I can't add the indexes as part of creating the table

Why not? This is literally the definition live on the DB:

CREATE TABLE `wb_items_per_site` (
  `ips_row_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `ips_item_id` int(10) unsigned NOT NULL,
  `ips_site_id` varbinary(32) NOT NULL,
  `ips_site_page` varbinary(310) NOT NULL,
  PRIMARY KEY (`ips_row_id`),
  UNIQUE KEY `wb_ips_item_site_page` (`ips_site_id`,`ips_site_page`),
  KEY `wb_ips_item_id` (`ips_item_id`)
)

If it cannot be done it is mediawiki's model fault, it is clearly possible on db, and more efficient/safer.

You don't need to move the PK from the first line to the seperate line.

AFAIK, the reason we tend to prefer the CREATE INDEX lines after the CREATE TABLE (rather than inline like that) is for sqlite compatibility, to save some duplication, as both work fine for mysql, but only one form work for sqlite.

But doing a manual transform between the two, when not under pressure, isn't difficult

It seems there is a consensus to drop the DROP right. I suggest starting with this and then moving forward to dropping ALTER later (after investigation).

The command to run:

REVOKE DROP ON `%wik%`.* FROM 'wikiadmin'@'10.%';
REVOKE DROP ON `%wik%`.* FROM 'wikiadmin'@'localhost';

(there are two wikiadmin users, one on localhost, one on 10.%)

If there is no objection by end of next week, I'm gonna slowly roll this out.

That is not going to work everywhere. The problem with the wikiadmin user is that in some places it has GRANT ALL PRIVILEGES ON %wik%.*, so first we'd need to list all the grants like it has in some other places:

| GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, EVENT, TRIGGER ON `%wik%`.* TO `wikiadmin`@`10.%` |

So we need to double check first which grants we have (normally we should only have 10.% and localhost):

select user,host from user where user like 'wikiadmin';

And then check whether there are all the grants listed or it has ALL PRIVILEGES.

The command to run:

REVOKE DROP ON `%wik%`.* FROM 'wikiadmin'@'10.%';
REVOKE DROP ON `%wik%`.* FROM 'wikiadmin'@'localhost';

(there are two wikiadmin users, one on localhost, one on 10.%)

If there is no objection by end of next week, I'm gonna slowly roll this out.

This ^ would work if there are GRANTS listed, but let's not run that with replication though, instead host by host with binlog disabled.

Hosts with GRANT ALL PRIVILEGES that need fixing first:

s1:
db2097:3311
clouddb1021:3311
clouddb1013:3311

s2:
db1156
db1102:3312

s3:
none

s4:
none

s5:
db2101

s6:
none

s7:
none

s8:
db2100:3318
db1116:3318

x1:
none

x2:
none

pc1:
none

pc2:
none

pc3:
none

es1:
none

es2:
none

es3:
none

es4:
none

es5:
none

The s8 ones seem to be fixed now.

Mentioned in SAL (#wikimedia-operations) [2021-11-15T19:46:37Z] <Amir1> revoked all grants from wikiadmins and gave back explicit list on db2101:3315 (T249683)

I just fixed db2101:3315 (s5 on dbstore_multiinstance). Here are the exact commands for verbosity:

set session sql_log_bin=0;
REVOKE ALL ON `%wik%`.* FROM `wikiadmin`@`10.%`;
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, EVENT, TRIGGER ON `%wik%`.* TO `wikiadmin`@`10.%`;

It had a user on 10.% but no user on localhost. So nothing to fix there.

Mentioned in SAL (#wikimedia-operations) [2021-11-15T20:03:24Z] <Amir1> revoked all grants from wikiadmin and gave back an explicit list on db1102:3312 (T249683)

Mentioned in SAL (#wikimedia-operations) [2021-11-15T20:08:07Z] <Amir1> revoked all grants from wikiadmin and gave back an explicit list on clouddb1021:3311 (T249683)

Mentioned in SAL (#wikimedia-operations) [2021-11-15T20:09:09Z] <Amir1> revoked all grants from wikiadmin and gave back an explicit list on clouddb1013:3311 (T249683)

So:

  • s8 and backup source in s1 were already done.
  • Fixed the rest
  • Except sanitarium master of s2

Mentioned in SAL (#wikimedia-operations) [2021-11-18T06:17:18Z] <Amir1> revoked all grants from wikiadmin and gave back an explicit list on db1156 (T249683)

Mentioned in SAL (#wikimedia-operations) [2021-11-18T06:31:46Z] <Amir1> revoked all grants from wikiadmin and gave back an explicit list on db1102:3312 (T249683)

okay, now no wikiadmin user has "ALL" grant anymore. We can now move forward to dropping DROP :) After that we should think about dropping ALTER.

Mentioned in SAL (#wikimedia-operations) [2021-11-22T15:28:52Z] <Amir1> revoking DROP for wikiadmin from db1100 (T249683)

I removed drop from one of replicas of s5, will wait for a couple of days to see if any errors show up.

Ladsgroup changed the task status from Open to Stalled.Nov 23 2021, 10:08 AM
Ladsgroup moved this task from In progress to Blocked on the DBA board.

This is blocked on T296274: Clean up wikiadmin GRANTs mess. Hopefully I will fix that soon or at least the parts that block this (missing or extra appserver GRANTs)

Ladsgroup changed the task status from Stalled to Open.Nov 26 2021, 10:55 AM
Ladsgroup moved this task from Blocked to In progress on the DBA board.

I revoked DROP from all of pooled s5 replicas, I'll wait for a day before moving forward to the rest of the fleet.

Mentioned in SAL (#wikimedia-operations) [2021-11-30T08:14:25Z] <Amir1> revoking DROP from wikiadmin on all pooled replicas (T249683)

I ran it on all pooled replicas in codfw and eqiad. Now I need to run it on masters + backup sources

Change 742675 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[operations/puppet@production] mariadb: Fix production grants

https://gerrit.wikimedia.org/r/742675

Change 742675 merged by Ladsgroup:

[operations/puppet@production] mariadb: Fix production grants

https://gerrit.wikimedia.org/r/742675

Done now. There is no DROP right for wikiadmin. Next stop: ALTER.

Done now. There is no DROP right for wikiadmin. Next stop: ALTER.

Masters included too?

Marostegui lowered the priority of this task from High to Medium.Nov 30 2021, 9:24 AM

Decreasing the priority from high to medium as the DROP one has been removed.

Mentioned in SAL (#wikimedia-operations) [2021-12-13T09:38:01Z] <Amir1> revoking DROP from centralauth grant of wikiadmin (T249683)

Let's remove ALTER grant from wikiadmin

Going to do that if there is no objection (from devs) by the next week.

Revoked ALTER from wikiadmin on s6, going to wait a week before moving to the rest of infra.

Ladsgroup claimed this task.

Done :)