Page MenuHomePhabricator

Drop two now-unused indexes on flaggedrevs table
Closed, DeclinedPublic

Description

With 709218 merged and deployed (since Thursday), two indexes of fr_page_qal_rev and fr_page_qal_time should be now unused. We can drop them if DBAs confirm there is no more indexes reading from them.

Event Timeline

@Ladsgroup would you be able to tell us some wikis were we can test this? My go-to wiki for flaggedrevs is normally dewiki but the indexes there are page_qal_rev and page_qal_time , did you refer to those?

| flaggedrevs | CREATE TABLE `flaggedrevs` (
  `fr_page_id` int(11) NOT NULL DEFAULT 0,
  `fr_rev_id` int(11) NOT NULL DEFAULT 0,
  `fr_user` int(5) NOT NULL DEFAULT 0,
  `fr_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `fr_comment` mediumblob NOT NULL,
  `fr_quality` tinyint(1) NOT NULL DEFAULT 0,
  `fr_tags` mediumblob NOT NULL,
  `fr_text` mediumblob NOT NULL,
  `fr_flags` tinyblob NOT NULL,
  `fr_img_name` varbinary(255) DEFAULT NULL,
  `fr_img_timestamp` varbinary(14) DEFAULT NULL,
  `fr_img_sha1` varbinary(32) DEFAULT NULL,
  `fr_rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
  PRIMARY KEY (`fr_rev_id`),
  KEY `page_qal_rev` (`fr_page_id`,`fr_quality`,`fr_rev_id`),
  KEY `key_timestamp` (`fr_img_sha1`,`fr_img_timestamp`),
  KEY `page_rev` (`fr_page_id`,`fr_rev_id`),
  KEY `page_time` (`fr_page_id`,`fr_rev_timestamp`),
  KEY `page_qal_time` (`fr_page_id`,`fr_quality`,`fr_rev_timestamp`),
  KEY `fr_user` (`fr_user`)
) ENGINE=InnoDB DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED |

If you are talking about those indeed, they do seem unused on db1110 at least:

root@db1110.eqiad.wmnet[sys]> select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';
+---------------+-------------+---------------+
| object_schema | object_name | index_name    |
+---------------+-------------+---------------+
| dewiki        | flaggedrevs | page_qal_rev  |
| dewiki        | flaggedrevs | page_rev      |
| dewiki        | flaggedrevs | page_time     |
| dewiki        | flaggedrevs | page_qal_time |
+---------------+-------------+---------------+
4 rows in set (0.045 sec)

And a codfw (active host):

root@db2128.codfw.wmnet[sys]> select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';
+---------------+-------------+---------------+
| object_schema | object_name | index_name    |
+---------------+-------------+---------------+
| dewiki        | flaggedrevs | page_time     |
| dewiki        | flaggedrevs | page_qal_time |
| dewiki        | flaggedrevs | fr_user       |
+---------------+-------------+---------------+
3 rows in set (0.088 sec)

Let me know some other wikis too so I can double check. Despite of the indexes appearing there, we should remove them from some live hosts and see if some queries get mad.

@Ladsgroup would you be able to tell us some wikis were we can test this?

This is useful: https://noc.wikimedia.org/conf/highlight.php?file=dblists/flaggedrevs.dblist

I suggest ruwiki for example

My go-to wiki for flaggedrevs is normally dewiki but the indexes there are page_qal_rev and page_qal_time , did you refer to those?

| flaggedrevs | CREATE TABLE `flaggedrevs` (
  `fr_page_id` int(11) NOT NULL DEFAULT 0,
  `fr_rev_id` int(11) NOT NULL DEFAULT 0,
  `fr_user` int(5) NOT NULL DEFAULT 0,
  `fr_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `fr_comment` mediumblob NOT NULL,
  `fr_quality` tinyint(1) NOT NULL DEFAULT 0,
  `fr_tags` mediumblob NOT NULL,
  `fr_text` mediumblob NOT NULL,
  `fr_flags` tinyblob NOT NULL,
  `fr_img_name` varbinary(255) DEFAULT NULL,
  `fr_img_timestamp` varbinary(14) DEFAULT NULL,
  `fr_img_sha1` varbinary(32) DEFAULT NULL,
  `fr_rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
  PRIMARY KEY (`fr_rev_id`),
  KEY `page_qal_rev` (`fr_page_id`,`fr_quality`,`fr_rev_id`),
  KEY `key_timestamp` (`fr_img_sha1`,`fr_img_timestamp`),
  KEY `page_rev` (`fr_page_id`,`fr_rev_id`),
  KEY `page_time` (`fr_page_id`,`fr_rev_timestamp`),
  KEY `page_qal_time` (`fr_page_id`,`fr_quality`,`fr_rev_timestamp`),
  KEY `fr_user` (`fr_user`)
) ENGINE=InnoDB DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED |

Yes. I think it's just a schema drift

If you are talking about those indeed, they do seem unused on db1110 at least:

root@db1110.eqiad.wmnet[sys]> select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';
+---------------+-------------+---------------+
| object_schema | object_name | index_name    |
+---------------+-------------+---------------+
| dewiki        | flaggedrevs | page_qal_rev  |
| dewiki        | flaggedrevs | page_rev      |
| dewiki        | flaggedrevs | page_time     |
| dewiki        | flaggedrevs | page_qal_time |
+---------------+-------------+---------------+
4 rows in set (0.045 sec)

And a codfw (active host):

root@db2128.codfw.wmnet[sys]> select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';
+---------------+-------------+---------------+
| object_schema | object_name | index_name    |
+---------------+-------------+---------------+
| dewiki        | flaggedrevs | page_time     |
| dewiki        | flaggedrevs | page_qal_time |
| dewiki        | flaggedrevs | fr_user       |
+---------------+-------------+---------------+
3 rows in set (0.088 sec)

Interestingly, page_qal_rev is not here. Maybe it needs time to show up in unused indexes? It was deployed in Thursday.

Let me know some other wikis too so I can double check. Despite of the indexes appearing there, we should remove them from some live hosts and see if some queries get mad.

ruwiki, fiwiki, plwiki. These are good.

@Ladsgroup would you be able to tell us some wikis were we can test this?

This is useful: https://noc.wikimedia.org/conf/highlight.php?file=dblists/flaggedrevs.dblist

Oh thanks!

I suggest ruwiki for example

My go-to wiki for flaggedrevs is normally dewiki but the indexes there are page_qal_rev and page_qal_time , did you refer to those?

| flaggedrevs | CREATE TABLE `flaggedrevs` (
  `fr_page_id` int(11) NOT NULL DEFAULT 0,
  `fr_rev_id` int(11) NOT NULL DEFAULT 0,
  `fr_user` int(5) NOT NULL DEFAULT 0,
  `fr_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `fr_comment` mediumblob NOT NULL,
  `fr_quality` tinyint(1) NOT NULL DEFAULT 0,
  `fr_tags` mediumblob NOT NULL,
  `fr_text` mediumblob NOT NULL,
  `fr_flags` tinyblob NOT NULL,
  `fr_img_name` varbinary(255) DEFAULT NULL,
  `fr_img_timestamp` varbinary(14) DEFAULT NULL,
  `fr_img_sha1` varbinary(32) DEFAULT NULL,
  `fr_rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
  PRIMARY KEY (`fr_rev_id`),
  KEY `page_qal_rev` (`fr_page_id`,`fr_quality`,`fr_rev_id`),
  KEY `key_timestamp` (`fr_img_sha1`,`fr_img_timestamp`),
  KEY `page_rev` (`fr_page_id`,`fr_rev_id`),
  KEY `page_time` (`fr_page_id`,`fr_rev_timestamp`),
  KEY `page_qal_time` (`fr_page_id`,`fr_quality`,`fr_rev_timestamp`),
  KEY `fr_user` (`fr_user`)
) ENGINE=InnoDB DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED |

Yes. I think it's just a schema drift

We should try to review and fix those if there are more or at least get them recorded somewhere so we can iterate over it.

If you are talking about those indeed, they do seem unused on db1110 at least:

root@db1110.eqiad.wmnet[sys]> select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';
+---------------+-------------+---------------+
| object_schema | object_name | index_name    |
+---------------+-------------+---------------+
| dewiki        | flaggedrevs | page_qal_rev  |
| dewiki        | flaggedrevs | page_rev      |
| dewiki        | flaggedrevs | page_time     |
| dewiki        | flaggedrevs | page_qal_time |
+---------------+-------------+---------------+
4 rows in set (0.045 sec)

And a codfw (active host):

root@db2128.codfw.wmnet[sys]> select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';
+---------------+-------------+---------------+
| object_schema | object_name | index_name    |
+---------------+-------------+---------------+
| dewiki        | flaggedrevs | page_time     |
| dewiki        | flaggedrevs | page_qal_time |
| dewiki        | flaggedrevs | fr_user       |
+---------------+-------------+---------------+
3 rows in set (0.088 sec)

Interestingly, page_qal_rev is not here. Maybe it needs time to show up in unused indexes? It was deployed in Thursday.

Could be - going to give db2128 a restart to get its stats back from 0.

Let me know some other wikis too so I can double check. Despite of the indexes appearing there, we should remove them from some live hosts and see if some queries get mad.

ruwiki, fiwiki, plwiki. These are good.

Sweet. Going to review those too.

Mentioned in SAL (#wikimedia-operations) [2021-08-09T12:38:53Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2128 T288398', diff saved to https://phabricator.wikimedia.org/P16973 and previous config saved to /var/cache/conftool/dbconfig/20210809-123852-marostegui.json

Marostegui added a project: DBA.
Marostegui moved this task from Triage to In progress on the DBA board.

This is ruwiki:

# ./section s6 | grep codfw | egrep -v "db2094|db2095" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_schema like 'ruwiki' and object_name like 'flaggedrevs';"| egrep -i "page_qal_rev|page_qal_time" ;done
db2141.codfw.wmnet:3316
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2129.codfw.wmnet:3306
db2124.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2117.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_time
db2114.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_time
db2089.codfw.wmnet:3316
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2087.codfw.wmnet:3316
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2076.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_time

So interestingly, page_qal_rev isn't always being marked as unused.
The hosts that don't mark it are the ones serving dumps (db2117) and api (db2076, db2114).

If we take a look at dewiki:

# ./section s5 | grep codfw | egrep -v "db2094|db2095" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';"| egrep -i "page_qal_rev|page_qal_time" ;dones5 | grep codfw | egrep -v "db2094|db2095" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';"|
db2137.codfw.wmnet:3315
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2128.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2123.codfw.wmnet:3306
db2113.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2111.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2101.codfw.wmnet:3315
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2089.codfw.wmnet:3315
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2075.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_time

db2075 serves api along with db2128, we'll see if db2128 (which was just rebooted) marks it or not as unused, if it doesn't...maybe there's some API call query using it?

And plwiki:

# ./section s2 | grep codfw | egrep -v "db2094|db2095" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_schema like 'plwiki' and object_name like 'flaggedrevs';"| egrep -i "page_qal_rev|page_qal_time" ;done
db2148.codfw.wmnet:3306
db2138.codfw.wmnet:3312
db2126.codfw.wmnet:3306
db2125.codfw.wmnet:3306
db2107.codfw.wmnet:3306
plwiki	flaggedrevs	page_qal_rev
db2104.codfw.wmnet:3306
db2101.codfw.wmnet:3312
plwiki	flaggedrevs	page_qal_rev
plwiki	flaggedrevs	page_qal_time
db2097.codfw.wmnet:3312
plwiki	flaggedrevs	page_qal_rev
plwiki	flaggedrevs	page_qal_time
db2088.codfw.wmnet:3312

I guess it has a lot less queries so the stats are a bit messed up?

So after a few days, results seem to be the same.

root@db2128.codfw.wmnet[(none)]>  select * from sys.schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';
+---------------+-------------+---------------+
| object_schema | object_name | index_name    |
+---------------+-------------+---------------+
| dewiki        | flaggedrevs | page_time     |
| dewiki        | flaggedrevs | page_qal_time |
| dewiki        | flaggedrevs | fr_user       |
+---------------+-------------+---------------+
3 rows in set (0.127 sec)

So again, it could be that there are no many queries, or that page_qal_rev is really used by some queries.

Looking at s6, it seems that some hosts do use that one:

# ./section s6 | grep codfw | egrep -v "db2094|db2095" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_schema like 'ruwiki' and object_name like 'flaggedrevs';"| egrep -i "page_qal_rev|page_qal_time" ;done
db2141.codfw.wmnet:3316
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2129.codfw.wmnet:3306
db2124.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2117.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_time
db2114.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_time
db2089.codfw.wmnet:3316
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2087.codfw.wmnet:3316
ruwiki	flaggedrevs	page_qal_rev
ruwiki	flaggedrevs	page_qal_time
db2076.codfw.wmnet:3306
ruwiki	flaggedrevs	page_qal_time
# ./section s5 | grep codfw | egrep -v "db2094|db2095" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_schema like 'dewiki' and object_name like 'flaggedrevs';"| egrep -i "page_qal_rev|page_qal_time" ;done
db2137.codfw.wmnet:3315
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2128.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_time
db2123.codfw.wmnet:3306
db2113.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2111.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2101.codfw.wmnet:3315
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2089.codfw.wmnet:3315
dewiki	flaggedrevs	page_qal_rev
dewiki	flaggedrevs	page_qal_time
db2075.codfw.wmnet:3306
dewiki	flaggedrevs	page_qal_time

The stats seem to be the same it looks like it is used by something. Given that we are not under super disk space constraints for this particular table I am inclined to decline this one. Finding the query that uses it might take ages.
We do have space issues on flaggedtemplates though (44GB on dewiki for instance) and.....168G on ruwiki. That one is approaching a very dangerous zone.

@Ladsgroup thoughts?

hmm, okay give me a couple of days or weeks to find out what's reading this index (I want to simplify its schema one by one)

That templates table seems horrifying. I take a look at that mess separately

@Ladsgroup would you be able to tell us some wikis were we can test this?

This is useful: https://noc.wikimedia.org/conf/highlight.php?file=dblists/flaggedrevs.dblist

I suggest ruwiki for example

My go-to wiki for flaggedrevs is normally dewiki but the indexes there are page_qal_rev and page_qal_time , did you refer to those?

| flaggedrevs | CREATE TABLE `flaggedrevs` (
  `fr_page_id` int(11) NOT NULL DEFAULT 0,
  `fr_rev_id` int(11) NOT NULL DEFAULT 0,
  `fr_user` int(5) NOT NULL DEFAULT 0,
  `fr_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `fr_comment` mediumblob NOT NULL,
  `fr_quality` tinyint(1) NOT NULL DEFAULT 0,
  `fr_tags` mediumblob NOT NULL,
  `fr_text` mediumblob NOT NULL,
  `fr_flags` tinyblob NOT NULL,
  `fr_img_name` varbinary(255) DEFAULT NULL,
  `fr_img_timestamp` varbinary(14) DEFAULT NULL,
  `fr_img_sha1` varbinary(32) DEFAULT NULL,
  `fr_rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
  PRIMARY KEY (`fr_rev_id`),
  KEY `page_qal_rev` (`fr_page_id`,`fr_quality`,`fr_rev_id`),
  KEY `key_timestamp` (`fr_img_sha1`,`fr_img_timestamp`),
  KEY `page_rev` (`fr_page_id`,`fr_rev_id`),
  KEY `page_time` (`fr_page_id`,`fr_rev_timestamp`),
  KEY `page_qal_time` (`fr_page_id`,`fr_quality`,`fr_rev_timestamp`),
  KEY `fr_user` (`fr_user`)
) ENGINE=InnoDB DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED |

Yes. I think it's just a schema drift

It must be a drift, because the indexes are renamed without alter statements - https://gerrit.wikimedia.org/r/c/mediawiki/extensions/FlaggedRevs/+/110451

@Ladsgroup I am moving this to Blocked on our dashboard just to reflect that this is not ready (T288398#7288014) - no rush though, this is just for your information.

@Ladsgroup I am moving this to Blocked on our dashboard just to reflect that this is not ready (T288398#7288014) - no rush though, this is just for your information.

Is there a way in the logs to see what queries have used the index?

@Ladsgroup do you think it is worth investing time on this? I am not sure if the cost-benefit is worth it or if we should just decline this task.

Database-wise, yeah it's not worth the effort but I have been pushing to make this extension simpler because due to its unbelievable complexity it's unmaintainable. OTOH, we probably can simply drop columns as well. I can work on that.

ok, let's decline this task. If we happen to be able drop the columns, we can create a schema change task for that whenever the time arrives (no rush)

Just linking T115148, where the same is requested

I think I know why it's being used, it's technically useless but optimizer picks it up because of cardinality in some wikis that used to have non-zero fr_quality (e.g. ruwiki), that's why in dewiki it's never used, in dewiki it was never non-zero. Writing a maint script to clean these rows would fix it.