
Review special replica partitioning of certain tables by `xx_user`
Closed, Resolved · Public

Description

I came across the fact today that the logging table is partitioned by log_user on certain replicas. Reasonably soon that column will be dropped as part of T188327: Deploy refactored actor storage.

As of this morning we shouldn't have any queries reading log_user anymore (they're using log_actor instead), and the writes will be turned off over the next few weeks (leaving the column with its default 0 in new rows). If turning off the writes goes well, the task for dropping log_user will hopefully be filed in early June.

I see we have partitioning of revision by rev_user on some replicas as well. That table is in a worse situation with respect to reads (queries formerly targeting rev_user are now targeting the revision_actor_temp table) and is the same for writes. The final dropping of rev_user will not be in June; I have no timeframe for T215466 yet.

If we have other core tables partitioned by an xx_user column, they're likely in the same situation as logging.

I have no idea whether we should partition by something else (log_actor?) or just remove the partitioning. The reasoning for separate replicas in 085e3563ed may still apply even without the partitioning.

Note this task doesn't necessarily require that anything be done now. I just wanted to raise the issue with some lead time before the schema change request.

Related Objects

Event Timeline


Thanks for raising this.
Keep in mind that revision and logging are partitioned, as you pointed out, only on the special slaves (those in the recentchanges, watchlist, etc. groups). But to make it even more interesting, they are only partitioned on certain wikis (normally the big ones). For instance, logging is partitioned on enwiki but not on small wikis like the ones on s3, as those are small enough not to run into problems. The baseline for deciding which wikis need it and which don't is probably (and this is a guess, as I wasn't here when it was decided) query performance past a certain number of rows.
And if that wasn't enough, there are some wikis that have fewer partitions, for example revision on enwiki vs. eswiki.

And of course, those special slaves also have different PKs for revision and logging.

I am not sure whether we should partition by log_actor, but it is probably worth checking the query performance of the queries that will be arriving at those special slaves (which I guess is what was done for 085e3563ed at the time).
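
To make the setup described above concrete, the partitioned layout on the special slaves would look roughly like the following, expressed as the ALTER that would produce it. This is purely illustrative: whether HASH or RANGE is used and how many partitions each wiki gets varies per host and is not stated here.

-- Illustrative sketch only; scheme and partition count are assumptions.
ALTER TABLE logging
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (log_id, log_user),           -- the partitioning column must be part of every unique key
  PARTITION BY HASH (log_user) PARTITIONS 16;   -- placeholder partition count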

@Marostegui moved this task from In progress to Blocked external/Not db team on the DBA board.

I guess we should ask Performance-Team then?

I moved it into that column of the DBA dashboard because we have already given our opinion and we are blocked on a decision about how/when we can move forward.

So, I guess we'd need to decide what to do with the logging table on the special replicas as we are already at T233135: Schema change for refactored actor and comment storage.
While 085e3563ed0c might still apply, it is also true that we have more powerful hardware (more buffer pool) and especially faster SSD storage across the fleet.
Removing the partitions isn't the only issue; we also have to modify the PK of the logging table, which currently is:

PRIMARY KEY (`log_id`,`log_user`),

That can lead to all sorts of issues with the optimizer, but it is not necessarily bad anymore. Maybe what we can try is to place a normal replica (non-partitioned, and with a log_id PK) into the enwiki logging section and see how it performs.

I will try to place a non-partitioned host tomorrow on the enwiki logging section for some time and enable the slow log to see what I get for the logging table, so we can see how it would be affected and whether we need to partition by something else or change the PK.

Marostegui triaged this task as Medium priority. Sep 18 2019, 3:19 PM

Two options: a) the partitioning is no longer needed due to the new schema (preferred), or b) the partitioning is still needed, in which case partition by actor id. Check on a large enwiki host with low weight.
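
Option b) would be essentially the same layout as sketched earlier, just keyed on the actor id instead of the user id; again, the scheme and partition count below are placeholders, not a concrete proposal.

ALTER TABLE logging
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (log_id, log_actor),          -- log_actor has to join the PK for the partitioning to be valid
  PARTITION BY HASH (log_actor) PARTITIONS 16;  -- placeholder partition count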

I have captured 30 minutes of read queries on db1105 (rc on enwiki) and replayed them into db1089 (non-partitioned enwiki slave), and nothing has stuck out, especially nothing related to the logging table.
So I am going to pool db1089 into the logpager service and see what we get, leaving it there for a few hours.

Mentioned in SAL (#wikimedia-operations) [2019-09-19T06:04:41Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Temporarily pool db1089 into enwiki logpager T223151', diff saved to https://phabricator.wikimedia.org/P9130 and previous config saved to /var/cache/conftool/dbconfig/20190919-060440-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-09-19T07:22:35Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Give more logpager weight to db1089 T223151', diff saved to https://phabricator.wikimedia.org/P9131 and previous config saved to /var/cache/conftool/dbconfig/20190919-072234-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-09-19T13:08:49Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1089 into contributions service T223151', diff saved to https://phabricator.wikimedia.org/P9133 and previous config saved to /var/cache/conftool/dbconfig/20190919-130848-marostegui.json

db1089 has now been serving the logpager service for almost 8h, handling 50% of its traffic:

"logpager": {
   "db1089": 2,
   "db1099:3311": 1,
   "db1105:3311": 1

There have been no slow queries reported, nothing regarding query response time has changed either, and finally, no trace of errors in the logs. So that is looking good. I also found no issues when replaying the traffic captured on db1105 (T223151#5505180).

I have now also included it in the contributions service and will leave it like that until tomorrow.

"contributions": {
  "db1089": 1,
  "db1099:3311": 1,
  "db1105:3311": 1

If everything keeps looking good, I will go ahead and alter one of the existing special slaves and remove its partitioning and change its PK from:
PRIMARY KEY (log_id,log_user),

To:

`PRIMARY KEY (log_id),`

I would create a subtask to track that schema change, to be done before (or along with) T233135: Schema change for refactored actor and comment storage.
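
A minimal sketch of the schema change being described, as two separate ALTERs (the PK cannot drop log_user while the table is still partitioned by it); the real execution would go through the usual depool/alter/repool workflow tracked in that subtask.

ALTER TABLE logging REMOVE PARTITIONING;
ALTER TABLE logging DROP PRIMARY KEY, ADD PRIMARY KEY (log_id);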

In order to give more context: if partitioning were still needed, query timeouts/long-running queries would happen almost immediately for queries like recentchanges/user contributions, I believe especially for enwiki, commonswiki and wikidata. Refactoring, index hints, the MariaDB version and schema changes may have made it obsolete (that would be great news: 14 fewer special servers to maintain)?

After leaving db1089 pooled all night, this is the only query reported involving the logging table that shows some differences:

SELECT /* IndexPager::buildQueryInfo (LogPager) */ /*! STRAIGHT_JOIN */ log_id, log_type, log_action, log_timestamp, log_namespace, log_title, log_params, log_deleted, user_id, user_name, user_editcount, comment_log_comment.comment_text AS `log_comment_text`, comment_log_comment.comment_data AS `log_comment_data`, comment_log_comment.comment_id AS `log_comment_cid`, actor_log_user.actor_user AS `log_user`, actor_log_user.actor_name AS `log_user_text`, log_actor, (SELECT GROUP_CONCAT(ctd_name SEPARATOR ', ') FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id)) WHERE ct_log_id=log_id ) AS `ts_tags` FROM `logging` JOIN `comment` `comment_log_comment` ON ((comment_log_comment.comment_id = log_comment_id)) JOIN `actor` `actor_log_user` ON ((actor_log_user.actor_id = log_actor)) LEFT JOIN `user` ON ((user_id=actor_log_user.actor_user)) WHERE (log_type NOT IN ('xx', 'xx', 'xx', 'xx', 'xx')) AND ((log_actor = 'xx')) AND ((log_deleted & 4) = 0) AND (log_type != 'review') AND (log_type != 'xx') AND (log_type != 'xx') AND (log_type != 'xx') ORDER BY log_timestamp DESC LIMIT 51;

This query on db1089 (non partitioned) has a different plan from the one on db1105 (partitioned). However, none of the differences affect the log_user field or the PK; it is just the times vs actor_time index.
The query on db1089 takes 8 seconds and on db1105 it takes 0.01 seconds.
These are the explains:

db1105

+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+------+-------------+
| id   | select_type        | table               | type   | possible_keys                                            | key                   | key_len | ref                              | rows | Extra       |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+------+-------------+
|    1 | PRIMARY            | logging             | ref    | type_time,actor_time,log_actor_type_time,log_type_action | actor_time            | 8       | const                            |   10 | Using where |
|    1 | PRIMARY            | comment_log_comment | eq_ref | PRIMARY                                                  | PRIMARY               | 8       | enwiki.logging.log_comment_id    |    1 |             |
|    1 | PRIMARY            | actor_log_user      | const  | PRIMARY                                                  | PRIMARY               | 8       | const                            |    1 |             |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | enwiki.actor_log_user.actor_user |    1 | Using where |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_log_tag_id,change_tag_tag_id_id               | change_tag_log_tag_id | 5       | enwiki.logging.log_id            |   15 | Using index |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | enwiki.change_tag.ct_tag_id      |    1 |             |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+------+-------------+
6 rows in set (0.00 sec)

db1089

+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+-------+-------------+
| id   | select_type        | table               | type   | possible_keys                                            | key                   | key_len | ref                              | rows  | Extra       |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+-------+-------------+
|    1 | PRIMARY            | logging             | index  | type_time,actor_time,log_actor_type_time,log_type_action | times                 | 16      | NULL                             | 58752 | Using where |
|    1 | PRIMARY            | comment_log_comment | eq_ref | PRIMARY                                                  | PRIMARY               | 8       | enwiki.logging.log_comment_id    |     1 |             |
|    1 | PRIMARY            | actor_log_user      | const  | PRIMARY                                                  | PRIMARY               | 8       | const                            |     1 |             |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | enwiki.actor_log_user.actor_user |     1 | Using where |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_log_tag_id,change_tag_tag_id_id               | change_tag_log_tag_id | 5       | enwiki.logging.log_id            |    23 | Using index |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | enwiki.change_tag.ct_tag_id      |     1 |             |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+-------+-------------+
6 rows in set (0.00 sec)

This same query on a codfw host (non partitioned) uses the log_actor_type_time index, which could mean the optimizer is being dumb again in some cases.

db2102:

+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+-------+----------------------------------------------------+
| id   | select_type        | table               | type   | possible_keys                                            | key                   | key_len | ref                              | rows  | Extra                                              |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+-------+----------------------------------------------------+
|    1 | PRIMARY            | logging             | range  | type_time,actor_time,log_actor_type_time,log_type_action | log_actor_type_time   | 42      | NULL                             | 56299 | Using index condition; Using where; Using filesort |
|    1 | PRIMARY            | comment_log_comment | eq_ref | PRIMARY                                                  | PRIMARY               | 8       | enwiki.logging.log_comment_id    |     1 |                                                    |
|    1 | PRIMARY            | actor_log_user      | const  | PRIMARY                                                  | PRIMARY               | 8       | const                            |     1 |                                                    |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | enwiki.actor_log_user.actor_user |     1 | Using where                                        |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_log_tag_id,change_tag_tag_id_id               | change_tag_log_tag_id | 5       | enwiki.logging.log_id            |    14 | Using index                                        |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | enwiki.change_tag.ct_tag_id      |     1 |                                                    |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+----------------------------------+-------+----------------------------------------------------+
6 rows in set (0.04 sec)

This could be a matter of refreshing the stats on that table anyway, which is what I will try next, first on db2102 and later on db1089.
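
A minimal sketch of that stats refresh, using enwiki as the example database; SHOW INDEX is just there to compare the cardinality estimates before and after.

USE enwiki;
SHOW INDEX FROM logging;   -- note the Cardinality estimates for times, actor_time, log_actor_type_time
ANALYZE TABLE logging;     -- recalculate the index statistics
SHOW INDEX FROM logging;   -- compare the estimates after the refresh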

Mentioned in SAL (#wikimedia-operations) [2019-09-20T05:27:27Z] <marostegui> Analyze table enwiki.logging on db2102 - T223151

Mentioned in SAL (#wikimedia-operations) [2019-09-20T05:29:04Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Remove db1089 from logpager and contributions after testing, repool back with normal weight on main traffic T223151', diff saved to https://phabricator.wikimedia.org/P9136 and previous config saved to /var/cache/conftool/dbconfig/20190920-052902-marostegui.json

So, running the analyze didn't change things on db2102; it kept choosing log_actor_type_time.
However, an alter table engine=InnoDB,force made the optimizer choose the times index on that host, which is an even worse plan: the optimizer being dumb again!

So I don't think running the analyze on db1089 will change things, but I can try next week.

On db1089 the optimizer is definitely being dumb as well; this is the real plan it uses:

root@db1089.eqiad.wmnet[enwiki]> show explain for 7320686887;
+------+--------------------+---------------------+--------+--------------------------------------------+-----------------------+---------+-------------------------------+----------+----------------+
| id   | select_type        | table               | type   | possible_keys                              | key                   | key_len | ref                           | rows     | Extra          |
+------+--------------------+---------------------+--------+--------------------------------------------+-----------------------+---------+-------------------------------+----------+----------------+
|    1 | PRIMARY            | actor_log_user      | const  | PRIMARY                                    | PRIMARY               | 8       | const                         |        1 | Using filesort |
|    1 | PRIMARY            | user                | const  | PRIMARY                                    | PRIMARY               | 4       | const                         |        1 |                |
|    1 | PRIMARY            | logging             | index  | NULL                                       | times                 | 16      | NULL                          | 78474403 | Using where    |
|    1 | PRIMARY            | comment_log_comment | eq_ref | PRIMARY                                    | PRIMARY               | 8       | enwiki.logging.log_comment_id |        1 |                |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_log_tag_id,change_tag_tag_id_id | change_tag_log_tag_id | 5       | enwiki.logging.log_id         |       23 | Using index    |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                    | PRIMARY               | 4       | enwiki.change_tag.ct_tag_id   |        1 |                |
+------+--------------------+---------------------+--------+--------------------------------------------+-----------------------+---------+-------------------------------+----------+----------------+
6 rows in set, 1 warning (0.00 sec)

If hinted to use the actor_time index, the query runs in 0.01 on db1089, just like on db1105.

This query seems to be choosing between three indexes depending on the host: times (bad one), actor_time (good one), log_actor_type_time (which doesn't seem as bad as times and the query runs in 0.01 too).
@Anomie should we try to hint the optimizer to choose actor_time, or to IGNORE times, for this query, given that either actor_time or log_actor_type_time makes the query run faster? That would allow us to get rid of the partitioning and the different PK.

Host and indexes used:

db2130.codfw.wmnet:3306
log_actor_type_time
db2116.codfw.wmnet:3306
log_actor_type_time
db2112.codfw.wmnet:3306
log_actor_type_time
db2103.codfw.wmnet:3306
log_actor_type_time
db2102.codfw.wmnet:3306
times
db2097.codfw.wmnet:3311
log_actor_type_time
db2094.codfw.wmnet:3311
times
db2092.codfw.wmnet:3306
times
db2088.codfw.wmnet:3311
actor_time
db2085.codfw.wmnet:3311
actor_time
db2072.codfw.wmnet:3306
times
db2071.codfw.wmnet:3306
times
db2048.codfw.wmnet:3306
times
db1139.eqiad.wmnet:3311
log_actor_type_time
db1134.eqiad.wmnet:3306
actor_time
db1124.eqiad.wmnet:3311
times
db1119.eqiad.wmnet:3306
times
db1118.eqiad.wmnet:3306
times
db1114.eqiad.wmnet:3306
actor_time
db1106.eqiad.wmnet:3306
times
db1105.eqiad.wmnet:3311
actor_time
db1099.eqiad.wmnet:3311
actor_time
db1089.eqiad.wmnet:3306
times
db1083.eqiad.wmnet:3306
times
db1080.eqiad.wmnet:3306
times
db1067.eqiad.wmnet:3306
times

It might be a coincidence, but the 10.3 host (db1114) does use the best index.
This is the exact query that was reported: P9137

I have also been testing https://mariadb.com/kb/en/library/server-system-variables/#optimizer_use_condition_selectivity, which is what the MariaDB people recommended to me during our chat at Percona Live, but that change needs a lot more research than just a couple of hours, as it can impact all query plans, performance, etc., so I would leave it aside from this particular issue.
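
For reference, that kind of test can be done at session scope so it does not affect other traffic; the value 4 below is only an example, not a recommendation.

SET SESSION optimizer_use_condition_selectivity = 4;   -- example value only
-- re-run the EXPLAIN for the LogPager query above and compare the chosen index
SET SESSION optimizer_use_condition_selectivity = DEFAULT;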

For what it's worth, and to give some perspective, the above query has only been reported 13 times in 24h, so it is not such a big deal.

All that reminds me of T71222#4994266. We might have to add index ignoring to LogPager like we did for ApiQueryLogEvents in rMWe6021abc9c16: ApiQueryLogEvents: Add IGNORE INDEX to avoid MariaDB optimizer bug.

Yeah, I believe so too, so +1 to include the IGNORE.

I am going to create a page on wikitech to document these cases (this and the one you mentioned) so we can check and remove those once 10.3 is deployed (if the optimizer keeps looking good there)
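
For illustration, a heavily simplified form of the hinted query (column list trimmed, literal values are placeholders); the actual change lands in LogPager itself via the patch below.

SELECT /* IndexPager::buildQueryInfo (LogPager), simplified */
       log_id, log_type, log_action, log_timestamp, log_actor
FROM logging IGNORE INDEX (times)
WHERE log_actor = 12345            -- placeholder actor id
  AND (log_deleted & 4) = 0
ORDER BY log_timestamp DESC
LIMIT 51;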

Change 538326 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] LogPager: Add IGNORE INDEX to avoid MariaDB optimizer bug

https://gerrit.wikimedia.org/r/538326

Change 538326 merged by jenkins-bot:
[mediawiki/core@master] LogPager: Add IGNORE INDEX to avoid MariaDB optimizer bug

https://gerrit.wikimedia.org/r/538326

@Marostegui: This patch should be deployed on all Wikimedia wikis by Thursday, somewhere around 19:00–21:00 UTC. If it's blocking something, let me know and we can backport it.

Thanks - no, nothing is blocked. I can start with codfw.

I assume this was finally deployed, right? I am asking because I was reviewing the slow queries just in case the IGNORE was, for some reason, causing something (even though we tested it), and I haven't seen anything related to this specific query (which is good).

Yes, it's included in wmf/1.34.0-wmf.24. As of about 19:23 UTC on 2019-09-26, according to https://wikitech.wikimedia.org/wiki/Server_Admin_Log, that was deployed to all Wikimedia wikis.

So we've got rid of all the partitions on logging at T233625.
Next is revision.
I think I am going to capture some of the traffic that arrives at the special slaves for the revision table and replay it on a non-partitioned replica, to see what we get in terms of slow queries as a first approach.

I have done a couple of tests, one on enwiki and another one on dewiki.

enwiki
I have captured 118821 selects that arrived at the revision table on the special partitioned slave that is currently serving all of enwiki (the other one is depooled for maintenance).
I picked a non-partitioned host in codfw and replayed all the selects there; even though the host was cold, nothing took more than a few ms to run. That is good news.

dewiki
Same test done: captured 156645 selects on the revision table on one of the special partitioned slaves for dewiki.
Replayed the traffic on a non-partitioned codfw slave, with no issues or slow queries observed.

Given that I am doing some maintenance on enwiki at the moment, I am going to do the first real tests on dewiki. I will place a non-partitioned slave to serve (at first with a small weight) contributions, logpager, recentchanges, recentchangeslinked and watchlist, which is exactly what the partitioned slaves serve. If nothing pops up, I will increase the LB weight to the same level as the other 2 hosts and leave it serving for a few days to see if anything shows up.
If it goes fine, I will do the same with enwiki.

Mentioned in SAL (#wikimedia-operations) [2019-10-17T07:13:09Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db1130 (non partitioned host) into s5 special group with low weight - T223151', diff saved to https://phabricator.wikimedia.org/P9370 and previous config saved to /var/cache/conftool/dbconfig/20191017-071308-marostegui.json

So far no slow queries have been reported for db1130 on s5. I will probably give it more traffic on Monday; for the weekend I will leave it pooled as it is now:

"contributions": {
  "db1096:3315": 100,
  "db1097:3315": 100,
  "db1130": 50
},
"logpager": {
  "db1096:3315": 100,
  "db1097:3315": 100,
  "db1130": 50
},
"recentchanges": {
  "db1096:3315": 100,
  "db1097:3315": 100,
  "db1130": 50
},
"recentchangeslinked": {
  "db1096:3315": 100,
  "db1097:3315": 100,
  "db1130": 50
},
"watchlist": {
  "db1096:3315": 100,
  "db1097:3315": 100,
  "db1130": 50

Mentioned in SAL (#wikimedia-operations) [2019-10-21T05:58:44Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Give weight 100 to db1130 on s5 to check for slow queries T223151', diff saved to https://phabricator.wikimedia.org/P9402 and previous config saved to /var/cache/conftool/dbconfig/20191021-055843-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-10-21T07:06:56Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool non partitioned db1089 into s1 special slaves to check for slow queries T223151', diff saved to https://phabricator.wikimedia.org/P9406 and previous config saved to /var/cache/conftool/dbconfig/20191021-070655-marostegui.json

I have pooled db1089 (non partitioned host) with some weight into s1 to see if we get some reports of slow queries arriving to a non partitioned revision table.

So far on enwiki there are no slow queries reported only on db1089.
There is one query being reported on db1089, db1105:3311 and db1099:3311; since it is being reported on all three of them, that means no regression on the non-partitioned host.
The query plan is the same for both non-partitioned and partitioned hosts: {P9407}

Mentioned in SAL (#wikimedia-operations) [2019-10-23T07:04:47Z] <marostegui> Enable slow query log 1/10 on db1089 (enwiki) T223151

So far nothing has shown up on db1089 (non-partitioned enwiki.revision serving the special groups), so that's good news.
A few days ago, while chatting with Jaime, he pointed out that in the past he has also used the slow query log (with some sampling) to try to catch, or cover, a wider range of queries.
So I have also enabled that on db1089 to see if we catch anything else on the revision table before giving db1089 more weight, so that it serves 1/3 of the requests to the special groups.

I will report back with what I find on the slow query log, if anything.
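
A sketch of the kind of sampled slow logging described here; the variables are MariaDB's, but the exact values used on db1089 are assumptions.

SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0;         -- assumed: log sampled queries regardless of runtime
SET GLOBAL log_slow_rate_limit = 10;    -- roughly 1 out of every 10 queries, matching the 1/10 sampling
-- the later 1/20 run would simply raise log_slow_rate_limit to 20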

Mentioned in SAL (#wikimedia-operations) [2019-10-23T12:27:09Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Remove db1130 from the special slaves group on s5 and leave it back with its original pooling options T223151', diff saved to https://phabricator.wikimedia.org/P9454 and previous config saved to /var/cache/conftool/dbconfig/20191023-122708-marostegui.json

I have removed db1130 from s5 special groups and left it back with its original pooling options.
Nothing has shown up on dewiki after having a non partitioned host serving there for a week.

Let's see what enwiki brings...

Addendum: there is also logging of long-running queries that were killed:

mysql@db1089[ops]> select * FROM ops.event_log;
+-----------+---------------------+---------------------------+-----------------+
| server_id | stamp               | event                     | content         |
+-----------+---------------------+---------------------------+-----------------+
| 171974771 | 2019-10-22 13:48:35 | wmf_slave_wikiuser_sleep  | kill 9746531363 |
| 171974771 | 2019-10-22 13:49:35 | wmf_slave_wikiuser_sleep  | kill 9746566948 |
| 171974771 | 2019-10-22 13:51:35 | wmf_slave_wikiuser_sleep  | kill 9746683663 |
| 171974771 | 2019-10-22 13:54:35 | wmf_slave_wikiuser_sleep  | kill 9746814675 |
| 171974771 | 2019-10-22 13:55:35 | wmf_slave_wikiuser_sleep  | kill 9746883265 |
| 171974771 | 2019-10-22 14:04:05 | wmf_slave_wikiuser_sleep  | kill 9747329948 |
| 171974771 | 2019-10-22 14:10:35 | wmf_slave_wikiuser_sleep  | kill 9747629863 |
| 171974771 | 2019-10-22 14:14:05 | wmf_slave_wikiuser_sleep  | kill 9747812072 |
| 171974771 | 2019-10-22 14:28:05 | wmf_slave_wikiuser_sleep  | kill 9748462422 |
| 171974771 | 2019-10-22 16:04:35 | wmf_slave_wikiuser_sleep  | kill 9753317384 |
| 171974771 | 2019-10-22 23:26:05 | wmf_slave_wikiuser_sleep  | kill 9775220591 |
| 171974771 | 2019-10-23 01:30:35 | wmf_slave_wikiuser_sleep  | kill 9780945455 |
| 171974771 | 2019-10-23 03:02:35 | wmf_slave_wikiuser_sleep  | kill 9785159042 |
| 171974771 | 2019-10-23 08:52:05 | wmf_slave_wikiuser_sleep  | kill 9799620004 |
+-----------+---------------------+---------------------------+-----------------+
14 rows in set (0.00 sec)

More interesting:

root@db1105.eqiad.wmnet[ops]> select * FROM event_log;
+-----------+---------------------+-------------------------------+--------------------------------------------------------------------------------------------------------------------+
| server_id | stamp               | event                         | content                                                                                                            |
+-----------+---------------------+-------------------------------+--------------------------------------------------------------------------------------------------------------------+
| 171974878 | 2019-10-22 13:47:05 | wmf_slave_wikiuser_sleep      | kill 723811                                                                                                        |
| 171974878 | 2019-10-22 13:47:05 | wmf_slave_wikiuser_sleep      | kill 723670                                                                                                        |
| 171974878 | 2019-10-22 15:41:05 | wmf_slave_wikiuser_sleep      | kill 1942043                                                                                                       |
| 171974878 | 2019-10-22 16:04:35 | wmf_slave_wikiuser_sleep      | kill 2190441                                                                                                       |
| 171974878 | 2019-10-22 18:40:03 | wmf_slave_wikiuser_slow (>60) | kill 3921791; SELECT /* SpecialRecentChanges::doMainQuery  */  /*! STRAIGHT_JOIN */ rc_id,rc_timestamp,rc_namespac |
| 171974878 | 2019-10-22 18:40:03 | wmf_slave_wikiuser_slow (>60) | kill 3921791; SELECT /* SpecialRecentChanges::doMainQuery  */  /*! STRAIGHT_JOIN */ rc_id,rc_timestamp,rc_namespac |
| 171974878 | 2019-10-22 18:46:33 | wmf_slave_wikiuser_slow (>60) | kill 3992966; SELECT /* IndexPager::buildQueryInfo (contributions page filtered for namespace or RevisionDeleted e |
| 171974878 | 2019-10-22 20:12:33 | wmf_slave_wikiuser_slow (>60) | kill 4963594; SELECT /* IndexPager::buildQueryInfo (contributions page filtered for namespace or RevisionDeleted e |
| 171974878 | 2019-10-22 23:28:05 | wmf_slave_wikiuser_sleep      | kill 7071901                                                                                                       |
| 171974878 | 2019-10-23 03:32:05 | wmf_slave_wikiuser_sleep      | kill 9481993                                                                                                       |
| 171974878 | 2019-10-23 08:47:05 | wmf_slave_wikiuser_sleep      | kill 12477905                                                                                                      |
| 171974878 | 2019-10-23 08:47:35 | wmf_slave_wikiuser_sleep      | kill 12482807                                                                                                      |
| 171974878 | 2019-10-23 12:35:35 | wmf_slave_wikiuser_sleep      | kill 14826935                                                                                                      |
+-----------+---------------------+-------------------------------+--------------------------------------------------------------------------------------------------------------------+
13 rows in set (0.00 sec)
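
For comparing hosts by these kills, a simple summary over ops.event_log (the table shown above) is enough; this is just a convenience query, not part of the existing tooling.

SELECT event,
       COUNT(*)   AS kills,
       MIN(stamp) AS first_seen,
       MAX(stamp) AS last_seen
FROM ops.event_log
GROUP BY event
ORDER BY kills DESC;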

The first analysis of the slow query log on a non-partitioned enwiki slave, after around 6 hours of logging (1/10 of queries, around 20M queries logged), reports no slow queries for the revision table, which correlates with the tests at T223151#5580185 where a traffic capture was replayed into db1089 (a non-partitioned slave).

There is one query that uses a different query plan on a non-partitioned host than on a partitioned host, but the runtime is just 0.0 vs 0.09, so not a real big deal in this case, and it can also be a matter of table statistics:

db1099:3311

root@db1099.eqiad.wmnet[enwiki]> explain SELECT /* Title::getFirstRevision xx */  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,temp_rev_user.revactor_actor AS `rev_actor`  FROM `revision` IGNORE INDEX (rev_timestamp) JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `revision_actor_temp` `temp_rev_user` ON ((temp_rev_user.revactor_rev = rev_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = temp_rev_user.revactor_actor))   WHERE rev_page = '18762503'  ORDER BY rev_timestamp ASC, rev_id ASC LIMIT 1;
+------+-------------+---------------------+--------+--------------------------------------------------------+----------------+---------+-----------------------------------------------+------+-------------+
| id   | select_type | table               | type   | possible_keys                                          | key            | key_len | ref                                           | rows | Extra       |
+------+-------------+---------------------+--------+--------------------------------------------------------+----------------+---------+-----------------------------------------------+------+-------------+
|    1 | SIMPLE      | revision            | ref    | PRIMARY,page_timestamp,page_user_timestamp,rev_page_id | page_timestamp | 4       | const                                         | 5100 | Using where |
|    1 | SIMPLE      | temp_rev_user       | ref    | PRIMARY,revactor_rev,actor_timestamp                   | PRIMARY        | 4       | enwiki.revision.rev_id                        |    1 | Using index |
|    1 | SIMPLE      | actor_rev_user      | eq_ref | PRIMARY                                                | PRIMARY        | 8       | enwiki.temp_rev_user.revactor_actor           |    1 |             |
|    1 | SIMPLE      | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                 | PRIMARY        | 4       | enwiki.revision.rev_id                        |    1 | Using index |
|    1 | SIMPLE      | comment_rev_comment | eq_ref | PRIMARY                                                | PRIMARY        | 8       | enwiki.temp_rev_comment.revcomment_comment_id |    1 |             |
+------+-------------+---------------------+--------+--------------------------------------------------------+----------------+---------+-----------------------------------------------+------+-------------+
5 rows in set (0.00 sec)

db1089 (non partitioned):

root@db1089.eqiad.wmnet[enwiki]> explain SELECT /* Title::getFirstRevision xx */  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,temp_rev_user.revactor_actor AS `rev_actor`  FROM `revision` IGNORE INDEX (rev_timestamp) JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `revision_actor_temp` `temp_rev_user` ON ((temp_rev_user.revactor_rev = rev_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = temp_rev_user.revactor_actor))   WHERE rev_page = '18762503'  ORDER BY rev_timestamp ASC, rev_id ASC LIMIT 1;
+------+-------------+---------------------+--------+--------------------------------------------------------+---------------------+---------+-----------------------------------------------+-------+-----------------------------+
| id   | select_type | table               | type   | possible_keys                                          | key                 | key_len | ref                                           | rows  | Extra                       |
+------+-------------+---------------------+--------+--------------------------------------------------------+---------------------+---------+-----------------------------------------------+-------+-----------------------------+
|    1 | SIMPLE      | revision            | ref    | PRIMARY,page_timestamp,page_user_timestamp,rev_page_id | page_user_timestamp | 4       | const                                         | 25706 | Using where; Using filesort |
|    1 | SIMPLE      | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                 | PRIMARY             | 4       | enwiki.revision.rev_id                        |     1 | Using index                 |
|    1 | SIMPLE      | comment_rev_comment | eq_ref | PRIMARY                                                | PRIMARY             | 8       | enwiki.temp_rev_comment.revcomment_comment_id |     1 |                             |
|    1 | SIMPLE      | temp_rev_user       | ref    | PRIMARY,revactor_rev,actor_timestamp                   | PRIMARY             | 4       | enwiki.revision.rev_id                        |     1 | Using index                 |
|    1 | SIMPLE      | actor_rev_user      | eq_ref | PRIMARY                                                | PRIMARY             | 8       | enwiki.temp_rev_user.revactor_actor           |     1 |                             |
+------+-------------+---------------------+--------+--------------------------------------------------------+---------------------+---------+-----------------------------------------------+-------+-----------------------------+
5 rows in set (0.00 sec)

To be even more on the safe side, I have replayed all the SELECTs captured in that log for the revision table into db1089, and nothing has shown up as "stuck" or taking a long time to run.

To narrow things down a bit further: of the top 50 slowest queries from all the ones captured, there are just 3 involving the revision table, they actually take 0.03s to run, and the query plan is the same on db1089 and db1105 (partitioned).
So in general, no regression has been observed so far.

I am going to leave the slow query log enabled for a few more hours (probably till tomorrow morning) with a 1/20 sampling rate on db1089.

Thanks for that - I have also been looking into that, and it looks like db1089 performs in a very similar way (according to the kills) or slightly better.

Mentioned in SAL (#wikimedia-operations) [2019-10-23T15:32:11Z] <marostegui> Enable slow query log 1/20 on db1089 (enwiki) T223151

There's no reason for db1089 not to use the other plan; the rev_user column included in the page_user_timestamp index isn't referenced anywhere. It seems like one of those cases where it just chooses a wrong plan for no obvious reason. If it becomes a problem we could always ignore the index or force the right one.
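
For illustration, forcing the good index on a simplified form of that query would look like this (joins and the original IGNORE INDEX (rev_timestamp) dropped for brevity; the page id is the one from the reported query).

SELECT /* Title::getFirstRevision, simplified */ rev_id, rev_page, rev_timestamp
FROM revision FORCE INDEX (page_timestamp)
WHERE rev_page = 18762503
ORDER BY rev_timestamp ASC, rev_id ASC
LIMIT 1;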

Mentioned in SAL (#wikimedia-operations) [2019-10-24T04:59:25Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Remove db1089 from special slaves group and leave it with its original pooling options T223151', diff saved to https://phabricator.wikimedia.org/P9461 and previous config saved to /var/cache/conftool/dbconfig/20191024-045924-marostegui.json

It is one of our favourite cases where some hosts use one index and some others use another:

dbstore1003.eqiad.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	24848	Using where; Using filesort
db2130.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25896	Using where
db2116.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25896	Using where
db2112.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	23624	Using where; Using filesort
db2103.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	21700	Using where; Using filesort
db2102.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	27758	Using where; Using filesort
db2097.codfw.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25896	Using where
db2094.codfw.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25550	Using where
db2092.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	25350	Using where; Using filesort
db2088.codfw.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,rev_page_id,page_user_timestamp	page_timestamp	4	const	5124	Using where
db2085.codfw.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,rev_page_id,page_user_timestamp	page_timestamp	4	const	5025	Using where
db2072.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25896	Using where
db2071.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	25350	Using where; Using filesort
db2048.codfw.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	23924	Using where; Using filesort
db1139.eqiad.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,rev_page_id,page_timestamp,page_user_timestamp	page_user_timestamp	4	const	25614	Using where; Using filesort
db1134.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25896	Using where
db1124.eqiad.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25550	Using where
db1119.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,rev_page_id,page_timestamp,page_user_timestamp	page_timestamp	4	const	27054	Using where
db1118.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,rev_page_id,page_timestamp,page_user_timestamp	page_user_timestamp	4	const	24504	Using where; Using filesort
db1114.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,rev_page_id,page_timestamp,page_user_timestamp	page_timestamp	4	const	24974	Using where
db1106.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,rev_page_id,page_timestamp,page_user_timestamp	page_timestamp	4	const	27216	Using where
db1105.eqiad.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	5214	Using where
db1099.eqiad.wmnet:3311
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	5098	Using where
db1089.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_user_timestamp	4	const	25706	Using where; Using filesort
db1083.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	25896	Using where
db1080.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,page_timestamp,page_user_timestamp,rev_page_id	page_timestamp	4	const	24130	Using where
db1067.eqiad.wmnet:3306
1	SIMPLE	revision	ref	PRIMARY,rev_page_id,page_timestamp,page_user_timestamp	page_timestamp	4	const	25896	Using where

It doesn't really worry me too much, as the query is really fast anyway. I am going to run an analyze table on one of the hosts that use page_user_timestamp and see what happens. If it stays like that, I will create a separate task tagged with mariadb-optimizer-bug just to track it.

Mentioned in SAL (#wikimedia-operations) [2019-10-24T05:18:34Z] <marostegui> Run analyze enwiki.revision on db2092 T223151

I have analyzed all the queries that were logged overnight until around 8AM.
Nothing apart from the already reported query at T223151#5599272 shows up.
So I think everything is looking good with a non-partitioned slave on dewiki and enwiki. I am going on holidays today, so I have removed db1089 from the special group. Once I am back, I am going to pool it again with a higher weight and leave it for some more days, and if all goes well, I will go ahead and start dropping partitions from the revision table across the wikis.

The analyze table on db2092 for revision didn't help with the query plan. I have created a different task to track that T236376: SELECT /* Title::getFirstRevision */ sometimes using page_user_timestamp index instead of page_timestamp
It doesn't require anything at the moment, as it is not causing any issues or blocking the work on this task.

Mentioned in SAL (#wikimedia-operations) [2019-11-14T08:00:39Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Pool a non partitioned slave db1089 on special groups for s1 T223151', diff saved to https://phabricator.wikimedia.org/P9632 and previous config saved to /var/cache/conftool/dbconfig/20191114-080038-marostegui.json

Repooled db1089 (non partitioned slave) into s1 special group

Mentioned in SAL (#wikimedia-operations) [2019-11-14T09:20:07Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Full weight to db1089 on special groups for s1 T223151', diff saved to https://phabricator.wikimedia.org/P9635 and previous config saved to /var/cache/conftool/dbconfig/20191114-092006-marostegui.json

db1089 (non partitioned enwiki) has been serving like this for 8 days already:

"contributions": {
  "db1089": 100,
  "db1099:3311": 100,
  "db1105:3311": 100
},
"logpager": {
  "db1089": 100,
  "db1099:3311": 100,
  "db1105:3311": 100
},
"recentchanges": {
  "db1089": 100,
  "db1099:3311": 100,
  "db1105:3311": 100
},
"recentchangeslinked": {
  "db1089": 100,
  "db1099:3311": 100,
  "db1105:3311": 100
},
"watchlist": {
  "db1089": 100,
  "db1099:3311": 100,
  "db1105:3311": 100

Nothing seems to be arising: no slow queries specific to db1089, no errors, etc.

Before this iteration, db1089 served enwiki the same way for 4 days (21st Oct - 24th Oct) without issues.
We also tested db1130 on dewiki for 7 days (17th Oct - 23rd Oct) - nothing showed up.
The only thing we've seen is tracked at T236376: SELECT /* Title::getFirstRevision */ sometimes using page_user_timestamp index instead of page_timestamp (which is just another case of the optimizer being dumb all over s1, not just db1089).

I think I am going to leave db1089 serving for another week, and if nothing happens, I am in favour of starting to remove partitions from the revision table on db1099:3311 (that means leaving 2 non-partitioned slaves on enwiki (db1089, db1099:3311) and just 1 partitioned one (db1105:3311) - and keep monitoring again for some more time).

Nothing has shown up with db1089, so I believe we should slowly start moving forward with removing the revision table partitions from those wikis where they exist.
I have created T239453 to track/discuss it.
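
For revision the equivalent change would look roughly like the logging one, assuming the special replicas carry a (rev_id, rev_user)-style PK analogous to logging; the exact statements are to be confirmed in T239453.

ALTER TABLE revision REMOVE PARTITIONING;
ALTER TABLE revision DROP PRIMARY KEY, ADD PRIMARY KEY (rev_id);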

Marostegui claimed this task.

I am going to consider this task resolved.
We removed the partitions and changed the PK of the logging table at T233625.
And we are going to do the same for the revision table at T239453.

Please reopen if needed!