Page MenuHomePhabricator

API problem with usercontribs using `rev_user_text` rather than `rev_user`: Only use 'contributions' replica if querying by user ID
Closed, ResolvedPublic

Event Timeline

Kizule subscribed.
This comment was removed by Kizule.

Uhh I see now, I can reproduce this.

Anomie moved this task from Unsorted to Needs details or plan on the MediaWiki-Action-API board.
Anomie subscribed.

TL;DR: The thing to do here is probably to stop using the 'contributions' replica group in ApiQueryUserContribs, at least when we're not using rev_user to specify the user.

The query here is

SELECT /* ApiQueryUserContribs::execute  */  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page))   WHERE ((rev_user_text = 'Paweł Ziemian BOT')) AND ((rev_deleted & 4) != 4)  ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11

It'll be running in the 'contributions' group, so either db1103:3312 or db1105:3312 on plwiki. That's probably the problem, since it's using rev_user_text rather than rev_user so it doesn't mesh with the partitioning. For some reason this user makes it choose a poor plan:

wikiadmin@10.64.0.164(plwiki)> explain SELECT /* ApiQueryUserContribs::execute  */  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page))   WHERE ((rev_user_text = 'Paweł Ziemian BOT')) AND ((rev_deleted & 4) != 4)  ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11;
+------+-------------+---------------------+--------+---------------------------------------------------------------------------+----------------+---------+-----------------------------------------------+---------+---------------------------------+
| id   | select_type | table               | type   | possible_keys                                                             | key            | key_len | ref                                           | rows    | Extra                           |
+------+-------------+---------------------+--------+---------------------------------------------------------------------------+----------------+---------+-----------------------------------------------+---------+---------------------------------+
|    1 | SIMPLE      | page                | ALL    | PRIMARY                                                                   | NULL           | NULL    | NULL                                          | 3093051 | Using temporary; Using filesort |
|    1 | SIMPLE      | revision            | ref    | PRIMARY,page_timestamp,usertext_timestamp,page_user_timestamp,rev_page_id | page_timestamp | 4       | plwiki.page.page_id                           |       1 | Using where                     |
|    1 | SIMPLE      | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                                    | PRIMARY        | 4       | plwiki.revision.rev_id                        |       1 | Using index                     |
|    1 | SIMPLE      | comment_rev_comment | eq_ref | PRIMARY                                                                   | PRIMARY        | 8       | plwiki.temp_rev_comment.revcomment_comment_id |       1 |                                 |
+------+-------------+---------------------+--------+---------------------------------------------------------------------------+----------------+---------+-----------------------------------------------+---------+---------------------------------+

While on a normal API replica it uses the plan I'd expect:

wikiadmin@10.64.0.204(plwiki)> explain SELECT /* ApiQueryUserContribs::execute  */  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page))   WHERE ((rev_user_text = 'Paweł Ziemian BOT')) AND ((rev_deleted & 4) != 4)  ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 11;
+------+-------------+---------------------+--------+---------------------------------------------------------------------------+--------------------+---------+-----------------------------------------------+---------+-------------+
| id   | select_type | table               | type   | possible_keys                                                             | key                | key_len | ref                                           | rows    | Extra       |
+------+-------------+---------------------+--------+---------------------------------------------------------------------------+--------------------+---------+-----------------------------------------------+---------+-------------+
|    1 | SIMPLE      | revision            | ref    | PRIMARY,page_timestamp,usertext_timestamp,page_user_timestamp,rev_page_id | usertext_timestamp | 257     | const                                         | 1722194 | Using where |
|    1 | SIMPLE      | page                | eq_ref | PRIMARY                                                                   | PRIMARY            | 4       | plwiki.revision.rev_page                      |       1 |             |
|    1 | SIMPLE      | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                                    | PRIMARY            | 4       | plwiki.revision.rev_id                        |       1 | Using index |
|    1 | SIMPLE      | comment_rev_comment | eq_ref | PRIMARY                                                                   | PRIMARY            | 8       | plwiki.temp_rev_comment.revcomment_comment_id |       1 |             |
+------+-------------+---------------------+--------+---------------------------------------------------------------------------+--------------------+---------+-----------------------------------------------+---------+-------------+

BTW, in the near-ish future it'll be using revcomment_actor instead:

SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,temp_rev_user.revactor_actor AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `revision_actor_temp` `temp_rev_user` ON ((temp_rev_user.revactor_rev = rev_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = temp_rev_user.revactor_actor)) INNER JOIN `page` ON ((page_id = rev_page))   WHERE ((temp_rev_user.revactor_actor = '1676')) AND ((rev_deleted & 4) != 4)  ORDER BY revactor_timestamp DESC,revactor_rev DESC LIMIT 11

For this query it does the same (good) plan on both the 'api' and 'contributions' groups:

wikiadmin@10.64.0.204(plwiki)> explain SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,temp_rev_user.revactor_actor AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `revision_actor_temp` `temp_rev_user` ON ((temp_rev_user.revactor_rev = rev_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = temp_rev_user.revactor_actor)) INNER JOIN `page` ON ((page_id = rev_page))   WHERE ((temp_rev_user.revactor_actor = '1676')) AND ((rev_deleted & 4) != 4)  ORDER BY revactor_timestamp DESC,revactor_rev DESC LIMIT 11;
+------+-------------+---------------------+--------+--------------------------------------------------------+-----------------+---------+-----------------------------------------------+---------+--------------------------+
| id   | select_type | table               | type   | possible_keys                                          | key             | key_len | ref                                           | rows    | Extra                    |
+------+-------------+---------------------+--------+--------------------------------------------------------+-----------------+---------+-----------------------------------------------+---------+--------------------------+
|    1 | SIMPLE      | actor_rev_user      | const  | PRIMARY                                                | PRIMARY         | 8       | const                                         |       1 |                          |
|    1 | SIMPLE      | temp_rev_user       | ref    | PRIMARY,revactor_rev,actor_timestamp                   | actor_timestamp | 8       | const                                         | 1718122 | Using where; Using index |
|    1 | SIMPLE      | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                 | PRIMARY         | 4       | plwiki.temp_rev_user.revactor_rev             |       1 | Using index              |
|    1 | SIMPLE      | comment_rev_comment | eq_ref | PRIMARY                                                | PRIMARY         | 8       | plwiki.temp_rev_comment.revcomment_comment_id |       1 |                          |
|    1 | SIMPLE      | revision            | eq_ref | PRIMARY,page_timestamp,page_user_timestamp,rev_page_id | PRIMARY         | 4       | plwiki.temp_rev_user.revactor_rev             |       1 | Using where              |
|    1 | SIMPLE      | page                | eq_ref | PRIMARY                                                | PRIMARY         | 4       | plwiki.revision.rev_page                      |       1 |                          |
+------+-------------+---------------------+--------+--------------------------------------------------------+-----------------+---------+-----------------------------------------------+---------+--------------------------+

Although I suppose there may be other users where it decides to do a poor plan.

Speaking of the partitioning on the 'contributions' group, I wonder whether we'll start running into similar issues in the other things using that group once we flip the actor migration switch so queries that are using rev_user start using revactor_actor.

For s2 we can probably decrease the main traffic weight for the rc replicas (db1103 and db1105) as the other hosts I think will have no problem to assume the traffic, but this is another case where "special" slaves are a snowflake and bite us :-(

Marostegui moved this task from Triage to In progress on the DBA board.

Change 491901 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Not use db1103,5:3312 on main traffic

https://gerrit.wikimedia.org/r/491901

This is bad, I mentioned before that it was already bad to have special slaves, but it is even worse if the special slaves also cannot handle "main" traffic.

Change 491993 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] ApiQueryUserContribs: Only use 'contributions' replica if querying by user ID

https://gerrit.wikimedia.org/r/491993

Change 491993 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] ApiQueryUserContribs: Only use 'contributions' replica if querying by user ID

https://gerrit.wikimedia.org/r/491993

Thanks for that @Anomie!
I have a question (let me know if you'd like me to ask this on the gerrit patch or here is fine).
What if those replicas are not available? will that fallback to any other available replica or will just thrown an error?
It is unlikely that we don't have any contributors replica available, but I thought I would ask :-)

Here is good.

Looking at the code, it looks like it does fall back to the "main" group if there aren't any usable replicas in the specified group.

Thanks for checking although now that I think about it, it is pretty much the same thing, it will timeout anyways (as we have seen) :-)

Aklapper renamed this task from API problem with usercontribs to API problem with usercontribs using `rev_user_text` rather than `rev_user`: Only use 'contributions' replica if querying by user ID.Feb 22 2019, 2:59 AM

Change 491993 merged by jenkins-bot:
[mediawiki/core@master] ApiQueryUserContribs: Only use 'contributions' replica if querying by user ID

https://gerrit.wikimedia.org/r/491993

Change 491901 abandoned by Marostegui:
db-eqiad.php: Not use db1103,5:3312 on main traffic

Reason:
Better approach: https://gerrit.wikimedia.org/r/491993

https://gerrit.wikimedia.org/r/491901

Anomie claimed this task.

The fix should be deployed to Wikimedia wikis with 1.33.0-wmf.19, which for plwiki is scheduled for 2019-02-28. If anyone wants it out sooner, they can follow the instructions at https://wikitech.wikimedia.org/wiki/SWAT_deploys#How_to_submit_a_patch_for_SWAT to backport the patch to 1.33.0-wmf.18.