Page MenuHomePhabricator

Bot contributions page in Catalan wikipedia not displayed
Closed, ResolvedPublicBUG REPORT

Description

Hello.

When we try to visualize https://ca.wikipedia.org/wiki/Especial:Contribucions/Rebot (2 300 000 editions) or https://ca.wikipedia.org/wiki/Especial:Contribucions/EVA3.0_(bot) (2 700 000 editions), which are the two most active bots in Catalan wikipedia, we get this error:

To avoid creating high database load, this query was aborted because the duration exceeded the limit. If you are reading many items at once, try doing multiple smaller operations instead.

[35b94b32-fe36-48ae-abb5-0c41d1ef192e] 2022-05-01 10:58:49: Excepció fatal del tipus 
 Wikimedia\Rdbms\DBQueryTimeoutError»

It does not happen with other bots or users with less editions (in Catalan wikipedia) and it does not happen with bots with similar number of editions in other wikis, i.e. https://commons.wikimedia.org/wiki/Special:Contributions/SchlurcherBot.

Would it be possible to fix it, please?

Event Timeline

Paucabot renamed this task from Bot contributions page in catalan wikipedia not displayed to Bot contributions page in Catalan wikipedia not displayed.May 1 2022, 11:11 AM
Paucabot updated the task description. (Show Details)

I get same error on arwp with Contributions/JarBot.

Restricted Application added a subscriber: alaa. · View Herald TranscriptMay 3 2022, 8:53 AM

The query isn't using the index for page, and thus performing a full table scan which takes more than 30 seconds, which is the limit we have before killing the query:

root@db1136.eqiad.wmnet[cawiki]> explain SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,rev_actor,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`,ores_damaging_cls.oresc_probability AS `ores_damaging_score`,0.728 AS `ores_damaging_threshold`  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = rev_actor)) JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((actor_rev_user.actor_user != 0) AND (user_id = actor_rev_user.actor_user)) LEFT JOIN `ores_classification` `ores_damaging_cls` ON (ores_damaging_cls.oresc_model = 7 AND (ores_damaging_cls.oresc_rev=rev_id) AND ores_damaging_cls.oresc_class = 1)   WHERE ((rev_actor = 5684)) AND ((rev_deleted & 4) = 0)  ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 51 ;
+------+--------------------+---------------------+--------+-------------------------------------------------------------------------+--------------------------+---------+-----------------------------------------------+---------+---------------------------------+
| id   | select_type        | table               | type   | possible_keys                                                           | key                      | key_len | ref                                           | rows    | Extra                           |
+------+--------------------+---------------------+--------+-------------------------------------------------------------------------+--------------------------+---------+-----------------------------------------------+---------+---------------------------------+
|    1 | PRIMARY            | actor_rev_user      | const  | PRIMARY                                                                 | PRIMARY                  | 8       | const                                         | 1       | Using temporary; Using filesort |
|    1 | PRIMARY            | user                | const  | PRIMARY                                                                 | PRIMARY                  | 4       | const                                         | 1       |                                 |
|    1 | PRIMARY            | page                | ALL    | PRIMARY                                                                 | NULL                     | NULL    | NULL                                          | 1752576 |                                 |
|    1 | PRIMARY            | revision            | ref    | PRIMARY,rev_actor_timestamp,rev_page_actor_timestamp,rev_page_timestamp | rev_page_actor_timestamp | 12      | cawiki.page.page_id,const                     | 1       | Using where                     |
|    1 | PRIMARY            | ores_damaging_cls   | eq_ref | oresc_rev_model_class,oresc_model_class_prob                            | oresc_rev_model_class    | 7       | cawiki.revision.rev_id,const,const            | 1       |                                 |
|    1 | PRIMARY            | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                                  | PRIMARY                  | 4       | cawiki.revision.rev_id                        | 1       | Using index                     |
|    1 | PRIMARY            | comment_rev_comment | eq_ref | PRIMARY                                                                 | PRIMARY                  | 8       | cawiki.temp_rev_comment.revcomment_comment_id | 1       |                                 |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | ct_rev_tag_id,ct_tag_id_id                                              | ct_rev_tag_id            | 5       | cawiki.revision.rev_id                        | 1       | Using index                     |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                                 | PRIMARY                  | 4       | cawiki.change_tag.ct_tag_id                   | 1       |                                 |
+------+--------------------+---------------------+--------+-------------------------------------------------------------------------+--------------------------+---------+-----------------------------------------------+---------+---------------------------------+
9 rows in set (0.002 sec)

The query behaves exactly the same on all s7 hosts, so it is not just a host.

The show explain matches the explain:

mysql:root@localhost [(none)]> show explain for 20762571;
+------+--------------------+---------------------+--------+-------------------------------------------------------------------------+--------------------------+---------+-----------------------------------------------+---------+---------------------------------+
| id   | select_type        | table               | type   | possible_keys                                                           | key                      | key_len | ref                                           | rows    | Extra                           |
+------+--------------------+---------------------+--------+-------------------------------------------------------------------------+--------------------------+---------+-----------------------------------------------+---------+---------------------------------+
|    1 | PRIMARY            | actor_rev_user      | const  | PRIMARY                                                                 | PRIMARY                  | 8       | const                                         |       1 | Using temporary; Using filesort |
|    1 | PRIMARY            | user                | const  | PRIMARY                                                                 | PRIMARY                  | 4       | const                                         |       1 |                                 |
|    1 | PRIMARY            | page                | ALL    | PRIMARY                                                                 | NULL                     | NULL    | NULL                                          | 1633066 |                                 |
|    1 | PRIMARY            | revision            | ref    | PRIMARY,rev_actor_timestamp,rev_page_actor_timestamp,rev_page_timestamp | rev_page_actor_timestamp | 12      | cawiki.page.page_id,const                     |       1 | Using where                     |
|    1 | PRIMARY            | ores_damaging_cls   | eq_ref | oresc_rev_model_class,oresc_model_class_prob                            | oresc_rev_model_class    | 7       | cawiki.revision.rev_id,const,const            |       1 |                                 |
|    1 | PRIMARY            | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                                  | PRIMARY                  | 4       | cawiki.revision.rev_id                        |       1 | Using index                     |
|    1 | PRIMARY            | comment_rev_comment | eq_ref | PRIMARY                                                                 | PRIMARY                  | 8       | cawiki.temp_rev_comment.revcomment_comment_id |       1 |                                 |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | ct_rev_tag_id,ct_tag_id_id                                              | ct_rev_tag_id            | 5       | cawiki.revision.rev_id                        |       1 | Using index                     |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                                 | PRIMARY                  | 4       | cawiki.change_tag.ct_tag_id                   |       1 |                                 |
+------+--------------------+---------------------+--------+-------------------------------------------------------------------------+--------------------------+---------+-----------------------------------------------+---------+---------------------------------+
9 rows in set, 1 warning (0.001 sec)

The query takes around 90 seconds to run, so even if SET STATEMENT max_statement_time=30 wasn't set, it would have been killed by the query killer anyways, as it kills queries that take longer than 60 seconds to run.

Forcing page (PRIMARY) doesn't really make any difference.

In general we're always going to have these problems, we might fix this query, but for bots or users with a large number of contributions, we will end up having issues like this. Probably we should try to split this query into smaller but faster ones, rather than one but large. Platform Engineering is that something doable within MW? Also @Ladsgroup thoughts?

So in general Special:Contributions with a mix of complex filters can lead to slow queries and timeouts but the weird case here is that Special:Contributions without any filters must be supported and it even has a dedicated index (INDEX rev_actor_timestamp (rev_actor, rev_timestamp, rev_id)), picking up page table first and then querying with revision index of rev_page_actor_timestamp is the wrong action from the query planner. It can be that due to populating rev_actor, it doesn't have the correct statistics and an optimize table revision would hopefully fix it. If not, I check what could have caused it. In that case, probably making the join on page LEFT JOIN and adding page_id IS NOT NULL to conditions might hint to the planner it shouldn't pick page table first. Or straight join. Will see.

Forcing the right index makes it respond with 0.04 seconds:

MariaDB [cawiki]> SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,rev_actor,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`,ores_damaging_cls.oresc_probability AS `ores_damaging_score`,0.728 AS `ores_damaging_threshold`  FROM `revision` USE INDEX (rev_actor_timestamp) JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = rev_actor)) JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((actor_rev_user.actor_user != 0) AND (user_id = actor_rev_user.actor_user)) LEFT JOIN `ores_classification` `ores_damaging_cls` ON (ores_damaging_cls.oresc_model = 7 AND (ores_damaging_cls.oresc_rev=rev_id) AND ores_damaging_cls.oresc_class = 1)   WHERE ((rev_actor = 5684)) AND ((rev_deleted & 4) = 0)  ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 51 ;

FWIW, I tried optimize table on cawiki in a codfw db and it didn't fix the problem

Probably the only solution is to force it in code. Cause even if the stats refresh worked, it could stop working anytime really

Yeah, I'll patch it tomorrow. It wouldn't work in wikis that read from the temp table but we stopped reading from that table (I think that's why this bug started in the first place).

Change 792213 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] ContribsPager: Update index hint to use revision table instead of temp

https://gerrit.wikimedia.org/r/792213

Change 792474 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@wmf/1.39.0-wmf.10] ContribsPager: Update index hint to use revision table in READ NEW

https://gerrit.wikimedia.org/r/792474

Change 792475 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@wmf/1.39.0-wmf.12] ContribsPager: Update index hint to use revision table in READ NEW

https://gerrit.wikimedia.org/r/792475

Change 792213 merged by jenkins-bot:

[mediawiki/core@master] ContribsPager: Update index hint to use revision table in READ NEW

https://gerrit.wikimedia.org/r/792213

Change 792475 merged by jenkins-bot:

[mediawiki/core@wmf/1.39.0-wmf.12] ContribsPager: Update index hint to use revision table in READ NEW

https://gerrit.wikimedia.org/r/792475

Change 792474 merged by jenkins-bot:

[mediawiki/core@wmf/1.39.0-wmf.10] ContribsPager: Update index hint to use revision table in READ NEW

https://gerrit.wikimedia.org/r/792474

Mentioned in SAL (#wikimedia-operations) [2022-05-17T08:54:31Z] <ladsgroup@deploy1002> Synchronized php-1.39.0-wmf.12/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792475|ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 56s)

Ladsgroup claimed this task.
Ladsgroup moved this task from Triage to Done on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2022-05-17T08:59:04Z] <ladsgroup@deploy1002> Synchronized php-1.39.0-wmf.10/includes/specials/pagers/ContribsPager.php: Backport: [[gerrit:792474|ContribsPager: Update index hint to use revision table in READ NEW (T307295)]] (duration: 00m 53s)