Page MenuHomePhabricator

Pages Created queries timing out
Closed, ResolvedPublic

Description

/var/www/vendor/doctrine/dbal/lib/Doctrine/DBAL/Driver/AbstractMySQLDriver.php: 115 - An exception occurred while executing '            (SELECT DISTINCT page_namespace AS namespace, 'rev' AS type, page_title AS page_title,                page_len, page_is_redirect, rev_timestamp AS rev_timestamp,                rev_user, rev_user_text AS username, rev_len, rev_id             FROM `arwiki_p`.`page`            JOIN `arwiki_p`.`revision_userindex` ON page_id = rev_page                        WHERE  rev_user = '314820' AND rev_timestamp > 1  AND rev_parent_id = '0'  AND page_namespace = '0'   AND page_is_redirect = '0'                         )            UNION            (SELECT a.ar_namespace AS namespace, 'arc' AS type, a.ar_title AS page_title,                0 AS page_len, '0' AS page_is_redirect, MIN(a.ar_timestamp) AS rev_timestamp,                a.ar_user AS rev_user, a.ar_user_text AS username, a.ar_len AS rev_len,                a.ar_rev_id AS rev_id             FROM `arwiki_p`.`archive_userindex` a            JOIN            (                SELECT b.ar_namespace, b.ar_title                FROM `arwiki_p`.`archive_userindex` AS b                LEFT JOIN `arwiki_p`.`logging_userindex` ON log_namespace = b.ar_namespace AND log_title = b.ar_title                    AND log_user = b.ar_user AND (log_action = 'move' OR log_action = 'move_redir')                WHERE  ar_user = '314820' AND ar_timestamp > 1  AND b.ar_parent_id = '0'  AND ar_namespace = '0'  AND log_action IS NULL            ) AS c ON c.ar_namespace= a.ar_namespace AND c.ar_title = a.ar_title            GROUP BY a.ar_namespace, a.ar_title            HAVING  rev_user = '314820'             )            ':SQLSTATE[HY000]: General error: 2013 Lost connection to MySQL server during query

URL: http://xtools.wmflabs.org/pages/ar.wikipedia.org/علاء

Please provide any further details here

XTools version: 3.1.12-ab399de

Event Timeline

It's not my first time with this error!

more than 2 times in one day!

capture-20171013-235447.png (530×1 px, 121 KB)

Matthewrbowker moved this task from Backlog to General / other on the XTools board.
Matthewrbowker subscribed.

Yep, looks like we're hitting the query killer.

# 2017-10-13T21:52:44 KILL 7348013 (Query 160 sec) (SELECT DISTINCT page_namespace AS namespace, 'rev' AS type, page_title AS page_title,
                page_len, page_is_redirect, rev_timestamp AS rev_timestamp,
                rev_user, rev_user_text AS username, rev_len, rev_id 
            FROM `arwiki_p`.`page`
            JOIN `arwiki_p`.`revision_userindex` ON page_id = rev_page
            
            WHERE  rev_user = '314820' AND rev_timestamp > 1  AND rev_parent_id = '0'  AND page_namespace = '0'   AND page_is_redirect = '0' 
            
            )

            UNION

            (SELECT a.ar_namespace AS namespace, 'arc' AS type, a.ar_title AS page_title,
                0 AS page_len, '0' AS page_is_redirect, MIN(a.ar_timestamp) AS rev_timestamp,
                a.ar_user AS rev_user, a.ar_user_text AS username, a.ar_len AS rev_len,
                a.ar_rev_id AS rev_id 
            FROM `arwiki_p`.`archive_userindex` a
            JOIN
            (
                SELECT b.ar_namespace, b.ar_title
                FROM `arwiki_p`.`archive_userindex` AS b
                LEFT JOIN `arwiki_p`.`logging_userindex` ON log_namespace = b.ar_namespace AND log_title = b.ar_title
                    AND log_user = b.ar_user AND (log_action = 'move' OR log_action = 'move_redir')
                WHERE  ar_user = '314820' AND ar_timestamp > 1  AND b.ar_parent_id = '0'  AND ar_namespace = '0'  AND log_action IS NULL
            ) AS c ON c.ar_namespace= a.ar_namespace AND c.ar_title = a.ar_title
            GROUP BY a.ar_namespace, a.ar_title
            HAVING  rev_user = '314820' 
            )

However, this page loads successfully for me, execution time 132.686 seconds. We're right at the line, which means further query optimization is necessary here.

It's the latter half (deleted pages) that's slowing it down. I can't remember exactly but I think the inner SELECT makes sure the user wasn't credited with deletions as a result of a page move. This was mostly stolen from the old XTools, so correct me if I'm wrong! It seems like this clause isn't super important, so maybe we can remove it (on the new replicas):

MariaDB [enwiki_p]> SELECT ar_namespace AS namespace, 'arc' AS type, ar_title AS page_title, 
       0 AS page_len, '0' AS page_is_redirect, MIN(ar_timestamp) AS rev_timestamp,
       ar_user AS rev_user, ar_user_text AS username, ar_len AS rev_len,
       ar_rev_id AS rev_id 
FROM `arwiki_p`.`archive_userindex` a
WHERE ar_user = '314820'\G;
*************************** 1. row ***************************
       namespace: 0
            type: arc
      page_title: بالمال_تستطيع_ان_تدخل_الجنة
        page_len: 0
page_is_redirect: 0
   rev_timestamp: 20150124195114
        rev_user: 314820
        username: علاء
         rev_len: 359
          rev_id: 14817206
1 row in set (0.02 sec)

vs.

MariaDB [enwiki_p]> (SELECT a.ar_namespace AS namespace, 'arc' AS type, a.ar_title AS page_title,
    ->                 0 AS page_len, '0' AS page_is_redirect, MIN(a.ar_timestamp) AS rev_timestamp,
    ->                 a.ar_user AS rev_user, a.ar_user_text AS username, a.ar_len AS rev_len,
    ->                 a.ar_rev_id AS rev_id
    ->             FROM `arwiki_p`.`archive_userindex` a
    ->             JOIN
    ->             (
    ->                 SELECT b.ar_namespace, b.ar_title
    ->                 FROM `arwiki_p`.`archive_userindex` AS b
    ->                 LEFT JOIN `arwiki_p`.`logging_userindex` ON log_namespace = b.ar_namespace AND log_title = b.ar_title
    ->                     AND log_user = b.ar_user AND (log_action = 'move' OR log_action = 'move_redir')
    ->                 WHERE  ar_user = '314820' AND ar_timestamp > 1  AND b.ar_parent_id = '0'  AND ar_namespace = '0'  AND log_action IS NULL
    ->             ) AS c ON c.ar_namespace= a.ar_namespace AND c.ar_title = a.ar_title
    ->             GROUP BY a.ar_namespace, a.ar_title
    ->             HAVING  rev_user = '314820'
    ->             );
+-----------+------+--------------+----------+------------------+----------------+----------+----------+---------+----------+
| namespace | type | page_title   | page_len | page_is_redirect | rev_timestamp  | rev_user | username | rev_len | rev_id   |
+-----------+------+--------------+----------+------------------+----------------+----------+----------+---------+----------+
|         0 | arc  | تجربة22      |        0 | 0                | 20170307122534 |   314820 | علاء     |      26 | 22704217 |
+-----------+------+--------------+----------+------------------+----------------+----------+----------+---------+----------+
1 row in set (2 min 46.88 sec)
MusikAnimal renamed this task from 500: Internal Server Error to Pages Created queries timing out.Oct 14 2017, 6:53 PM

I think this does it: fc1b8ec

For User:علاء it runs at about 14 seconds: https://xtools-dev.wmflabs.org/pages/ar.wikipedia.org/%D8%B9%D9%84%D8%A7%D8%A1

From my testing the results are actually improved -- with the old query, some of the deleted pages the user created weren't showing up. The live pages part of the query was left unchanged, so we shouldn't see any different results there.

This does not fix T177677, which happens with users who created a bajillion pages, and evidently the results can't be put into memory and shown on a single page. E.g. the task I merged into this one was for dewiki's User:Informationswiedergutmachung, and that one is still running out of memory.

MusikAnimal moved this task from Pending deployment to Complete on the XTools board.