Page MenuHomePhabricator

Slow running query on cawiki.labsdb; incomplete/missing table stats?
Closed, ResolvedPublic

Description

Reporting a slow-running query that is I suspect due to faulty/missing database table statistics.

tools.tb-dev@tools-bastion-01:~/public_html/RLRL$ mysql -h cawiki.labsdb
MariaDB [(none)]> use p50380g50491__rlrl_cawiki_p;
MariaDB [p50380g50491__rlrl_cawiki_p]>
SELECT sug_orig_ns, sug_orig, sug_new_ns, sug_new
FROM p50380g50491__rlrl_cawiki_p.suggestions s
WHERE NOT EXISTS (
  SELECT 1
  FROM cawiki_p.pagelinks
  INNER JOIN cawiki_p.page ON pl_from = page_id
  WHERE pl_namespace = s.sug_orig_ns
  AND   pl_title = s.sug_orig
  AND   page_namespace in (0, 10, 14)
  AND   page_title NOT LIKE '%/sandbox'
  AND   page_title NOT LIKE '%/testcases' )
AND s.sug_status > 0;

The plan generated for this query is rather odd:

1PRIMARYsALLNULLNULLNULLNULL3599Using where
2MATERIALIZEDpagelinksindexpl_from,pl_namespace,pl_backlinks_namespacepl_namespace265NULL48688676Using index
2MATERIALIZEDpageeq_refPRIMARY,name_title,page_random,page_len,page_redirect_namespace_enPRIMARY4cawiki.pagelinks.pl_from1Using where

Event Timeline

Tb assigned this task to jcrespo.
Tb raised the priority of this task from to Needs Triage.
Tb updated the task description. (Show Details)
Tb added a project: Toolforge.
Tb subscribed.
Restricted Application added a subscriber: Aklapper. · View Herald Transcript

I think it should work now, can you confirm it?

mysql> EXPLAIN SELECT sug_orig_ns, sug_orig, sug_new_ns, sug_new
    -> FROM p50380g50491__rlrl_cawiki_p.suggestions s
    -> WHERE NOT EXISTS (
    ->   SELECT 1
    ->   FROM cawiki_p.pagelinks
    ->   INNER JOIN cawiki_p.page ON pl_from = page_id
    ->   WHERE pl_namespace = s.sug_orig_ns
    ->   AND   pl_title = s.sug_orig
    ->   AND   page_namespace in (0, 10, 14)
    ->   AND   page_title NOT LIKE '%/sandbox'
    ->   AND   page_title NOT LIKE '%/testcases' )
    -> AND s.sug_status > 0\e
    -> \G
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: s
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3599
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: pagelinks
         type: ref
possible_keys: pl_from,pl_namespace,pl_backlinks_namespace
          key: pl_backlinks_namespace
      key_len: 261
          ref: func,func
         rows: 68
        Extra: Using index
*************************** 3. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: page
         type: eq_ref
possible_keys: PRIMARY,name_title,page_random,page_len,page_redirect_namespace_len
          key: PRIMARY
      key_len: 4
          ref: cawiki.pagelinks.pl_from
         rows: 1
        Extra: Using where
3 rows in set (0.01 sec)
jcrespo set Security to None.
jcrespo moved this task from Triage to Done on the DBA board.

Perfect ta. Runtime for the query has reduced from ~840 hours to 0.89 seconds.

~840 hours to 0.89 seconds.

Lol. I will put it on my resume.

Happy to help. Keep reporting issues you find! :-)