Page MenuHomePhabricator

MW backlinks indexes need reordering
Closed, ResolvedPublic

Description

Currently there is a particular query mostly from google or msn bot hits where the DB can choose the wrong index resulting in a slow full index scan or filesort:

SELECT /* SpecialWhatLinksHere::showIndirectLinks msnbot-65-52-110-240.search.msn.com */
   page_id, page_namespace, page_title, rd_from
FROM
   (SELECT pl_from, rd_from FROM `pagelinks`
    INNER JOIN `page` ON ((pl_from = page_id))
    LEFT JOIN `redirect` ON ((rd_from = pl_from) AND rd_title = 'Writing_an_article' AND (rd_interwiki = '' OR rd_interwiki IS NULL) AND rd_namespace = '4')
    WHERE pl_namespace = '4' AND pl_title = 'Writing_an_article' ORDER BY pl_from LIMIT 102
) `temp_backlink_range`
INNER JOIN `page` ON ((pl_from = page_id))
ORDER BY page_id LIMIT 51

The pagelinks tables has:

CREATE INDEX /*i*/pl_namespace ON /*_*/pagelinks (pl_namespace,pl_title,pl_from);
CREATE INDEX /*i*/pl_backlinks_namespace ON /*_*/pagelinks (pl_namespace,pl_title,pl_from_namespace,pl_from);

Having pl_namespace first in both indexes isn't that useful and the DB can get confused in edge cases. More flexible to have pl_namespace_from first for pl_backlinks_namespace:

CREATE INDEX /*i*/pl_namespace ON /*_*/pagelinks (pl_namespace,pl_title,pl_from);
CREATE INDEX /*i*/pl_backlinks_namespace ON /*_*/pagelinks (pl_from_namespace,pl_namespace,pl_title,pl_from);

The templatelinks and imagelinks tables have the same general problem. Have incrementally applied this change to S1 and S2 slaves on all three tables over the last ~2 months. Seems fine, so it should be formalized.

Details

Related Changes in Gerrit:

Event Timeline

Springle claimed this task.
Springle raised the priority of this task from to Needs Triage.
Springle updated the task description. (Show Details)
Springle subscribed.
Springle set Security to None.
gerritbot subscribed.

Change 190774 had a related patch set uploaded (by Springle):
reorder backlinks indexes T89630

https://gerrit.wikimedia.org/r/190774

Patch-For-Review

Change 190774 had a related patch set uploaded (by TTO):
Reorder backlinks indexes

https://gerrit.wikimedia.org/r/190774

Patch-For-Review

Change 190774 merged by jenkins-bot:
Reorder backlinks indexes

https://gerrit.wikimedia.org/r/190774

Can this task be closed? Or is it held open pending updating of Wikimedia databases? Is there a task for that?