Page MenuHomePhabricator

Optimize querying the page table by namespace
Open, Needs TriagePublic

Description

In analytical queries, it is common to want to restrict the results to edits in pages of a particular namespace.

For instance, take the following query which aims to find users with most edits in the Template namespace (so we can identify new rising stars and ask them to help with other templates as well):

select actor_name
from revision_userindex
join actor_revision
  on rev_actor = actor_id
join page
  on rev_page = page_id
where
  page_namespace = 10
  and rev_timestamp > '20200401000000'
group by actor_id
order by count(*) desc

EXPLAIN results show that this query uses the existing name_title index (which is on page_namespace, page_title) but still ends up using filesort which is undesirable.

Is it because the index is not just on page_namespace? Would having an index just on that column be beneficial?

Other advice on how to further enhance the query is appreciated as well.