This is split off of the parent task: T141961: Improve Special:PageTranslation list.
The goals is to improve performance by exploring what options are available in the database layer, in order to decide what can be done in the front-end.
Special:PageTranslation query
Background
Purpose: provide a list of translatable pages grouped by status with some metadata.
Queried data:
Page table: page_id, page_namespace, page_title, page_latest
revtag table: rt_type, rt_revision
translate_metadata table: various fields
Query plan:
Part 1:
page + revtag, join on page_id=rt_page
Note: Returns two rows per translatable page to fetch MAX(rt_revision) for rt_type in (tp:mark, tp:tag).
Part 2:
translate_metadata: select by message group id
Derived data:
Page status by comparing page_latest and rt_revision (for each of the two possible types)
Challenges
We list all translatable pages, but to do so we fetch 2xcount(translatable pages) rows. We would like to add pagination, but we cannot do it effectively because grouping is done client side based on derived data.
Quick win: Can we return only one row per page?
Ideal solution: Can we move data derivation and grouping to SQL side?
Do we need to remove grouping? Is any kind of efficient filtering possible?
As a side note, translate_metadata table is queried separately. Ideally it would be joined too, but there is no field to join upon. Though in theory group id can be formed as "page-{Prefixed Title}".