In T349653 we added support for finding an archived thread by limiting our search to headings which have appeared in the current page's history. However in production it would take too long to back-fill our database with an index of all old revisions, so we have limited ourselves to current revisions. With this restriction we will need to perform the search in other ways.
Description
Details
| Subject | Repo | Branch | Lines +/- | |
|---|---|---|---|---|
| Add more search methods to findNewestRevisionsByHeading | mediawiki/extensions/DiscussionTools | master | +81 -11 |
Related Objects
| Status | Subtype | Assigned | Task | ||
|---|---|---|---|---|---|
| Open | None | T356576 Optimize queries in ThreadItemStore::find*() methods | |||
| Open | None | T356575 Add tests for ThreadItemStore::find*() methods | |||
| Resolved | Esanders | T356276 Permalink redirecting: Support plain section links on wikis which haven't indexed their full history |
- Mentioned In
- T356063: Deploy talk page permalinks to all wikis except en.wiki
T349653: Permalink redirecting: Support plain section links - Mentioned Here
- T356575: Add tests for ThreadItemStore::find*() methods
T356576: Optimize queries in ThreadItemStore::find*() methods
P56169 T356276 queries
T349653: Permalink redirecting: Support plain section links
Event Timeline
Change 994746 had a related patch set uploaded (by Esanders; author: Esanders):
[mediawiki/extensions/DiscussionTools@master] Add more search methods to findNewestRevisionsByHeading
Change 994746 merged by jenkins-bot:
[mediawiki/extensions/DiscussionTools@master] Add more search methods to findNewestRevisionsByHeading
The code generates some ugly-looking SQL queries, so I'll post them here for review and future reference.
Although they look scary, I think they're good enough both in theory (queries are using indexes, no filesorts) and in practice (I ran them on the enwiki analytics replicas and they took ~0.1 s, even when returning no results, which should be the worst case).
But there are some inefficiencies – we query several tables multiple times to fetch the itr_parent__itid_itemid field which we don't use, and we also query them again in the big itid_itemid IN … subquery at the end which could be merged with the main query. It seems like the new SelectQueryBuilder makes it too easy to build up queries and has allowed us to catch a mild case of ORM-itis. We should try to avoid that, and hopefully it doesn't make the code generating these queries too messy with special cases.
I filed you some follow-up work :)
- T356575: Add tests for ThreadItemStore::find*() methods
- T356576: Optimize queries in ThreadItemStore::find*() methods
(The first query has already been like this even before the changes, and it's mostly my fault, so I think this doesn't need to block this patch.)
(By the way, I formatted the SQL code below with https://nene.github.io/prettier-sql-playground/, it's the best SQL code formatter I've found.)
| 1 | SELECT |
|---|---|
| 2 | itr_id, |
| 3 | it_itemname, |
| 4 | it_timestamp, |
| 5 | it_actor, |
| 6 | itid_itemid, |
| 7 | itr_parent_id, |
| 8 | itr_transcludedfrom, |
| 9 | itr_level, |
| 10 | itr_headinglevel, |
| 11 | itr_revision_id, |
| 12 | page_id, |
| 13 | page_namespace, |
| 14 | page_title, |
| 15 | page_is_redirect, |
| 16 | page_is_new, |
| 17 | page_touched, |
| 18 | page_links_updated, |
| 19 | page_latest, |
| 20 | page_len, |
| 21 | page_content_model, |
| 22 | page_lang, |
| 23 | actor_id, |
| 24 | actor_name, |
| 25 | actor_user, |
| 26 | itr_parent__itid_itemid |
| 27 | FROM |
| 28 | `discussiontools_items` |
| 29 | JOIN `discussiontools_item_pages` ON (itp_items_id = it_id) |
| 30 | JOIN `discussiontools_item_revisions` |
| 31 | ON ((itr_items_id = it_id) AND (itr_revision_id = itp_newest_revision_id)) |
| 32 | JOIN `discussiontools_item_ids` ON (itid_id = itr_itemid_id) |
| 33 | LEFT JOIN `page` ON (page_id = itr_transcludedfrom) |
| 34 | LEFT JOIN `actor` ON (actor_id = it_actor) |
| 35 | LEFT JOIN ( |
| 36 | SELECT |
| 37 | itr_id AS `itr_parent__itr_id`, |
| 38 | itid_itemid AS `itr_parent__itid_itemid` |
| 39 | FROM |
| 40 | `discussiontools_items` |
| 41 | JOIN `discussiontools_item_pages` ON (itp_items_id = it_id) |
| 42 | JOIN `discussiontools_item_revisions` |
| 43 | ON ( |
| 44 | (itr_items_id = it_id) |
| 45 | AND (itr_revision_id = itp_newest_revision_id) |
| 46 | ) |
| 47 | JOIN `discussiontools_item_ids` ON (itid_id = itr_itemid_id) |
| 48 | ) AS `sqb1` |
| 49 | ON (itr_parent_id = itr_parent__itr_id) |
| 50 | WHERE |
| 51 | ( |
| 52 | itid_itemid IN ( |
| 53 | SELECT itid_itemid |
| 54 | FROM |
| 55 | `discussiontools_items` |
| 56 | JOIN `discussiontools_item_pages` ON (itp_items_id = it_id) |
| 57 | JOIN `discussiontools_item_revisions` |
| 58 | ON ( |
| 59 | (itr_items_id = it_id) |
| 60 | AND (itr_revision_id = itp_newest_revision_id) |
| 61 | ) |
| 62 | JOIN `discussiontools_item_ids` ON (itid_id = itr_itemid_id) |
| 63 | JOIN `revision` ON (rev_id = itr_revision_id) |
| 64 | WHERE (itid_itemid LIKE 'h-B-%' ESCAPE '`') AND rev_page = 1940 |
| 65 | ) |
| 66 | ) |
| 67 | LIMIT 50; |
| 68 | |
| 69 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+----------------------------------------------------------------------------------------------------------+------+------------------------------+ |
| 70 | | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | |
| 71 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+----------------------------------------------------------------------------------------------------------+------+------------------------------+ |
| 72 | | 1 | PRIMARY | revision | ref | PRIMARY,rev_page_actor_timestamp,rev_page_timestamp | rev_page_timestamp | 4 | const | 153 | Using index; Start temporary | |
| 73 | | 1 | PRIMARY | discussiontools_item_revisions | ref | itr_itemid_id_revision_id,itr_revision_id | itr_revision_id | 8 | enwiki.revision.rev_id | 7 | Using index condition | |
| 74 | | 1 | PRIMARY | discussiontools_items | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_items_id | 1 | Using where; Using index | |
| 75 | | 1 | PRIMARY | discussiontools_item_pages | eq_ref | itp_items_id_page_id,itp_items_id_newest_revision_id | itp_items_id_newest_revision_id | 12 | enwiki.discussiontools_items.it_id,enwiki.discussiontools_item_revisions.itr_revision_id | 1 | Using index | |
| 76 | | 1 | PRIMARY | discussiontools_item_ids | eq_ref | PRIMARY,itid_itemid | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_itemid_id | 1 | Using where | |
| 77 | | 1 | PRIMARY | discussiontools_item_ids | eq_ref | PRIMARY,itid_itemid | itid_itemid | 257 | enwiki.discussiontools_item_ids.itid_itemid | 1 | Using index; End temporary | |
| 78 | | 1 | PRIMARY | discussiontools_item_revisions | ref | itr_itemid_id_revision_id,itr_revision_id | itr_itemid_id_revision_id | 4 | enwiki.discussiontools_item_ids.itid_id | 1 | Using index condition | |
| 79 | | 1 | PRIMARY | discussiontools_items | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_items_id | 1 | Using where | |
| 80 | | 1 | PRIMARY | discussiontools_item_pages | eq_ref | itp_items_id_page_id,itp_items_id_newest_revision_id | itp_items_id_newest_revision_id | 12 | enwiki.discussiontools_item_revisions.itr_items_id,enwiki.discussiontools_item_revisions.itr_revision_id | 1 | Using where; Using index | |
| 81 | | 1 | PRIMARY | page | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_transcludedfrom | 1 | Using where | |
| 82 | | 1 | PRIMARY | actor | eq_ref | PRIMARY | PRIMARY | 8 | enwiki.discussiontools_items.it_actor | 1 | Using where | |
| 83 | | 1 | PRIMARY | discussiontools_item_revisions | eq_ref | PRIMARY,itr_itemid_id_revision_id,itr_revision_id | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_parent_id | 1 | Using where | |
| 84 | | 1 | PRIMARY | discussiontools_items | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_items_id | 1 | Using where; Using index | |
| 85 | | 1 | PRIMARY | discussiontools_item_pages | eq_ref | itp_items_id_page_id,itp_items_id_newest_revision_id | itp_items_id_newest_revision_id | 12 | enwiki.discussiontools_items.it_id,enwiki.discussiontools_item_revisions.itr_revision_id | 1 | Using index | |
| 86 | | 1 | PRIMARY | discussiontools_item_ids | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_itemid_id | 1 | Using where | |
| 87 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+----------------------------------------------------------------------------------------------------------+------+------------------------------+ |
| 88 | |
| 89 | |
| 90 | SELECT |
| 91 | itr_id, |
| 92 | it_itemname, |
| 93 | it_timestamp, |
| 94 | it_actor, |
| 95 | itid_itemid, |
| 96 | itr_parent_id, |
| 97 | itr_transcludedfrom, |
| 98 | itr_level, |
| 99 | itr_headinglevel, |
| 100 | itr_revision_id, |
| 101 | page_id, |
| 102 | page_namespace, |
| 103 | page_title, |
| 104 | page_is_redirect, |
| 105 | page_is_new, |
| 106 | page_touched, |
| 107 | page_links_updated, |
| 108 | page_latest, |
| 109 | page_len, |
| 110 | page_content_model, |
| 111 | page_lang, |
| 112 | actor_id, |
| 113 | actor_name, |
| 114 | actor_user, |
| 115 | itr_parent__itid_itemid |
| 116 | FROM |
| 117 | `discussiontools_items` |
| 118 | JOIN `discussiontools_item_pages` ON (itp_items_id = it_id) |
| 119 | JOIN `discussiontools_item_revisions` |
| 120 | ON ((itr_items_id = it_id) AND (itr_revision_id = itp_newest_revision_id)) |
| 121 | JOIN `discussiontools_item_ids` ON (itid_id = itr_itemid_id) |
| 122 | LEFT JOIN `page` ON (page_id = itr_transcludedfrom) |
| 123 | LEFT JOIN `actor` ON (actor_id = it_actor) |
| 124 | LEFT JOIN ( |
| 125 | SELECT |
| 126 | itr_id AS `itr_parent__itr_id`, |
| 127 | itid_itemid AS `itr_parent__itid_itemid` |
| 128 | FROM |
| 129 | `discussiontools_items` |
| 130 | JOIN `discussiontools_item_pages` ON (itp_items_id = it_id) |
| 131 | JOIN `discussiontools_item_revisions` |
| 132 | ON ( |
| 133 | (itr_items_id = it_id) |
| 134 | AND (itr_revision_id = itp_newest_revision_id) |
| 135 | ) |
| 136 | JOIN `discussiontools_item_ids` ON (itid_id = itr_itemid_id) |
| 137 | ) AS `sqb1` |
| 138 | ON (itr_parent_id = itr_parent__itr_id) |
| 139 | WHERE |
| 140 | ( |
| 141 | itid_itemid IN ( |
| 142 | SELECT itid_itemid |
| 143 | FROM |
| 144 | `discussiontools_items` |
| 145 | JOIN `discussiontools_item_pages` ON (itp_items_id = it_id) |
| 146 | JOIN `discussiontools_item_revisions` |
| 147 | ON ( |
| 148 | (itr_items_id = it_id) |
| 149 | AND (itr_revision_id = itp_newest_revision_id) |
| 150 | ) |
| 151 | JOIN `discussiontools_item_ids` ON (itid_id = itr_itemid_id) |
| 152 | JOIN `page` ON (page_id = itp_page_id) |
| 153 | WHERE |
| 154 | (itid_itemid LIKE 'h-B-%' ESCAPE '`') |
| 155 | AND (page_title LIKE 'ThreadItemStore2/%' ESCAPE '`') |
| 156 | AND page_namespace = 1 |
| 157 | ) |
| 158 | ) |
| 159 | LIMIT 50; |
| 160 | |
| 161 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+------------------------------------------------------------------------------------------+------+--------------------------------------------------------------+ |
| 162 | | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | |
| 163 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+------------------------------------------------------------------------------------------+------+--------------------------------------------------------------+ |
| 164 | | 1 | PRIMARY | page | range | PRIMARY,page_name_title | page_name_title | 261 | NULL | 1 | Using where; Using index; Start temporary | |
| 165 | | 1 | PRIMARY | discussiontools_item_ids | range | PRIMARY,itid_itemid | itid_itemid | 257 | NULL | 5296 | Using where; Using index; Using join buffer (flat, BNL join) | |
| 166 | | 1 | PRIMARY | discussiontools_item_ids | eq_ref | PRIMARY,itid_itemid | itid_itemid | 257 | enwiki.discussiontools_item_ids.itid_itemid | 1 | Using index | |
| 167 | | 1 | PRIMARY | discussiontools_item_revisions | ref | itr_itemid_id_revision_id,itr_revision_id | itr_itemid_id_revision_id | 4 | enwiki.discussiontools_item_ids.itid_id | 1 | Using index condition | |
| 168 | | 1 | PRIMARY | discussiontools_items | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_items_id | 1 | Using where; Using index | |
| 169 | | 1 | PRIMARY | discussiontools_item_pages | eq_ref | itp_items_id_page_id,itp_items_id_newest_revision_id | itp_items_id_page_id | 12 | enwiki.discussiontools_items.it_id,enwiki.page.page_id | 1 | Using index condition; Using where; End temporary | |
| 170 | | 1 | PRIMARY | discussiontools_item_revisions | ref | itr_itemid_id_revision_id,itr_revision_id | itr_itemid_id_revision_id | 4 | enwiki.discussiontools_item_ids.itid_id | 1 | Using index condition | |
| 171 | | 1 | PRIMARY | discussiontools_items | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_items_id | 1 | Using where | |
| 172 | | 1 | PRIMARY | discussiontools_item_pages | eq_ref | itp_items_id_page_id,itp_items_id_newest_revision_id | itp_items_id_newest_revision_id | 12 | enwiki.discussiontools_items.it_id,enwiki.discussiontools_item_revisions.itr_revision_id | 1 | Using index | |
| 173 | | 1 | PRIMARY | page | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_transcludedfrom | 1 | Using where | |
| 174 | | 1 | PRIMARY | actor | eq_ref | PRIMARY | PRIMARY | 8 | enwiki.discussiontools_items.it_actor | 1 | Using where | |
| 175 | | 1 | PRIMARY | discussiontools_item_revisions | eq_ref | PRIMARY,itr_itemid_id_revision_id,itr_revision_id | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_parent_id | 1 | Using where | |
| 176 | | 1 | PRIMARY | discussiontools_items | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_items_id | 1 | Using where; Using index | |
| 177 | | 1 | PRIMARY | discussiontools_item_pages | eq_ref | itp_items_id_page_id,itp_items_id_newest_revision_id | itp_items_id_newest_revision_id | 12 | enwiki.discussiontools_items.it_id,enwiki.discussiontools_item_revisions.itr_revision_id | 1 | Using index | |
| 178 | | 1 | PRIMARY | discussiontools_item_ids | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_itemid_id | 1 | Using where | |
| 179 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+------------------------------------------------------------------------------------------+------+--------------------------------------------------------------+ |
| 180 | |
| 181 | |
| 182 | SELECT itid_itemid AS `value` |
| 183 | FROM |
| 184 | `discussiontools_items` |
| 185 | JOIN `discussiontools_item_pages` ON (itp_items_id = it_id) |
| 186 | JOIN `discussiontools_item_revisions` |
| 187 | ON ((itr_items_id = it_id) AND (itr_revision_id = itp_newest_revision_id)) |
| 188 | JOIN `discussiontools_item_ids` ON (itid_id = itr_itemid_id) |
| 189 | JOIN `page` ON ((page_id = itp_page_id) AND (page_latest = itr_revision_id)) |
| 190 | WHERE (itid_itemid LIKE 'h-B-%' ESCAPE '`') |
| 191 | LIMIT 2; |
| 192 | |
| 193 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+------------------------------------------------------------------------------------------+------+--------------------------+ |
| 194 | | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | |
| 195 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+------------------------------------------------------------------------------------------+------+--------------------------+ |
| 196 | | 1 | SIMPLE | discussiontools_item_ids | range | PRIMARY,itid_itemid | itid_itemid | 257 | NULL | 5296 | Using where; Using index | |
| 197 | | 1 | SIMPLE | discussiontools_item_revisions | ref | itr_itemid_id_revision_id,itr_revision_id | itr_itemid_id_revision_id | 4 | enwiki.discussiontools_item_ids.itid_id | 1 | Using index condition | |
| 198 | | 1 | SIMPLE | discussiontools_items | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_revisions.itr_items_id | 1 | Using where; Using index | |
| 199 | | 1 | SIMPLE | discussiontools_item_pages | eq_ref | itp_items_id_page_id,itp_items_id_newest_revision_id | itp_items_id_newest_revision_id | 12 | enwiki.discussiontools_items.it_id,enwiki.discussiontools_item_revisions.itr_revision_id | 1 | | |
| 200 | | 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | enwiki.discussiontools_item_pages.itp_page_id | 1 | Using where | |
| 201 | +------+-------------+--------------------------------+--------+------------------------------------------------------+---------------------------------+---------+------------------------------------------------------------------------------------------+------+--------------------------+ |
| 202 |
