Page MenuHomePhabricator

SpecialWhatLinksHere::showIndirectLinks is broken
Closed, ResolvedPublic

Description

Setting as NDA only because while it is active, it could be used for DOS.

Unauthenticated users can ask for 1-hour long queries, efectively denying the service of a database server by just hitting fast enough.

Hits 	Tmax 	Tavg 	Tsum 	Hosts 	Users 	Schemas 
18	1,945	1,332	23,984	db1083	wikiuser	enwiki
SELECT /* SpecialWhatLinksHere::showIndirectLinks <ip> */ page_id, page_namespace, page_title, rd_from, page_is_redirect FROM (SELECT /*! STRAIGHT_JOIN */ pl_from, rd_from FROM `pagelinks` LEFT JOIN `redirect` ON ((rd_from = pl_from) AND rd_title = 'Contents' AND (rd_interwiki = '' OR rd_interwiki IS NULL) AND rd_namespace = '12') INNER JOIN `page` ON ((pl_from = page_id)) WHERE pl_namespace = '12' AND pl_title = 'Contents' AND (rd_from is NOT NULL) ORDER BY pl_from LIMIT 102 ) `temp_backlink_range` INNER JOIN `page` ON ((pl_from = page_id)) ORDER BY page_id LIMIT 51 /* c6af36508337896c95736155b3658e6c db1083 enwiki 1945s */
6	53	15	91	db1083	wikiuser	enwiki
SELECT /* SpecialWhatLinksHere::showIndirectLinks <ip> */ page_id, page_namespace, page_title, rd_from, page_is_redirect FROM (SELECT /*! STRAIGHT_JOIN */ pl_from, rd_from FROM `pagelinks` LEFT JOIN `redirect` ON ((rd_from = pl_from) AND rd_title = 'Contents' AND (rd_interwiki = '' OR rd_interwiki IS NULL) AND rd_namespace = '12') INNER JOIN `page` ON ((pl_from = page_id)) WHERE pl_namespace = '12' AND pl_title = 'Contents' ORDER BY pl_from LIMIT 102 ) `temp_backlink_range` INNER JOIN `page` ON ((pl_from = page_id)) ORDER BY page_id LIMIT 51 /* 413819f8da117fd955cc1a72fece6ccf db1083 enwiki 4s */

Those queries look terrible, BTW.

Event Timeline

jcrespo created this object with visibility "WMF-NDA (Project)".
jcrespo created this object with edit policy "WMF-NDA (Project)".

More than possibly related: T106682

I've setup a watchdog on db1083 to kill long running queries, but this should be fixed on code.

Marostegui added subscribers: Anomie, Marostegui.

We should definitely try to fix these queries - is this something Platform Engineering can help with or should we tag some other team(s)?

root@db1081.eqiad.wmnet[commonswiki]> explain SELECT /* SpecialWhatLinksHere::showIndirectLinks */ page_id, page_namespace, page_title, rd_from, page_is_redirect FROM (SELECT /*! STRAIGHT_JOIN */ pl_from, rd_from FROM `pagelinks` LEFT JOIN `redirect` ON ((rd_from = pl_from) AND rd_title = 'Community_portal' AND (rd_interwiki = '' OR rd_interwiki IS NULL) AND rd_namespace = 4) JOIN `page` ON ((pl_from = page_id)) WHERE pl_namespace = 4 AND pl_title = 'Community_portal' AND (rd_from is NOT NULL) ORDER BY pl_from LIMIT 102 ) `temp_backlink_range` JOIN `page` ON ((pl_from = page_id)) ORDER BY page_id LIMIT 51;
+------+-------------+------------+--------+----------------------+--------------+---------+-------------------------------+----------+---------------------------------+
| id   | select_type | table      | type   | possible_keys        | key          | key_len | ref                           | rows     | Extra                           |
+------+-------------+------------+--------+----------------------+--------------+---------+-------------------------------+----------+---------------------------------+
|    1 | PRIMARY     | <derived2> | ALL    | NULL                 | NULL         | NULL    | NULL                          |      102 | Using temporary; Using filesort |
|    1 | PRIMARY     | page       | eq_ref | PRIMARY              | PRIMARY      | 4       | temp_backlink_range.pl_from   |        1 |                                 |
|    2 | DERIVED     | pagelinks  | ref    | PRIMARY,pl_namespace | pl_namespace | 261     | const,const                   | 16852036 | Using where; Using index        |
|    2 | DERIVED     | redirect   | eq_ref | PRIMARY,rd_ns_title  | PRIMARY      | 4       | commonswiki.pagelinks.pl_from |        1 | Using where                     |
|    2 | DERIVED     | page       | eq_ref | PRIMARY              | PRIMARY      | 4       | commonswiki.pagelinks.pl_from |        1 | Using index                     |
+------+-------------+------------+--------+----------------------+--------------+---------+-------------------------------+----------+---------------------------------+
5 rows in set (0.00 sec)

We should definitely try to fix these queries - is this something Platform Engineering can help with or should we tag some other team(s)?

We're good as a starting point.

This specific query looks like it's running into the over-filtering problem: it's trying to select the tiny subset of links to a heavily-linked page that are redirects. Removing the STRAIGHT_JOIN would allow it to use a much better plan, but we'll need to avoid recreating T106682.

I'll poke at this for a bit longer and see what I can come up with.

Oops, this was private (but not security). The patch was https://gerrit.wikimedia.org/r/c/mediawiki/core/+/571739

This seems done now except for possible backporting.

We can open it now and resolve it, is there any private data left on the ticket (I will check now too)?

Not that I can see, ip were redacted and the query doesn't have anything private, correct me if wrong. If agree, you can resolve it too.

Ups, I missed:

except for possible backporting

So everything applies, except I would wait until things are fully on production.

CCicalese_WMF added a subscriber: CCicalese_WMF.

Marking as Resolved as it is in the Done column. Feel free to reopen if there is remaining work.

@CCicalese_WMF normally the NDA (security) tickets are open up (made public) upon resolution, unless there is any kind of private information that would prevent that.

CCicalese_WMF changed the visibility from "WMF-NDA (Project)" to "Public (No Login Required)".Apr 29 2020, 1:50 PM
CCicalese_WMF changed the edit policy from "WMF-NDA (Project)" to "All Users".

Thank you, @jcrespo. Making public per T144235#6092134.