
Deletion log excerpt (mw-warning-with-logexcerpt) not shown when only curid given and page has been deleted
Open, Medium, Public

Description

Following this <https://de.wikipedia.org/w/index.php?curid=1774043 Permalink> (original link text missing) shows the "Ungültiger Titel" (Bad title) page instead of a page containing the "mw-warning-with-logexcerpt mw-content-ltr" log info section, which is missing.

This behavior is not a good idea for permalinks.

hint: curid 1774043 was :de:Diskussion:Holocaust/Archiv2 https://de.wikipedia.org/wiki/Diskussion:Holocaust/Archiv2


Version: 1.24rc
Severity: minor
URL: https://de.wikipedia.org/w/index.php?curid=1774043

Details

Reference
bz71578

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 3:58 AM
bzimport set Reference to bz71578.
bzimport added a subscriber: Unknown Object (MLST).

Thanks for taking the time to report this!

Is that a general problem "on deleted pages"? Are there more examples?

I tried to rephrase the summary; let me know if I'm misstating something.

The nice behavior would be to look up the title in MediaWiki::parseTitle() and from there MediaWiki would handle it the same way as if a title was specified. Or maybe even to make it redirect to the title.
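The fallback order suggested here could be sketched roughly as follows. This is an illustration in Python, not MediaWiki's PHP; `resolve_title`, `live_pages`, and `deleted_titles` are hypothetical stand-ins for the real lookups, not MediaWiki APIs.

```python
from typing import Dict, Optional


def resolve_title(
    title: Optional[str],
    curid: Optional[int],
    live_pages: Dict[int, str],
    deleted_titles: Dict[int, str],
) -> Optional[str]:
    """Return the title to display, or None for a "Bad title" error.

    All names are hypothetical; the dicts stand in for database lookups.
    """
    if title:
        # An explicit title always wins, as in normal title handling.
        return title
    if curid is None:
        return None
    if curid in live_pages:
        # The page still exists: curid maps directly to its title.
        return live_pages[curid]
    # The page was deleted: fall back to the title recorded at deletion,
    # so the usual deleted-page view with the log excerpt can be shown.
    return deleted_titles.get(curid)
```

With a resolved title in hand, the caller could either render the page as if the title had been given or issue a redirect to it; the sketch does not choose between the two.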

Unfortunately, looking up the title from a deleted page ID is messy. page table records are physically deleted; the page ID is stored in the revision table, but it is not indexed and is overwritten with the new ID on undeletion (cf. T28123); the only reliable method seems to be to find the last delete log event with the given page ID. Not sure how efficient that is, though; it would be something like SELECT log_namespace, log_title FROM logging WHERE log_page = $page_id AND log_type = 'delete' AND log_action = 'delete' ORDER BY log_timestamp DESC LIMIT 1 (or maybe order by log_page), and the relevant indexes are (log_page, log_timestamp) and (log_type, log_action, log_timestamp), so no single index covers the query, and a single page can have lots of log records. (Tens of thousands in extreme cases, I'd guess? The number of patrol log records would be proportional to the number of revisions.)
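To make the lookup concrete, here is a minimal sketch that runs the query against an in-memory SQLite table. It is only a toy model: the table keeps just the columns the query touches, the rows and timestamps are invented, `last_deleted_title` and `make_demo_db` are hypothetical helpers, and SQLite will not reproduce the production database's query plan. The sketch selects log_namespace and log_title, since those are the logging columns that hold the title.

```python
import sqlite3


def last_deleted_title(conn, page_id):
    """Return (namespace, title) from the latest delete log entry, or None."""
    return conn.execute(
        """
        SELECT log_namespace, log_title
        FROM logging
        WHERE log_page = ?
          AND log_type = 'delete'
          AND log_action = 'delete'
        ORDER BY log_timestamp DESC
        LIMIT 1
        """,
        (page_id,),
    ).fetchone()


def make_demo_db():
    """Build a toy logging table with the two indexes mentioned above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE logging (
               log_id INTEGER PRIMARY KEY,
               log_type TEXT, log_action TEXT, log_timestamp TEXT,
               log_namespace INTEGER, log_title TEXT, log_page INTEGER)"""
    )
    # Index names are illustrative; the column lists follow the comment above.
    conn.execute("CREATE INDEX idx_page_time ON logging (log_page, log_timestamp)")
    conn.execute(
        "CREATE INDEX idx_type_action ON logging (log_type, log_action, log_timestamp)"
    )
    rows = [  # invented sample data
        ("patrol", "patrol", "20080101000000", 1, "Holocaust/Archiv2", 1774043),
        ("delete", "delete", "20090101000000", 1, "Holocaust/Archiv2", 1774043),
        ("delete", "delete", "20100101000000", 0, "Other_page", 7),
    ]
    conn.executemany(
        "INSERT INTO logging (log_type, log_action, log_timestamp,"
        " log_namespace, log_title, log_page) VALUES (?, ?, ?, ?, ?, ?)",
        rows,
    )
    return conn
```

Calling `last_deleted_title(make_demo_db(), 1774043)` returns the namespace number and title of the delete entry and skips the patrol row for the same page; a page ID with no delete entry yields `None`.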

@jcrespo what do you think of the previous comment? Is it feasible to use that query, or to add a new index that covers it (although I don't think the impact of this bug would justify adding a new index to one of our largest tables)? This would only get invoked for URLs with ?curid=XXX and no page name, so the query would not be invoked often.

Please note patch https://gerrit.wikimedia.org/r/#/c/239319/

Please keep performance / security in the loop for changes related to this; they will have more information about ongoing concerns. Once they provide feedback, I will be happy to help with schema changes if needed, although by the look of it, T64615 may be a soft blocker (though probably easier to fix for a specific query).

Thanks for pointing that out. 404 pages can easily be reached by bots, while curid URLs cannot, so I don't think DoS by well-meaning but poorly behaving bots is a concern here... I'll make sure to add you and the perf team for code review though.