Page MenuHomePhabricator

Fatal error from Title->getTalkPage (Special:WhatLinksHere, Special:Contributions, …)
Open, HighPublic

Description

Error

Request URL: /w/index.php?title=Special%3AWhatLinksHere&target=%23&namespace=
Request ID: XSd-GApAAD4AAIar6t4AAACQ

message
MWException: Can't determine talk page associated with relative section link
trace
#0 /srv/mediawiki/php-1.34.0-wmf.13/includes/Title.php(1549): NamespaceInfo->getTalkPage(Title)
#1 /srv/mediawiki/php-1.34.0-wmf.13/includes/skins/Skin.php(282): Title->getTalkPage()
#2 /srv/mediawiki/php-1.34.0-wmf.13/includes/skins/Skin.php(164): Skin->preloadExistence()
#3 /srv/mediawiki/php-1.34.0-wmf.13/skins/Vector/includes/SkinVector.php(68): Skin->initPage(OutputPage)
#4 /srv/mediawiki/php-1.34.0-wmf.13/includes/skins/SkinTemplate.php(216): SkinVector->initPage(OutputPage)
#5 /srv/mediawiki/php-1.34.0-wmf.13/includes/OutputPage.php(2580): SkinTemplate->outputPage()
#6 /srv/mediawiki/php-1.34.0-wmf.13/includes/MediaWiki.php(891): OutputPage->output(boolean)
#7 /srv/mediawiki/php-1.34.0-wmf.13/includes/MediaWiki.php(903): Closure$MediaWiki::main()
#8 /srv/mediawiki/php-1.34.0-wmf.13/includes/MediaWiki.php(515): MediaWiki->main()
#9 /srv/mediawiki/php-1.34.0-wmf.13/index.php(42): MediaWiki->run()
#10 /srv/mediawiki/w/index.php(3): include(string)

Impact

  • Increase MW fatal levels, and potentially more alarms or aborted deployment as such.
  • Users no longer given informational warning or empty results where none found titles, but instead facing a generic system error page.

Notes

New regression in 1.34-wmf.13.

See also T227700.

Event Timeline

Krinkle created this task.Jul 11 2019, 8:01 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 11 2019, 8:01 PM

Change 525703 had a related patch set uploaded (by MaxSem; owner: MaxSem):
[mediawiki/core@master] Fix exception when viewing special pages with relative related titles

https://gerrit.wikimedia.org/r/525703

Change 525703 merged by jenkins-bot:
[mediawiki/core@master] Fix exception when viewing special pages with relative related titles

https://gerrit.wikimedia.org/r/525703

Pchelolo added a subscriber: Pchelolo.EditedAug 2 2019, 6:39 PM

Seems like the above patch has not fixed it. Both repeating the API request and trying to check links to se: results in a different exception now:

#0 /srv/mediawiki/php-1.34.0-wmf.16/includes/Title.php(1550): NamespaceInfo->getTalkPage(Title)
#1 /srv/mediawiki/php-1.34.0-wmf.16/includes/skins/SkinTemplate.php(910): Title->getTalkPage()
#2 /srv/mediawiki/php-1.34.0-wmf.16/includes/skins/SkinTemplate.php(448): SkinTemplate->buildContentNavigationUrls()
#3 /srv/mediawiki/php-1.34.0-wmf.16/includes/skins/SkinTemplate.php(217): SkinTemplate->prepareQuickTemplate()
#4 /srv/mediawiki/php-1.34.0-wmf.16/includes/OutputPage.php(2580): SkinTemplate->outputPage()
#5 /srv/mediawiki/php-1.34.0-wmf.16/includes/MediaWiki.php(899): OutputPage->output(boolean)
#6 /srv/mediawiki/php-1.34.0-wmf.16/includes/MediaWiki.php(911): Closure$MediaWiki::main()
#7 /srv/mediawiki/php-1.34.0-wmf.16/includes/MediaWiki.php(523): MediaWiki->main()
#8 /srv/mediawiki/php-1.34.0-wmf.16/index.php(42): MediaWiki->run()
#9 /srv/mediawiki/w/index.php(3): include(string)
#10 {main}

https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2019.08.02/mediawiki?id=AWxTnhqvCfBFUvHR8018&_g=h@8b5b71a

A bit of investigation has found a deeper issue.

The doc for Title::newFromText states that

	 * Title objects returned by this method are guaranteed to be valid, and
	 * thus return true from the isValid() method.

However, Title::newFromText('#')->isValid() == false.

This comment was removed by Pchelolo.

I believe that the inconsistency in the title validity checking and title creation is the root cause of this, but it's a larger issue. Filed T229705

The behaviour of Title::newFromText('#') is a pre-existing issue, but note that the higher-level behaviour of Special:WhatLinksHere and other features is not pre-existing (I think). It's most likely fall-out from the NamespaceInfo refactor (e30cb5ba, etc.), which made the Title class much more strict than what most of our code has been assuming for a decade.

As such, it has broken stuff all over the place. This task is regression number 10, I think?

We can keep plugging these as high-priority regressions (assuming that's possible and feasible), but just to ask out loud - Have we considered reverting this new restriction for the time being? I don't think the refactor was intended to change behaviour or add new restrictions, and the fall-out wasn't planned for either, I think. If this new restriction really is important to have in the short-term, perhaps make it a warning first, research what would violate it (in a non-breaking way), plan to have them all fixed (within CPT, or within the affected teams, notified ahead of time), and then at a later point re-enable the strict breakage.

Or, maybe this is the last breakage we'll find and it can just be fixed...

daniel added a comment.EditedAug 5 2019, 8:48 AM

I don't think the refactor was intended to change behaviour or add new restrictions, and the fall-out wasn't planned for either, I think.

This is correct. However, reverting to a stage where we were directly violating guarantees given by the contract of methods on the Title object isn't great either. The only alternative to throwing an exception seems to be returning null, which will likely cause exceptions later. What else can we do? What's the talk page of #foo? Talk:#foo? That will trigger exceptions as well, since it's not a valid title...

I'm convinced that we can't get around fixing the semantics of Title (or replace Title), and we can't do that without exposing issues by logging them. We should probably try to do that without triggering fatal exceptions, at least for now. I'm a big fan of "failing fast and being safe", but it shouldn't cause this level of disruption. So perhaps we could, for now, log instead of throw.

But if we log, and don't throw, then we still have to return something. So what shall we return as the talk page of #foo?

Krinkle added a comment.EditedAug 6 2019, 11:32 AM

But if we log, and don't throw, then we still have to return something. So what shall we return as the talk page of #foo?

I don't know the right answer off-hand, other than: whatever it used to return before the refactor?

Or, if that's complex and unreasonable, perhaps something like Special:BadTitle/Invalid_talk_page (and other descriptive case-by-case subpage names)

I don't know the right answer off-hand, other than: whatever it used to return before the refactor?

It returned an invalid Title. But we don't want to re-introduce the cyclic dependency between NamespaceInfo and Title, so this isn't really an option.

Or, if that's complex and unreasonable, perhaps something like Special:BadTitle/Invalid_talk_page (and other descriptive case-by-case subpage names)

Yes, that could work.

Unfortunately, this is only a solution for this specific case. I don't see an obvious way to become more backwards compatible overall, automatically. For that, TitleValue would need to stop throwing when constructed with invalid arguments.

Maybe we could introduce TitleValue::lenientNew() as a factory method, which doesn't throw but logs and returns a dummy instead, based on the Special:BadTitle/some message pattern. Does that sound reasonable?

mmodell changed the subtype of this task from "Task" to "Production Error".Wed, Aug 28, 11:06 PM

I don't know the right answer off-hand, other than: whatever it used to return before the refactor?

It returned an invalid Title. But we don't want to re-introduce the cyclic dependency between NamespaceInfo and Title, so this isn't really an option.

I'm not seeing how an invalid Title object relates to NamespaceInfo. Can you clarify?

Regarding the cyclic dependency – I assume you're referring to a potential future where NamespaceInfo::getTalkPage would defer to Title::getTalkPage. Note that that was never the case afaik. The NamespaceInfo class has only recently been introduced. Changing this part of the refactor back how it was before would not re-introduce a dependency - the whole class didn't even exist.

This particular sub-feature could in theory be moved back to where it was in Title. The newer NamespaceInfo method could be removed again until we are done with detecting, deprecating, warning, migrating, and removing the problematic behaviour of the old Title method. After that, the Title method can be re-converted to a simple NamespaceInfo call. We could even keep the newer NamespaceInfo method as-is while we do that (and not use it in Title for a while).

This seems like the natural way we'd normally go about breaking changes. But, I don't have a particular preference for it. What other options do we have that achieve the goal of 1) restore compat with existing code, 2) soft-detect violations of the stricter rules we want to introduce 3) migrate then remove support for the old way.

Krinkle renamed this task from Fatal error on Special:WhatLinksHere when typing a hash symbol to Fatal error from Title->getTalkPage (Special:WhatLinksHere, Special:Contributions, …).EditedThu, Aug 29, 12:13 PM
  • Exception ID: XWe@1gpAAD4AAEyD9EkAAAAL
  • URL: /w/index.php?associated=1&contribs=user&end=1&hideMinor=1&limit=50&namespace=1&nsInvert=1&start=1&tagfilter=17&target=-1%20OR%203*2>(0%2b5%2b814-814)%20--%20&title=Special:Contributions&topOnly=1
trace
Bad value for parameter $title: invalid name '-1_OR_3*2>(0+5+814-814)_--_'

#0 /srv/mediawiki/php-1.34.0-wmf.19/includes/title/TitleValue.php(104): Wikimedia\Assert\Assert::parameter(boolean, string, string)
#1 /srv/mediawiki/php-1.34.0-wmf.19/includes/title/NamespaceInfo.php(206): TitleValue->__construct(integer, string)
#2 /srv/mediawiki/php-1.34.0-wmf.19/includes/Title.php(1551): NamespaceInfo->getTalkPage(Title)
#3 /srv/mediawiki/php-1.34.0-wmf.19/includes/user/User.php(4342): Title->getTalkPage()
#4 /srv/mediawiki/php-1.34.0-wmf.19/includes/specials/SpecialContributions.php(310): User->getTalkPage()
#5 /srv/mediawiki/php-1.34.0-wmf.19/includes/specials/SpecialContributions.php(98): SpecialContributions->contributionsSub(User)
#6 /srv/mediawiki/php-1.34.0-wmf.19/includes/specialpage/SpecialPage.php(571): SpecialContributions->execute(NULL)

I'm not seeing how an invalid Title object relates to NamespaceInfo. Can you clarify?

NamespaceInfo can't return the equivalent invalid TitleValue, because TitleValue refuses to be invalid. So NamespaceInfo would have to return a Title object with invalid content. But that means NamespaceInfo knows about Title, which would be a cyclic dependency, because Title already depends on NamespaceInfo.

This particular sub-feature could in theory be moved back to where it was in Title.

We could put the old code back into the deprecated method in Title, and log a warning when returning an invalid Title. That sounds like a decent solution.

The newer NamespaceInfo method could be removed again until we are done with detecting, deprecating, warning, migrating, and removing the problematic behaviour of the old Title method.

If we removed the methods from NamespaceInfo, what would we migrate to? It should remain in place, so we can update code to use that instead of Title.

This seems like the natural way we'd normally go about breaking changes. But, I don't have a particular preference for it. What other options do we have that achieve the goal of 1) restore compat with existing code, 2) soft-detect violations of the stricter rules we want to introduce 3) migrate then remove support for the old way.

I agree in principle, except though that existing code didn't actually *expect* to get an invalid Title. As far as I can tell, existing code just assumes the Title will be valid, resulting in odd behavior (and sometimes errors). But yes, we can go back from fatal errors to undefined behavior and broken output like broken links in the skin, if that is preferred. (This may sound snarky, but it's not intended to be - I honestly don't know which we should prefer).

Change 533208 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Restore the old behavior of Title::getTalkPage()

https://gerrit.wikimedia.org/r/533208

we can go back from fatal errors to undefined behavior and broken output [..] (This may sound snarky, but it's not intended to be [..]

No worries, I understand and recognise it's not pretty. I think we should follow some kind of graceful because that means the warnings can be discovered and worked on at a comfortable pace for CPT and/or teams working on tasks spawned from it - instead of UBNs due to producing HTTP 5xx responses that can block the train, can make Scap abort random deployments based on user traffic, and may alert SRE due to Icgina alerting raised MediaWiki fatal rates based on user traffic.

Uh, I can't assign this task to anyone. And I can't set the priority.
Is that because it's an "error"?

Jdforrester-WMF changed the subtype of this task from "Production Error" to "Task".Thu, Aug 29, 4:40 PM
Jdforrester-WMF added a subscriber: Jdforrester-WMF.

Yup. Top-tip: Task types are useless.

daniel claimed this task.Thu, Aug 29, 5:20 PM
Jdforrester-WMF raised the priority of this task from Normal to High.Thu, Aug 29, 11:03 PM

Change 533208 merged by jenkins-bot:
[mediawiki/core@master] Title::getTalkPage(): Restore behavior of interwiki-prefixed & fragment-only titles

https://gerrit.wikimedia.org/r/533208