
Caching and batching of "known" link status and other link info
Open, Needs Triage, Public

Description

T101348 notes that GlobalUserPage is slow. But the problem is broader than GlobalUserPage. The extensions that hook TitleIsAlwaysKnown are:

  • GlobalUserPage
    • Slow
    • A shadow namespace using ArticleFromTitle and WikiPageFactory
  • SemanticMediaWiki
    • Probably OK but complains of hackishness
  • WikimediaIncubator
    • Probably OK, but maybe it would check for page existence if that wasn't so slow.
    • Uses ShowMissingArticle
  • SharedHelpPages
    • Slow
    • Uses ArticleFromTitle, WikiPageFactory, and a few other hooks to override everything in a namespace
  • SocialProfile
    • Slow code is commented out.
  • Translate
    • Slow.
    • Makes some special page subpage links into red links.
  • WikiMirror
    • Slow
    • Uses WikiPageFactory to override arbitrary pages.

In 2015, Tim suggested that we add a hook to LinkHolderArray. But there are also a lot of callers of LinkBatch followed by Title::isKnown(), and there are standalone callers of Title::isKnown(). We don't want to have three separate hooks. Extensions want a single simple interface which lets them provide knownness for a batch of titles. If Title::isKnown() is called by itself, it should just call the extension with a single title in the array.
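
To make the "single simple interface" idea concrete, here is a minimal language-neutral sketch in Python. It is not MediaWiki code: `KnownnessService`, `register`, `are_known` and `is_known` are hypothetical names, and titles are modelled as plain strings. The only point it illustrates is that the standalone check and the batched check share one code path, so an extension implements a single batch callback.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical type: a provider receives a batch of title strings and
# returns the subset it declares "known". Not an existing MediaWiki interface.
KnownnessProvider = Callable[[List[str]], Iterable[str]]


class KnownnessService:
    """Single entry point for both batched and standalone lookups."""

    def __init__(self) -> None:
        self._providers: List[KnownnessProvider] = []

    def register(self, provider: KnownnessProvider) -> None:
        self._providers.append(provider)

    def are_known(self, titles: List[str]) -> Dict[str, bool]:
        """Ask every provider once for the whole batch."""
        result = {title: False for title in titles}
        for provider in self._providers:
            for title in provider(titles):
                if title in result:
                    result[title] = True
        return result

    def is_known(self, title: str) -> bool:
        """Standalone lookup: the same code path with a batch of one."""
        return self.are_known([title])[title]
```

With this shape, a LinkHolderArray-style caller and a LinkBatch-style caller would both go through `are_known()`, and a lone `Title::isKnown()`-style call would go through `is_known()`, which hands the extension an array containing a single title.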

Several extensions would benefit from being able to register a namespace or subpage root, so that they would only be called for titles that match their prefix. The prefix is often only known at runtime due to being localised or configured.
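
A rough sketch of prefix-based routing, again in Python and under the same assumptions (plain string titles, hypothetical names such as `PrefixRouter`, `register` and `route`). Because the prefix may be localised or configured, the sketch registers a callable that returns the prefixes and resolves it when a batch arrives, not when the handler is registered.

```python
from typing import Callable, List, Tuple

# Hypothetical: a callable that yields the handler's prefixes when asked,
# so localised or configured values are read at runtime.
PrefixSource = Callable[[], List[str]]


class PrefixRouter:
    def __init__(self) -> None:
        self._entries: List[Tuple[str, PrefixSource]] = []

    def register(self, handler_name: str, prefix_source: PrefixSource) -> None:
        self._entries.append((handler_name, prefix_source))

    def route(self, titles: List[str]) -> List[Tuple[str, List[str]]]:
        """Return (handler name, matching titles) pairs; skip handlers with no match."""
        routed = []
        for name, prefix_source in self._entries:
            prefixes = tuple(prefix_source())  # resolved per batch, not at registration
            matching = [t for t in titles if t.startswith(prefixes)]
            if matching:
                routed.append((name, matching))
        return routed


# Usage: the namespace name comes from (hypothetical) runtime configuration.
config = {"user_namespace": "Benutzer"}
router = PrefixRouter()
router.register("user-pages", lambda: [config["user_namespace"] + ":"])
router.register("help-pages", lambda: ["Help:"])
routed = router.route(["Benutzer:Alice", "Main Page", "Benutzer:Bob"])
```

Here only the "user-pages" handler would be called, and only with the two titles under its prefix; the "help-pages" handler is never invoked for this batch.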

Extensions that use TitleIsAlwaysKnown are usually overriding the content of the page referred to by the title. Maybe there is an argument for putting both content and page existence into the same interface. But the project is more feasible if that is out of scope for now.

HtmlPageLinkRendererBegin and HtmlPageLinkRendererEnd are conceptually closer to link knownness than content overrides. The extensions that use them are:

  • Wikibase
    • Actually does an existence check and various other slow things
    • Custom default link text
  • BlueSpiceFoundation
    • ?
  • DisplayTitle
    • Custom default link text
  • AnonPrivacy
    • Magics away IP addresses from user links and user tool links. Changes href, text, etc.
  • FlexDiagrams
    • Customizes red link href
  • HelpPages
    • Slow: Seems identical to SharedHelpPages except using a different hook.
  • Link_Attributes
    • Fast modification of attribs
  • NamespacePopups
    • Slow: checks existence of a page that is not in the batch. Overwrites the whole HTML.
  • PageForms
    • Another red link href hack
  • TinyMCE
    • Another red link href hack
  • UnifiedExtensionForFemiwiki
    • Basically disables all red links
  • MirahezeMagic
    • Extends interwiki link concept; fast

Sketch proposal

  • Extensions register link handler objects in extension.json with ObjectFactory style syntax
  • There could also be core link handlers. For example, LinkBatch currently calls GenderCache, but gender caching could instead be implemented as a link handler.
  • Link handler objects provide prefix matching info via an accessor.
  • A service stores handler objects and prefixes. Maybe the service is LinkRenderer, or maybe it is a new service.
  • Preload operation: the service preloads link information for an array of titles.
  • Render operation: if the link handler opted in during preload, the service calls the link handler again during link rendering, passing the previously preloaded info. Preloading is guaranteed -- the extension will always get its preload info on render.
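
The preload and render operations above could be sketched as follows. This is an illustration in Python under stated assumptions, not a proposed implementation: `LinkInfoService`, `ExampleUserPageHandler`, `get_prefixes`, `preload` and `render` are hypothetical names, titles are plain strings, and link output is reduced to a dictionary of attributes. A handler opts in for a title by returning an entry for it from `preload()`. The service guarantees preloading by running a batch of one if `render()` is reached for a title that was never preloaded.

```python
from typing import Any, Dict, List, Set, Tuple


class ExampleUserPageHandler:
    """Hypothetical handler: treats some "User:" titles as known."""

    def __init__(self, remote_pages: Set[str]) -> None:
        self._remote_pages = remote_pages
        self.preload_calls = 0  # counts batched lookups, for demonstration

    def get_prefixes(self) -> List[str]:
        return ["User:"]

    def preload(self, titles: List[str]) -> Dict[str, Any]:
        # One batched lookup; returning an entry for a title opts in to render.
        self.preload_calls += 1
        return {t: {"known": True} for t in titles if t in self._remote_pages}

    def render(self, title: str, info: Any, attribs: Dict[str, str]) -> Dict[str, str]:
        return {**attribs, "class": "known"} if info["known"] else attribs


class LinkInfoService:
    def __init__(self, handlers: List[Any]) -> None:
        self._handlers = handlers
        self._seen: Set[Tuple[int, str]] = set()      # (handler, title) already preloaded
        self._info: Dict[Tuple[int, str], Any] = {}   # preload results for opted-in titles

    def preload(self, titles: List[str]) -> None:
        for index, handler in enumerate(self._handlers):
            prefixes = tuple(handler.get_prefixes())
            pending = [
                t for t in dict.fromkeys(titles)  # de-duplicate, keep order
                if t.startswith(prefixes) and (index, t) not in self._seen
            ]
            if not pending:
                continue
            self._seen.update((index, t) for t in pending)
            for title, info in handler.preload(pending).items():
                self._info[(index, title)] = info

    def render(self, title: str, attribs: Dict[str, str]) -> Dict[str, str]:
        self.preload([title])  # guarantee: falls back to a batch of one
        for index, handler in enumerate(self._handlers):
            key = (index, title)
            if key in self._info:
                attribs = handler.render(title, self._info[key], attribs)
        return attribs
```

In this sketch a handler that did not opt in for a title is not called again at render time, and a title that was already preloaded is never looked up twice.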

Event Timeline

Legoktm raised the priority of this task from to Medium.
Legoktm updated the task description.
Legoktm added a subscriber: Legoktm.
Tgr raised the priority of this task from Medium to Needs Triage. Feb 19 2015, 6:50 AM

I'm going to flesh this out a bit.

tstarling renamed this task from Implement batch lookups of globaluserpage existence when changing link colors to Caching and batching of "known" link status and other link info. Jun 30 2021, 6:42 AM
tstarling updated the task description.
Restricted Application added subscribers: RhinosF1, Reception123.

https://gerrit.wikimedia.org/r/c/mediawiki/core/+/310306 is what I had worked on years ago, but it never left WIP for reasons I don't remember.