Page MenuHomePhabricator

Achieve predictable MediaWiki routing and cacheable skin data
Open, MediumPublic

Description

This is a rough sketch to seek input from other developers.

Objective

Enable MediaWiki to respond to page views in a fairly performant manner for all users, both those with sessions (e.g. registered and logged-in) and those without sessions (unregistered/anonymous).

This is motivated by the folllowing high-level goals:

  • Equitable performance. Decrease the gap between page load time for registered and unregistered users. Unregistered users generally receive pages from our CDN. Registered users must interact with the application servers to render their pages, however this need not be an expensive and slow operation. As of July 2022, MediaWiki as deployed at WMF, takes ~200ms to render a desktop page view, and ~270ms to render a mobile page view (regardless of whether a CDN cache miss for unregistered, or registered).
  • Improve resiliancy against on-going attacks. Rendering articles is the most common operation we perform for our CDN, and this fact is core to the Foundation mission. (The frequency of article requests is not a technical side-effect of secondary features or of how we implement things, such as may be the case with jobrunner or API calls; rather pageviews are themselves core to our mission). This operation should be among the cheapest and most optimimised, and yet has become one of the most expensive. Speeding up backend page view rendering means we won't be as susceptible to certain kinds of DDOS attacks. The cost of such request would be much lower, especially repeated CDN-cache misses for what are internally the same page+skin combination (regardless of exact URL).
  • Simplify developer experience. Today we have a lot of flexibility in the skin, and looking at examples of existing code can give the wrong idea. The caches are hidden behind what appear to be (and often still are) expensive service calls. This means "follow the example" leads by default to more inline service calls that are late-discovered by the runtime, uncached, and unbatched, and subject to global state. Doing the "right" thing requires a steeper learning curve, and also tends to bottleneck with input from the Performance Team. By moving these to an upfront declared portion of the Skin class, the default place is naturally cached, and naturally permitted to utilize global state, and thus naturally encourages the "right" thing. Bypassing that layer would stand out in code review. It also means we automatically cache the longer tail of smaller computations as well, for which today we would likely not bother creating an ad-hoc cache for manually.
  • Future exploration. There are numerous client-side and infastructure performance improvements dependent on a predictable URL router, as well as a wide range of possible UX explorations. This includes:
    • More effective CDN caching by normalizing URLs based on declared routes (T310087).
    • Sending preload headers from the edge based on declared routes (e.g flush appshell or send preload skin headers on page views, currently not possible since we have pageview-like route bypasses like Special:Export/:title).
    • PWA or offline web app for mobile, without the need for a fully developed native app. To explore those in a sustainable manner, I consider it a prerequisite that the skin (or more precisely, the skins we deploy at WMF) not depend on directly contacting internal databases or backend services during the rendering step, but rather are modelled such that this data is injected as part of a single upfront payload to the skin service class. Whether, why, and how to potentiallly expose that payload to an external re-implementation of the skin is for later consideration. Doing so today would inevitably lead to broken contracts, out-of-sync business logic, and a stuck MVP that can never serve beyond a mostly passive/read-only experience.

Outcomes

Today we generally compute skin data "at the last minute" and thus rely solely on Varnish to essentially "memcache" this skin data even though it doesn't vary between users. This has two negative side-effects: 1) We're only achieving our perfomance goals today so long as we keep a very high Varnish CDN cache-hit ratio. and 2) All registered/logged-in traffic, as well as any cache miss from unregistered users, does not meet our goal.

By letting MW cache the skin data explicitly, we can let Varnish be focussed on and optimised for serving the scale that is Wikipedia production traffic; instead of also having to implicitly be responsible for meeting individual page view performance goals. (And systemically excluding registered users from that goal.)

Technical outcomes:

  • Routing logic will be aware of which urls are page views, and which are not. (Currently a small number of legacy special pages bypass the route by retroactively disabling OutputPage mid-way through the response.)
  • The Skin is given (cacheable) data about the page and user. It will produce its HTML output without performing additional backend queries. Therefore, any DB calls skins currently make for page- or user-related data must be moved to pre-cache accessors before the "render" and "getTemplateData" step.

Small but impactful changes

  1. Add a router that will control which output handler (Action or SpecialPage class) is invoked, e.g. no surprise OutputPage::disable() half-way through the request, such as in SpecialExport and RawAction currently do. This would be similar to what MediaWiki.php does already, but based on a declarative registry instead of being procedural, such that the result of the registrations can be expressed in data and e.g. in the future also be exported as a JSON payload containing URL patterns.
  2. Skin templates for the default mobile/desktop skin at WMF must be reduced to only string concatenation and simple boolean/iteration logic based on received data. E.g. as simple as a Mustache template, and preferably as actual Mustache such that this restriction is naturally promoted ("easy to do the right thing").
  3. Data flows only in one direction to the Skin. No more two-way communication from Skin back to OutputPage and other service classes. OutputPage injects all relevant data (maybe some of it lazy-computed) into the Skin class. Skin classes may combine it with its receied User info and transform properties (rename, reduce, derive, etc.), and then invokes the Skin template with it.
  4. Skin hooks that WMF production extensions rely are audited. Any hooks that are currently essentially "unrestricted callables with global context returning uncacheable raw html" should be replaced with new hooks that are "callables that return deterministic and cacheable HTML or template data". They can still vary by user and page (e.g. "What links here", "My contributions"), but any user-variable decisions about page-specific data must be made in the skin template by combining the two datasets rather than hooks doing this already in a way that makes the returned data uncachable as either "by user" only "or by skin/page" only. If extension hooks require additional information, they can do so, but must use hook behind the Skin or ParserOutput cache, not after it. E.g. they could compute the information via a hook during page parsing, or when generating the userinfo blob or skin metadata blob.

Implementation

It is likely that the outcome of this will be that third-party site admins can effectively achieve performance close to what $wgUseFileCache offers today, but for both logged-ou and logged-in users alike. The minimal "post-cache" run-time that we invoke after reading FileCache, would now apply the Skin logic at runtime instead of before cache time. (Today, by running it before the cache, the result is unusable for logged-in users). Potentially it might also obsolete the need for FileCache, or possibly keeping it around only to help increase server capacity for logged-out users (as opposed to for page loading speed).

The role of FileCache would then essentially be obsoleted by a slightly larger ParserOutput cache (ParserCache), additional memcached keys, or the MainStash. Alternatively, we may change FileCache to be more like a companion to ParserCache that caches only the extra data needed by skins, or store the supplemental data in MainStash or memcached..

Steps:

  • Deprecate OutputPage::disable().
  • Decide on where and then implement a user-data-for-skin accessor. E.g. this could take the form of a mostly-empty service class with a getter method that runs a "filter"-style hook for extensions (as per T212482) in the parent class. Moving this to a central place would make it easier to identify common needs and offer batching or caching as-needed. The move itself would not change the latency cost. If we find the user-data-for-skin payload is e.g. more than a few milliseconds on most page views, we may want to consider precomputing and cache it, e.g. in the session data, or as separate sessionstore key, or in Memcached. If we cache it, then expiration contracts and/or cache propagation (WANCache touch keys?) will become important. User data and user group membership can change, as well as the underlying user rights associated with a given group can be reconfigured.
  • Decide on where and then implement a cached page-data-for-skins. This would be per-page-revision and user-agnostic. E.g. each skin hooks while parsing and naturally stored in ParserOutput, or as its own blob per-skin in MainStash or Memcached. This too would include a filter-style hook for extensions to fetch or compute additional data that they might then later instruct the skin to display.
  • Move existing logic in core Skin, WMF-deployed skins/extensions to be computed during one of these two hooks, thus having data flow in one direction from OutputPage to Skin.
  • Convert one Skin (e.g. Vector) to become a template, as example. (Note, there is no pressure to remove support for PHP-based skins, the only requirement is that it can handle responses in front of the page cache instead of behind it, e.g. using data properties instead of run-time database queries and WikiPage methods calls).
  • Implement a router that supports both paths and query parameters. See also MediaWiki::parseTitle() and other methods in MediaWiki.php. The decision to not use OutputPage for a response must be made there, before any Action or SpecialPage is chosen. E.g. it should be known without executing any output handlers whether /wiki/Foo?action=edit&search=bar is a page view, an edit action, or a search query.
Sketch
/**
 * Today.
 */
class Skin {
  // Usually via:
   // - SkinTemplate::outputPage, SkinTemplate::initPage, SkinTemplate::generateHTML,
   // - SkinTemplate::getTemplateData, SkinTemplate::prepareQuickTemplate.
  function outputPage() {
    $this->buildContentNav();
    $this->getTemplateData();
  }
  function getTemplateData() {
    return [
      .. => Title->getCategories(),
    ];
  }
  function buildContentNav() {
    DatabaseBlock::newLoad();
    TalkPageNotificationManager->dbCheckNewUserMessages();
    UserEditTracker::getUserEditCount();
    UserGroupManager->getUserGroupMemberships();
    Authority->probablyCan( 'read' );
    Authority->probablyCan( 'edit' );
    $hooks->onSkinTemplateNavigationUniversal();
  }
}

class Echo\Hooks {
  function onSkinTemplateNavigationUniversal() {
    MWEchoNotifUser->getMessageCount();
  }
}

/**
 * Possible future
 */
class Skin {
  function outputPage($pageInfo, $userInfo) {
    $this->setPageInfo($pageInfo);
    $this->setUserInfo($userInfo); // or pass down as params?
    $this->buildContentNav();
    $this->getTemplateData();
  }

  public function getTemplateData() {
    return [
      .. => $this->pageInfo['categories'],
    ];
  }

  public function buildContentNav() {
    $this->userInfo['block'] // array|false
    $this->userInfo['newtalk']; // array|false
    $this->userInfo['editcount']; // int
    $this->userInfo['rights'] // string[]
    $hooks->onSkinTemplateNavigationUniversal();
  }
}

class PageInfo service {
  $pageInfo['categories'] = Title->getCategories();
}

class UserInfo service {
  TalkPageNotificationManager->userHasNewMessages();
  $userInfo['newtalk'] = [ .. ];
  // move logic from Skin->getNewtalks();
  // replace Skin->getNewtalks() with return $userInfo['newtalk'].
}

class Echo\Hooks {
  function onUserInfo() {
    $userInfo['echo-messagecount'] = MWEchoNotifUser->getMessageCount();
  }

  function onSkinTemplateNavigationUniversal() {
    $userInfo['echo-messagecount'];
  }
}

What this task is not

This task does not change the architecture of MediaWiki or the skin system. It also does not change how MediaWiki is used or deployed at WMF.

This work covered by this task resolves long-standing technical debt in MediaWiki for the purposes of improved rendering performance and reduced overall complexity and maintenance cost for the skin system – whilst remaining fully backwards compatible.

The intended result is improved performance of rendering page views in MediaWiki core, both out-of-the-box and at WMF.

The works covered by this task should in my opinion be considered a prerequisite for the following proposals. Some of the below might be possible without this task, although I believe doing so would incur significant amounts of maintenance overhead and technical debt that can and should be avoided (through this task).

  • Add a way for MediaWiki to render wiki page, actions, and special pages without a skin. – T114596
  • Re-implement MediaWiki skin rendering for WMF-deployed skins in a micro-service that runs near the CDN. – T111588
  • Re-implement MediaWiki skin rendering for WMF-deployed skins in a way that can be run offline and client-side using service workes. – T106099.

Related Objects

StatusSubtypeAssignedTask
OpenNone
ResolvedNone
ResolvedJdlrobson
ResolvedSpikeJdlrobson
ResolvedAmmarpad
ResolvedJdlrobson
ResolvedAddshore
ResolvedJdlrobson
ResolvedAmmarpad
ResolvedPwirth
ResolvedAmmarpad
ResolvedKrinkle
ResolvedKrinkle
ResolvedJdlrobson
ResolvedJdrewniak
Resolved Niedzielski
OpenNone
OpenNone
OpenNone

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 428856 merged by jenkins-bot:
[mediawiki/core@master] skins: Move default style modules to getDefaultModules

https://gerrit.wikimedia.org/r/428856

Change 460406 had a related patch set uploaded (by Krinkle; owner: C. Scott Ananian):
[mediawiki/core@master] WIP: hard-deprecate adding modules from BeforePageDisplay hook

https://gerrit.wikimedia.org/r/460406

Change 485998 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/skins/Vector@master] Start extracting rendering from PHP into Mustache

https://gerrit.wikimedia.org/r/485998

Change 485998 merged by jenkins-bot:
[mediawiki/skins/Vector@master] Start extracting rendering from PHP into Mustache

https://gerrit.wikimedia.org/r/485998

Change 485998 merged by jenkins-bot:
[mediawiki/skins/Vector@master] Start extracting rendering from PHP into Mustache https://gerrit.wikimedia.org/r/485998

(Patch was reverted in https://gerrit.wikimedia.org/r/c/mediawiki/skins/Vector/+/486874)

Change 544378 had a related patch set uploaded (by Krinkle; owner: Derick Alangi):
[mediawiki/core@master] skins: Deprecate getDynamicStylesheetQuery, makeI18nUrl and makeNSUrl

https://gerrit.wikimedia.org/r/544378

Change 544378 merged by jenkins-bot:
[mediawiki/core@master] skins: Deprecate getDynamicStylesheetQuery, makeI18nUrl and makeNSUrl

https://gerrit.wikimedia.org/r/544378

Convert one Skin (e.g. Vector) to become a template, as example. (Note, there is no pressure to remove support for PHP-based skins, the only requirement is that it can handle responses in front of the page cache instead of behind it, e.g. using data properties instead of run-time database queries and WikiPage methods calls).

@Krinkle is this done? As we go into 1.36 we'll have Vector using SkinMustache and the method Skin::getTemplateData will be providing all the relevant data for rendering the associated template. I'm curious if there's any tangible tasks you'd like to work on over the next year

Make data flow one-direction between OutputPage and Skin.
Convert Skin hooks.

I'm very curious about these 2 bullet points and whether there are any actionables here right now that I can make a case for as we work on desktop refresh.

Convert one Skin (e.g. Vector) to become a template, as example. […]

@Krinkle is this done? As we go into 1.36 we'll have Vector using SkinMustache and the method Skin::getTemplateData will be providing all the relevant data for rendering the associated template. I'm curious if there's any tangible tasks you'd like to work on over the next year

I think from a high-level this is basically done. Making SkinMustache a stable API and updating Vector to use it is is essential for the mid-long term to reduce maintenance cost for everyone and for other skins to enjoy the same benefits. But for the purpose of this task I think we can consider that part done.

Make data flow one-direction between OutputPage and Skin.
Convert Skin hooks.

I'm very curious about these 2 bullet points and whether there are any actionables here right now that I can make a case for as we work on desktop refresh.

The task description elaborates on these a bit:

Task description:
  1. Data flow in one direction only. No more two-way communication between Skin and OutputPage. OutputPage injects all relevant data (maybe some of it lazy-computed) into the Skin class. Skin class may combine it with User info and transform properties (renaming, deriving, etc.), and then invokes the Skin template with it.
  2. Skin hooks must change from being "callables with global context returning uncacheable raw html" to instead be "callables that return cacheable HTML or template partials". They can still vary by user and page (e.g. "What links here", "My contributions"), but the decision must be made in Mustache syntax using only the available user/page information. This way it can be rendered in a different server process, or even client-side, and still work and have all the information (T106099). It also means that on page views, all available meta data is naturally already fetched and cacheable. If extension hooks require additional information, they can do so, but must use hook behind the cache, not in front of it. E.g. to do their query when the page cache blob is generated. Not on-demand for every page view.

For the data flow issue, you can uncover some actionables by stepping through (e.g. statically by reading code) Skin::execute and looking for any method calls along the way that aren't injected template data and aren't trivial pure-function helper methods.

For each of those things we can can determine:

  • … whether it's needed e.g. perhaps a simpler approach can work. As an example, the skin is currently querying the database for page metadata for Wikipedia:About, Wikipedia:General_disclaimer, and Wikipedia:Contact_us on every page view. It uses this information to decide whether to style the link as a redirect, or with the user's stub threshold, or as red link. This is due to all of LinkRenderer being pulled in. It's a product decision I suppose, but if you (plural) decide that it's okay for these to just always render as standard blue links (as they do today) this would remove a significant obstacle to making skins more stateless. We can't of course take that easy path on everything. There'll be a few hard problems to solve.
  • … whether they can be computed based solely on information already known to OutputPage or cached ParserOutput for the current page, in which case it should be retreived from there directly.
  • … whether it's something we could trivially expose as part of OutputPage and/or precomputed through ParserOutput. For example, right now the Skin performs its own ad-hoc database query to find out when the page was last modified. This seems like a common enough need that the skin shouldn't have to know how to get this. Even if we end up with a mechanism that is fragmented by skin to precompute some skin-speciifc data, this seems like it is common enough not to need to go through that mechanism.
  • … if it's none of the above, then move it its computation logic to a central place within the Skin (sub) class, e.g. a method not unlike prepareQuickTemplate but that returns a simple array for internal use. For example, if lastModified were a unique need custom to Vector, we'd move it to that new method so that it's easier to deal with long-term and to find what these use cases are then port them to something potentially cached on a per-page bases alongside parser cache or something.

I created a prototype to explore remaining issues with an API driven frontend https://www.mediawiki.org/wiki/Skin:Alexandria
More later.

Wrote up https://phabricator.wikimedia.org/phame/post/view/243/creating_a_vue.js_based_skin_with_server_side_rendering/

Untagging from the core skin architecture project as it's a bit too vague for me. If there's any tasks that can work towards this goal feel free to create them as subtasks with that tag.

Krinkle renamed this task from Prepare MediaWiki for API-driven frontend to Achieve predictable MediaWiki routing and cacheable skin data.Jul 20 2022, 11:18 PM
Krinkle updated the task description. (Show Details)
Krinkle removed subscribers: Mholloway, GWicke.

If this doesn't at this point meet the criteria of MediaWiki-Core-Skin-Architecture then that tag should be deleted or renamed.