
TDMP DR: Provide for asynchronously-available MediaWiki parser content fragments / components
Open, Needs Triage, Public

Description

Decision Statement Overview

What is the problem or opportunity?

MediaWiki's core feature is a content framework through which end users can invoke features inline within the "wikitext" markup syntax; these invocations trigger the Parser to include content in the HTML output, or to change the behaviour of the page in some other manner. This is currently entirely synchronous. We would like the ability to add asynchronous content fragments to the parser output. Any changes we make should be backwards-compatible, requiring no changes from teams with existing uses of the parser system.
Background:

Over the years, we at Wikimedia have used the current system to great effect, both within MediaWiki itself, such as through categories, content transclusion ('templates'), and render-time-evaluated 'parser' functions, and through MediaWiki extensions and, via them, server-executed code, such as embedded video playback, graph rendering (presently decommissioned), and Lua scripting.

However, all content inclusions are currently synchronous: they are either injected into the content at parse/render time, or take the form of an HTML link to media. The second option is evaluated by the consumer's browser at read time, and so allows extensions to point to an item that potentially doesn't exist, or at least doesn't exist yet. The most high-profile use of this quasi-asynchronous inclusion feature is in the TimedMediaHandler extension, which queues up video transcoding on upload. Users can add wikitext to embed the video before it becomes available to view, and whenever the page is rendered the embed is re-evaluated to check whether the target now exists. Such behaviour is not possible for injected content fragments, however.
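
As a rough, language-agnostic illustration (not actual TimedMediaHandler or MediaWiki code; all names below are invented), the link-style embed amounts to an availability check that is simply repeated on every render:

```python
from dataclasses import dataclass, field


@dataclass
class MediaStore:
    """Stand-in for the transcode output store; purely illustrative."""
    available: set = field(default_factory=set)


def render_video_embed(store: MediaStore, title: str) -> str:
    # The same wikitext produces a placeholder before the transcode exists
    # and a player afterwards, because the check is redone at each render.
    if title in store.available:
        return f'<video src="/media/{title}"></video>'
    return f'<span class="transcode-pending">Transcoding {title}…</span>'


store = MediaStore()
print(render_video_embed(store, "Example.webm"))   # placeholder
store.available.add("Example.webm")                # transcode finished
print(render_video_embed(store, "Example.webm"))   # real embed
```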

The two main performance-related limiting features within the parser are a limit on the size of the input wikitext (2 MiB) and a limit on the number of template invocations on a page (10,000). Later, a scripting augmentation (the Scribunto extension) was implemented so that a single invocation can replace many template calls, making the 10,000-invocation limit go further.

For Wikifunctions, we would like to change this. Wikifunctions will allow for inline invocation of function calls on Wikimedia wikis' pages, using code defined on Wikifunctions and executed on the back-end service, with the result spliced into the page at the invocation point. We would like function execution to be triggered asynchronously on page parse, or when the function or its inputs are updated, with content fragments injected when they become available, and with stale fragments or, potentially, placeholders shown in their place where necessary.

For a worked example, a page might use a Wikifunctions invocation to fetch the population of a city over time from Wikidata, calculate the trend, and turn that into some prose. Wikitext of {{#function|Z12345|en|Q12345|5 years}} might result in the content fragment "The city has an official population of 5.4 million, up by 1.4% over the five years from 2014–2019.". When the result changes, because an input was changed (e.g. from 5 years to 10), because the Wikidata item is updated (e.g. to add a new value), or because the code implementing the function was changed (e.g. fixed to support time spans of centuries), the rendered version of the article would continue to include the old version of the fragment for a few seconds, until the new one was available to replace it.
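
A minimal sketch of one possible shape for that lifecycle, assuming nothing about the eventual design; FragmentCache, fragment_key, and the job-queue plumbing are invented for illustration only:

```python
import hashlib


class FragmentCache:
    """Illustrative fragment cache; not a real MediaWiki component."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def put(self, key, html):
        self._store[key] = html


def fragment_key(function_id: str, args: list[str]) -> str:
    # The key covers the function and its inputs, so changing an input yields
    # a new key; a change to the function's implementation would instead have
    # to purge (or version) the existing entries for that function.
    raw = "|".join([function_id, *args])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def render_invocation(cache, job_queue, function_id, args):
    key = fragment_key(function_id, args)
    cached = cache.get(key)
    if cached is not None:
        return cached  # possibly stale; replaced once the new result lands
    job_queue.append((key, function_id, args))  # evaluated asynchronously
    return f'<span class="wikifunctions-pending" data-key="{key}">…</span>'


# A worker would later run the function, call cache.put(key, result_html),
# and trigger replacement of the placeholder / stale fragment in the page.
cache, queue = FragmentCache(), []
print(render_invocation(cache, queue, "Z12345", ["en", "Q12345", "5 years"]))
```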

The team does not have a particular solution in mind for how to build this, and is looking to work with partner teams to explore possible performant, sustainable, roadmap-aligned ways by which to achieve this aim.

What does the future look like if this is achieved?

For Wikifunctions, in-page calls would be executed asynchronously without blocking the initial page render. The return values would be cached and injected into the page after it is generated (or later in the synchronous generation if the responses are fast enough, e.g. less than 100ms). Users could add a number of function calls without hitting arbitrary limits. Pages using too many function calls would degrade gracefully to readers with a placeholder, rather than failing to load.
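
Sketched very roughly, and using a threaded worker purely for illustration, the "use the result if it arrives within the synchronous budget, otherwise fall back to a placeholder" behaviour could look like this (the 100 ms figure is the one mentioned above):

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

SYNC_BUDGET_SECONDS = 0.1  # roughly the 100 ms mentioned above


def render_call(executor, call, placeholder_html):
    """Return (html, pending_future); pending_future is None on the fast path."""
    future = executor.submit(call)
    try:
        # Fast (e.g. already-cached) responses still make the initial render.
        return future.result(timeout=SYNC_BUDGET_SECONDS), None
    except TimeoutError:
        # Slow calls keep running; the placeholder is swapped out when done.
        return placeholder_html, future


with ThreadPoolExecutor(max_workers=4) as pool:
    html, pending = render_call(pool, lambda: "5.4 million",
                                '<span class="pending">…</span>')
    print(html)  # "5.4 million" here, since the fake call returns instantly
```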

More widely, beyond the work of the Abstract Wikipedia Team itself, there are several potential forms of asynchronous content creation, from which other features and extensions could benefit:

Asynchronous content creation could theoretically also be extended to template calls, but we understand that such a feature re-use would first need the "balanced templates" pre-TDMP RfC to be implemented to avoid secondary impacts, which itself is awaiting the final replacement of the legacy Parser with Parsoid RfC.

Another example would be Wikidata query result pages, which often take up to a minute to run. These could be integrated 'live' from the query endpoint, rather than having wikitext pages updated by a user-maintained bot running on a cron job.

Most wide-rangingly and longer-term, this would be a big step towards giving MediaWiki the capability to asynchronously compose content in general. This would be a major enabler for combining the output from multiple, event-driven systems.

What happens if we do nothing?

Calls to Wikifunctions would be synchronous, and would thus slow down the generation of pages, making the sites that use Wikifunctions slower. Some function calls would hit the cache, but others wouldn't, giving users inconsistent performance. On some pages on e.g. the English Wikipedia, logged-in users would sometimes have to wait multiple seconds for the page to load.

One alternative route to address this need would be a large capital outlay on additional production servers, but this would be very expensive, would unnecessarily increase our environmental impact, and would only support relatively limited uses.

For those wikis where Wikifunctions was deployed, we would have to impose very strict time limits on the kinds of function calls users could add to a page, reducing the value to editors and thus readers by limiting complexity to the more trivial cases.

This would be an unacceptable performance outcome, and would effectively prevent the production enablement of Wikifunctions calls, and of the future Abstract Wikipedia service.

The opportunity for communities to share functions between them, replacing local Lua modules, would be lost, so smaller communities would either have to continue to invent their own, remember to copy other wikis' features, or never know of the options available to them.

Any additional background or context to provide?

Abstract Wikipedia planning home page

Event Timeline


A somewhat similar problem is how to include metadata about the content in the page HTML when that metadata needs to be calculated in not-quite-real-time. An example is T213505: RfC: OpenGraph descriptions in wiki pages, but there are all kinds of other potential use cases, e.g. around machine learning (such as altering page presentation based on the ORES rating of the revision). This doesn't involve the parser, which makes it much simpler, but one shared aspect of the two problems is the need to refresh the edge cache (and MediaWiki's HTML cache, if enabled) when the final HTML has been produced. It would be nice if the mechanism chosen for that were sufficiently generic.
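
A generic sketch of that shared aspect, with the backend interface and names invented here rather than taken from MediaWiki: whatever the slow computation is, the completion step is "purge every cache layer holding HTML for the affected pages".

```python
from typing import Protocol


class PurgeBackend(Protocol):
    """Anything that can invalidate cached HTML for a URL (edge cache,
    MediaWiki's HTML/parser cache, ...). Illustrative interface only."""
    def purge(self, url: str) -> None: ...


def on_delayed_content_ready(page_urls: list[str],
                             backends: list[PurgeBackend]) -> None:
    # The same call works whether the delayed piece was a parser fragment,
    # OpenGraph metadata, or an ORES-derived presentation hint.
    for url in page_urls:
        for backend in backends:
            backend.purge(url)
```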

I know of two existing features that could benefit from this:

  • <math> tags (Math extension), when using MathML rendering. The actual rendering happens in a separate service, contacted via the command line or via HTTP requests. Currently the extension waits on them synchronously, and, to improve performance slightly, does weird hacky stuff to batch them (which causes bugs when an incompletely parsed result is used, e.g. T242327).
  • Image thumbnails, when using $wgInstantCommons. Instant Commons uses synchronous HTTP requests to fetch image dimensions from Commons, and relies on caching to only be excruciatingly slow the first time they're fetched (and the caching is currently broken: T235551). (A rough sketch of batching these external round-trips in parallel follows this list.)
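
For both items, a hedged sketch of what "do the external round-trips in parallel instead of one synchronous call each" could look like; render_remote is a stand-in for the Math or thumbnail service request, not a real API:

```python
from concurrent.futures import ThreadPoolExecutor


def render_remote(tex: str) -> str:
    # Stand-in for the HTTP request to the external rendering service.
    return f"<!-- MathML for {tex} -->"


def render_all(tags: list[str]) -> dict[str, str]:
    # All tags on the page are rendered concurrently; the parse only waits
    # once, for the slowest request, instead of once per tag.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return dict(zip(tags, pool.map(render_remote, tags)))


print(render_all([r"\sqrt{2}", r"e^{i\pi} + 1 = 0"]))
```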

T249419: RFC: Render data visualizations on the server (Extension:Graph / Graphoid) is also somewhat related. It's similar to the Math use case, except that the graph definition is too large to be included in the thumbnail / iframe URL, so it needs to be retrieved via a side channel, which causes lots of complications.

Heck, we'd like the main article content to be async as well, if possible. Parsoid used to generate all content *at the time an edit was saved*, ensuring that the content fetch was always fast because it was coming directly from RESTBase. As Parsoid moves into core, a key problem is how to handle edge cases where Parsoid rendering is 'too slow' to happen synchronously from the front end. The same approach (serve a placeholder within the front-end time limit, asynchronously replace it with the rendered article when complete) would help here as well. (You might notice that in some sense "render an article" is equivalent to "render an inclusion, where the included content is the article".)

This raises an obvious question about recursion: what if the async content references other asynchronous content?
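
One conceivable answer, sketched here with invented names and limits and not tied to any existing parser mechanism, is to carry the expansion chain along and refuse cycles or excessive depth:

```python
MAX_ASYNC_DEPTH = 5  # arbitrary illustrative limit


def expand(fragment_id, resolve, chain=()):
    """resolve(fragment_id) -> (html_template, child_fragment_ids)."""
    if fragment_id in chain:
        return f'<span class="error">cycle via {fragment_id}</span>'
    if len(chain) >= MAX_ASYNC_DEPTH:
        return '<span class="error">async nesting too deep</span>'
    html, children = resolve(fragment_id)
    for child in children:
        html = html.replace("{{" + child + "}}",
                            expand(child, resolve, chain + (fragment_id,)))
    return html


# A fragment that references itself degrades to an error marker.
fragments = {"A": ("before {{A}} after", ["A"])}
print(expand("A", lambda fid: fragments[fid]))
```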

Maps was another instance mentioned earlier today: apparently most of its rendering is currently "fast enough", but the case might also arise where rendering a particular map tile (etc.) might exceed the current "front end request time limit".

And that brings us back to the fundamental question which is: how do we prevent a DoS? If a page can include an arbitrary number of expensive-to-compute functions, how do we prevent one page queueing an escalating number of update requests? (Or worse, update cycles!) The proposal seems to mention that we 'cut off' rendering at some point, but I think we need to be very clear about how, given a large # of jobs of uncertain expense from a large # of users, those jobs are prioritized onto available rendering hardware -- and where the queue is ultimately cut off. (Note: we don't know how expensive a task will be before we try to execute it; we'd probably like to execute jobs in parallel but that can "break the bank" of a resource limit very quickly; multiple users may request pages all of which depend on a specific render task, how does that affect its priority; if we ultimately abort a task because it runs over a resource limit, do we 'give back' the time we spend on it to some other task on that page?)
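
To make the shape of one possible answer concrete (numbers, names, and policy entirely illustrative, not a proposal): cap how many async jobs a single parse may enqueue, and prioritize shared render tasks by how many waiting pages depend on them.

```python
import heapq

PER_PAGE_JOB_LIMIT = 50  # illustrative hard cap per parse


def admit(page_jobs):
    # Jobs beyond the cap are never queued; their placeholders simply stay,
    # so one page cannot flood the cluster with work.
    return page_jobs[:PER_PAGE_JOB_LIMIT], page_jobs[PER_PAGE_JOB_LIMIT:]


class RenderQueue:
    """Pops the task with the most waiting pages first (lazy-deletion heap)."""
    def __init__(self):
        self._heap = []      # entries: (-waiter_count, task_id)
        self._waiters = {}   # task_id -> number of pages currently waiting

    def enqueue(self, task_id):
        self._waiters[task_id] = self._waiters.get(task_id, 0) + 1
        heapq.heappush(self._heap, (-self._waiters[task_id], task_id))

    def pop(self):
        while self._heap:
            _, task_id = heapq.heappop(self._heap)
            if task_id in self._waiters:   # skip stale, lower-priority entries
                del self._waiters[task_id]
                return task_id
        return None


queued, dropped = admit([f"fn-{i}" for i in range(60)])
print(len(queued), len(dropped))   # 50 queued, 10 degrade to placeholders

q = RenderQueue()
for task in ["infobox-fn", "population-fn", "population-fn"]:
    q.enqueue(task)
print(q.pop())  # "population-fn": two pages are waiting on it
```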

Finally, I'll note that the basic async update mechanism here could enable all sorts of responsive UX. If I have a box that renders to either "yes" or "no" and I let user actions flip that status and have that newly-rendered content asynchronously propagate to pages without the user needing to force a reload, I've created a collaborative checklist, or an interactive guessing game. Even if we don't immediately *intend* to create such features, I suspect our clever users will quickly figure out how to create these sorts of UXes by abusing whatever features we do provide.

I would recommend limiting discussion of asynchronous methods of rendering to Wikifunctions specifically. This approach presents a series of technical challenges at the level of:

  • edge caching
  • thundering herd protection
  • resource starvation
  • data consistency
  • scalability (as this is more or less wikidata on steroids)

On the surface, it's hard for me to justify creating a new approach when a lot of pages, if completely uncached, already require 30–60 seconds to render. If we limit the number of Wikifunctions calls per page (something that is a requirement if we don't want it to explode in our hands, because editors, malicious or not, will abuse it) and we use strict timeouts for their calls, I think rendering a page that has just been edited or is rarely visited (and thus has no cache whatsoever) is not a problem.
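
A sketch of that synchronous-with-limits model, with the counter, numbers, and error markup invented here purely to make the shape concrete:

```python
MAX_CALLS_PER_PAGE = 100   # illustrative, not a proposed value
PER_CALL_TIMEOUT_S = 2.0   # strict per-call wall-clock budget


class FunctionCallLimiter:
    """Counts calls made by one parse and enforces a per-call timeout."""
    def __init__(self):
        self.calls_made = 0

    def invoke(self, call):
        """call(timeout) performs the (synchronous) Wikifunctions request."""
        self.calls_made += 1
        if self.calls_made > MAX_CALLS_PER_PAGE:
            return '<span class="error">too many function calls on this page</span>'
        try:
            return call(timeout=PER_CALL_TIMEOUT_S)
        except TimeoutError:
            return '<span class="error">function call timed out</span>'


limiter = FunctionCallLimiter()
print(limiter.invoke(lambda timeout: "5.4 million"))
```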

For change propagation from Wikifunctions to the main wikis, I don't see a big difference between the problem of asynchronously rendering these functions and the problem of dispatching changes from Wikidata. Admittedly, we've learned a lot from the mistakes we originally made in updating wiki pages from Wikidata changes, and we can build upon that.

I want to underline, in particular, this sentence in the decision statement overview:

Users could add a number of function calls without hitting arbitrary limits. Pages using too many function calls would degrade gracefully to readers with a placeholder, rather than failing to load

This would risk reproducing the cascading failures we had with the old RESTBase/Parsoid chain, which would, for some pages, spawn hundreds of calls to the MediaWiki API (to render Lua functions) and quickly kill the whole cluster.

Imagine a malicious editor who adds 10k Wikifunctions calls to their user sandbox, then re-renders it continuously. At best, that's a DDoS against Wikifunctions and the job queue. So, given that we will have to introduce limits on the number of functions we call from a page, I don't think that argument holds.

What I am saying is: while it's easy to see the advantages of asynchronous rendering, there are a lot of complexities to consider. I can expand on my points above if needed, but I'm unsure this is the right venue for that. My main point is: let's not take such an approach and extend it to everything right out of the gate; we should first test it with something new and, for now, small (Wikifunctions), and build on the lessons learned there, instead of immediately expanding it to more things.

Jdforrester-WMF renamed this task from Provide for asynchronously-available MediaWiki parser content fragments / components to TDMP DR: Provide for asynchronously-available MediaWiki parser content fragments / components. Dec 22 2021, 5:51 PM
  • <math> tags (Math extension), when using MathML rendering. The actual rendering happens in a separate service, contacted via the command line or via HTTP requests. Currently the extension waits on them synchronously, and, to improve performance slightly, does weird hacky stuff to batch them (which causes bugs when an incompletely parsed result is used, e.g. T242327).

@daniel and others are currently working on reimplementing parts of the "hacky stuff". I would love to see all math tags on the screen rendered in parallel. If some core component could participate in the parallelization and the caching, this would significantly simplify the Math extension codebase. I am happy to contribute, as it is a lot of work for me to maintain the current hacks. In particular:

  • I would like to get rid of calling str_replace on the page text in the ParserAfterTidy hook
  • offload the caching to core, as there are different caching mechanisms (database, RESTBase, WANObjectCache) built into the Math extension, each of which was fashionable at the time it was added (a rough sketch of such a shared core cache follows below)
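
To illustrate only (CoreFragmentCache and its keying are invented here, not an existing or planned core API), "offloading the caching to core" could mean a single cache keyed on the rendering input, replacing the extension-specific layers:

```python
import hashlib
import json


class CoreFragmentCache:
    """Hypothetical shared cache; not an existing MediaWiki component."""
    def __init__(self):
        self._store = {}

    def get_or_render(self, namespace, payload, render):
        # One keying scheme for every extension: namespace + hash of the input.
        key = namespace + ":" + hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = render(payload)  # e.g. call the math service
        return self._store[key]


cache = CoreFragmentCache()
html = cache.get_or_render("math", {"tex": r"\sqrt{2}", "mode": "mathml"},
                           lambda p: f"<!-- MathML for {p['tex']} -->")
print(html)
```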

I hope people agree, and it fits (partly) into the scope of this ticket.