Add support for derived MCR slots
Closed, Resolved, Public

Description

From https://www.mediawiki.org/wiki/Multi-Content_Revisions/Derived_slots:

Derived slots are an addition to MCR that would allow information derived from the content of a page (or, more precisely, from the content of the slots of a revision) to be stored alongside that content (as part of the same revision), even if the derived content is generated asynchronously or updated later.

Derived slots would work much like regular slots, with a few important differences:

  • Their size is not included in the revision size, and their hash does not contribute to the revision's hash.
  • They can be updated at any time, using a new updateRevision() method that lives alongside saveRevision().
  • Updating a derived slot is a destructive operation: the previous content of that slot (on the same revision) is lost.
  • Updating derived slots is transparent to users. No entry is generated in the revision history, in RecentChanges, or on the watchlist. The update is a purely technical operation, not an event from the perspective of the user.
  • However, updating a derived slot of the current revision of a page would cause the page to be re-rendered (perhaps this should be optional) and derived page data (such as entries in the links tables) to be regenerated, similar to the way pages are re-rendered when a template changes.
  • Derived slots should not show up in diff views, at least not by default. The purpose of a diff view is to show what a user changed.
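
The first four properties in this list can be illustrated with a small toy model. This is only a sketch in Python, not MediaWiki code: the classes Slot, Revision and RevisionStore, their fields, and the way the hash is computed are all assumptions made for illustration. Only the method names save_revision and update_revision are modelled on saveRevision and updateRevision() from the description.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Slot:
    role: str
    content: str
    derived: bool = False  # hypothetical flag marking a derived slot


@dataclass
class Revision:
    rev_id: int
    slots: dict = field(default_factory=dict)  # role -> Slot

    def _primary_slots(self):
        # Derived slots are skipped, so they affect neither size nor hash.
        return [s for _, s in sorted(self.slots.items()) if not s.derived]

    @property
    def size(self) -> int:
        return sum(len(s.content.encode("utf-8")) for s in self._primary_slots())

    @property
    def sha1(self) -> str:
        # Illustrative hashing scheme only; not how MediaWiki combines slot hashes.
        h = hashlib.sha1()
        for s in self._primary_slots():
            h.update(s.role.encode("utf-8") + b"\0")
            h.update(s.content.encode("utf-8") + b"\0")
        return h.hexdigest()


class RevisionStore:
    def __init__(self):
        self.revisions = {}
        self.history = []  # user-visible events (stand-in for history/RecentChanges)

    def save_revision(self, slots) -> Revision:
        # A regular save creates a new revision and a user-visible event.
        rev = Revision(len(self.revisions) + 1, {s.role: s for s in slots})
        self.revisions[rev.rev_id] = rev
        self.history.append(rev.rev_id)
        return rev

    def update_revision(self, rev_id: int, slot: Slot) -> None:
        # Only derived slots may be changed on an existing revision.
        existing = self.revisions[rev_id].slots.get(slot.role)
        if not slot.derived or (existing is not None and not existing.derived):
            raise ValueError("only derived slots can be updated in place")
        # Destructive: the previous content of this slot is overwritten,
        # and no entry is added to the history.
        self.revisions[rev_id].slots[slot.role] = slot


store = RevisionStore()
rev = store.save_revision([Slot("main", "Hello world")])
size_before, hash_before = rev.size, rev.sha1

store.update_revision(rev.rev_id, Slot("ocr", "first pass", derived=True))
store.update_revision(rev.rev_id, Slot("ocr", "second pass", derived=True))
```

After the two updates, the revision's size and hash are unchanged, the history still contains a single entry, and only the latest content of the "ocr" slot remains. Re-rendering and diff behaviour are not modelled here.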

Event Timeline

Change 669277 had a related patch set uploaded (by Cicalese; owner: Cicalese):
[mediawiki/core@master] Add support for derived MCR slots

https://gerrit.wikimedia.org/r/669277

Change 669277 merged by jenkins-bot:
[mediawiki/core@master] Add support for derived MCR slots

https://gerrit.wikimedia.org/r/669277

This sounds really cool, but I am having trouble seeing what the use case for it is. What kind of data would one put in a "derived slot"? And since it is programmatically generated, why not handle it in the parser and its cache?

> This sounds really cool, but I am having trouble seeing what the use case for it is. What kind of data would one put in a "derived slot"?

One possible application would be a "blame map"; another would be OCR text for uploaded media. The application that led to this feature being implemented was semantic tagging of sections.

> And since it is programmatically generated, why not handle it in the parser and its cache?

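The parser cache is only available for the most recent version of the page. If the data in question is only needed for the latest version, the parser cache would be sufficient if it were generalized a bit (T227776). Derived slots, by contrast, are stored as part of each revision, so the data also remains available for older revisions.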