[RFC] Multi-Content Revisions
Open, Normal, Public

Description

(proposal rewritten 2016-09-05)

The idea of this RFC is to allow multiple Content objects to be associated with a single revision (one per "slot"), resulting in multiple content "streams" for each page. The "main" slot is reserved for the primary content of the page (that is, for what is currently considered the content of the page).
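
For illustration, here is a minimal sketch of that page model (the class and field names are invented for this sketch, not MediaWiki's actual ones; the design document defines the real schema):

```python
from dataclasses import dataclass

# Invented names, for illustration only: one revision holds several content
# "slots", keyed by role, with "main" reserved for the primary page content.
@dataclass
class SlotContent:
    model: str  # content model, e.g. "wikitext" or "json"
    data: str   # the serialized content blob

@dataclass
class Revision:
    rev_id: int
    slots: dict  # slot role -> SlotContent; "main" is always present

rev = Revision(
    rev_id=12345,
    slots={
        "main": SlotContent("wikitext", "== Heading ==\nArticle text..."),
        "categories": SlotContent("json", '["Physics", "History"]'),
    },
)
assert "main" in rev.slots
```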

For details, see the MCR design document on mediawiki.org.


On 2016-09-05, @daniel moved the description of this task to mw:Multi-Content_Revisions. That page now has many subpages, described in Daniel's recent email to wikitech-l.

This is tentatively slated to be discussed in our ArchCom IRC meeting in a couple of weeks (E273, scheduled for 2016-09-21).

There's a lot to sift through. Daniel: a while back, you posted this list of questions:

Open questions:

  • Do we need to record the (primary) slots in the database, so we can enumerate them reliably? Alternatively, we could ask all relevant services for all possible slots to get an enumeration.
  • Can primary content be stored outside the blob-store model? Related: should the slot table really have blob URLs?
  • Multiple "events" per revisions may be useful: each "like", each "view", each "comment", etc. "sub-revisions"?
  • Versioning for content models (wikibase model 0.1, 0.2, 1.0, 2.0; parsoid html 1.0, 1.1, 1.2...)
  • Drop content formats?
  • Slot content: use Content interface? More basic RevisionData interface? Or allow specialized interfaces, such as ParserOutput objects?
  • Optional meta-data associated with slots (etag, etc?)? with revisions (rev-props)? with blobs (hash, size)?

Are those still the key questions? Which ones have been answered?

@Pppery While I personally really appreciate any feedback, I'm probably not the only one interested in WHY you awarded a specific token :) Would you like to explain it to us, so that your concerns can be discussed and kept in mind while updating the RFC? :)

Pppery removed a subscriber: Pppery. Sep 15 2016, 11:26 PM

My concern is that this is just part of a general trend of making things more complicated than they need to be.

daniel added a subscriber: Pppery. Sep 16 2016, 11:04 AM

@Pppery I guess what's more or less complicated is a question of perspective. Is embedding JSON-based template schemas in wikitext more or less complicated than managing them separately? Is it more or less complicated to have a separate, form-based interface for file license info, instead of nested templates?

Do you have a specific concern, or is this just a gut feeling?

It's really just a gut feeling that this is a needlessly complexifying change. There is already a separate TemplateData editor that can be accessed when you click the edit link for a template. And I'm not sure what series of nested templates for file licenses you are referring to.

daniel added a comment. Edited Sep 16 2016, 12:35 PM

@Pppery I'm referring to this mess: https://commons.wikimedia.org/w/index.php?title=File:L%C3%ADneas_de_Nazca,_Nazca,_Per%C3%BA,_2015-07-29,_DD_46.JPG&action=edit. Oh, right, the license itself is not nested, just... cryptic.

Here's an overview of the use cases for MCR: https://www.mediawiki.org/wiki/Multi-Content_Revisions#Use_Cases.

daniel moved this task from proposed to tracking on the WMDE-TLA-Team board.
Alsee added a subscriber: Alsee. Sep 24 2016, 1:08 PM

Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this? Ripping categories and templates and other stuff entirely out of the page?

Wiki operates on an extremely simple, powerful, and flexible paradigm. A page is simply a text file.

We can trivially copy-paste a page into any text editor, possibly close the browser or go offline, do anything and everything, then just paste it here and save.

This is proposing turning Wikipedia into a gigantic complex app.

A page is simply a text file.

A page is definitely not a simple text file. It's a page written in a programming language - wikitext and templates - that happens to have a textual representation. It also includes references to external content, like image inclusions, and metadata such as categories, and so on. It also happens that the wikitext representation of a page is not the only one: Parsoid has its own, which may become the main representation for the convenience of the developers. That representation can also be copy/pasted.

To be rendered, it must be processed by a compiler that turns it into HTML content and does pretty complex work, like managing the wiki categorisation index, including the templates, and so on.

This proposal, as far as I understand it, won't change the textual representation of the content one bit, nor the fact that you can copy/paste it into another page.

This is proposing turning Wikipedia into a gigantic complex app.

Open your eyes: MediaWiki is an app with well over a decade of continuous development behind it; new features are constantly added, new use cases emerge. It's already a gigantic, complex app. The only valuable question is "how do we manage such complexity", and this proposal is one of the answers.

Agreed, Alsee. I don't find any of the use cases for this very compelling. Refutations of some of the use cases:

  1. Structured Media Data: What exactly is this separating license information from? This proposed change seems like it would lose some of the flexibility in file licenses.
  2. Page Assessments: The PageAssessments extension linked seems to be being built without MCR, so I don't see why this is necessary for it.
  3. Infobox Data: Already handled via Wikidata.
  4. Template styles: That's T483.
  5. Template documentation: What is wrong with the convention of having a /doc page?
  6. Categories can already be managed as structured data via tools like HotCat.
  7. Modules can have /doc subpages just like templates, and I don't see any advantage in integrating this further.
Tgr added a comment. Sep 25 2016, 12:25 AM
  1. Structured Media Data: What exactly is this separating license information from? This proposed change seems like it would lose some of the flexibility in file licenses.

Flexibility means it is impossible to build assumptions into workflows and software. For very much the same reason that wikis use license and info templates to protect patrollers from the "flexibility" of hand-written, unstructured information, storing most of the file description page content as structured data would protect programmers' sanity when that information needs to be reused.

  5. Template documentation: What is wrong with the convention of having a /doc page?

Mainly that you are unable to preview changes. Smaller annoyances include that doc pages do not get included in exports; they make the syntax more unforgiving (MediaWiki removes end-of-page newlines, but does not remove newlines between the actual template and the documentation); and the edit mechanism is unintuitive (click on the "edit documentation" link, fix a typo, and suddenly you are on a different page).

  7. Modules can have /doc subpages just like templates

More importantly, they tend to have unit test subpages, and currently we offer no help to module editors in testing changes *before* accidentally breaking half the wiki.

brion added a comment. Sep 25 2016, 7:30 AM

I wrote up some quick thoughts at https://www.mediawiki.org/wiki/User:Brion_VIBBER/MCR_alternative_thoughts

Mainly exploring along two lines:

  • what if we did a model with separate data tables for each new 'slot', instead of a common content-blob interface (possibly more in line with Jaime's thoughts, possibly different)?
  • what if we went all in on using subpages - what would it take to support that?

The first would be in some ways similar to the MCR model, with stricter typing and possible benefits in storage and schema consistency, etc., but without the conveniences of the common interface for Content blobs. The second might be a much easier transition, but needs better high-level tooling and some new versioning concepts.
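
To make the contrast concrete, here is a rough sketch of the two storage shapes (all names invented; Python dictionaries stand in for database tables):

```python
# (a) Common content-blob interface: every slot is an opaque, addressable
# blob, handled uniformly regardless of content type.
blob_store = {}  # blob address -> serialized bytes
slot_index = {}  # (rev_id, slot role) -> blob address

def save_slot(rev_id, role, data):
    addr = "local:{}:{}".format(rev_id, role)
    blob_store[addr] = data
    slot_index[(rev_id, role)] = addr

# (b) A dedicated, typed table per slot kind: e.g. categories stored as
# structured rows, giving schema-level consistency but no shared interface.
category_rows = []  # (rev_id, category name) pairs

def save_categories(rev_id, categories):
    category_rows.extend((rev_id, name) for name in categories)

save_slot(7, "media-info", b'{"license": "CC-BY-SA-4.0"}')
save_categories(7, ["Physics", "History"])
```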

May be worth fleshing these out or combining some ideas just to brainstorm a bit.

Alsee added a comment. Sep 26 2016, 4:05 AM

My apologies, my intent wasn't to try to prove a case against MCR here. (Although I do understand why replies focused in that direction). Perhaps it would help if I shortened my previous comment:

Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this?

Is there anyone who believes that point requires debate?

Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this?

Each of the use cases has had quite a bit of discussion, and quite a bit of investigation by the people proposing it. For example, changing the way that licensing and media metadata is stored is documented here: https://commons.wikimedia.org/wiki/Commons:Structured_data

Many of the other use cases have similarly deep analysis with similarly long histories.

In general, the work that Daniel and many other people are doing at WMDE is work that affects core infrastructure, so there's not one single editing community we can ask. It also affects people outside the Wikimedia editing community. It has the same sort of complexity as interlanguage links had before Wikidata came along. Discussing it in the context of MediaWiki.org seems like a sensible place to put an RFC, and thus that's why we're discussing it here now.

Now, it would seem as though you are bringing this point up now because you're worried about making the system more complicated. Yes, that seems like a reasonable fear. A multi-slot "revision" seems similar to a file system fork, and will inevitably come with the same complexity. I'm eager to see how we make the case that this complexity is worth it.

@RobLa-WMF wrote

Now, it would seem as though you are bringing this point up now because you're worried about making the system more complicated. Yes, that seems like a reasonable fear. A multi-slot "revision" seems similar to a file system fork, and will inevitably come with the same complexity. I'm eager to see how we make the case that this complexity is worth it.

I think little of that complexity should be exposed to users. We probably don't want editors to freely mix and match slots - rather, we want an integrated experience for editing and display. Ideally editors should neither know nor care about slots.

What I take away from @Alsee's comment is that we should provide a more comprehensive and detailed overview of the use cases. It is however important to recognize that even if we implement MCR, this only gives us the *option* to manage things a different way. MCR itself changes nothing about how categories are stored - it just provides a sensible place outside the wikitext where they can be stored. MCR is designed to add a degree of freedom to MediaWiki which allows us to implement new features, and (perhaps more importantly) allows some features that have been hacked in in the past to work much more efficiently, smoothly, and in a more user-friendly way.

One more thing about the simplicity of wikitext: wikitext isn't simple at all. It's versatile and powerful, but if you use it for anything but formatting text, it becomes rather complex and scary. The idea behind MCR is to use wikitext for formatting text, and move other data elsewhere, where it can be stored, edited, diffed, and rendered more efficiently and nicely.

I think little of that complexity should be exposed to users. We probably don't want editors to freely mix and match slots - rather, we want an integrated experience for editing and display. Ideally editors should neither know nor care about slots.

I think I agree with you, but you say this in a way that sounds dangerous.

The risk: the more that our data formats become a complex mystery that is only understood by a handful of people, the fewer people that will trust the systems we produce.

It's true that ideally, editors should not need to understand the underlying formats. We should create systems that are easy for both humans and computers to understand and manipulate. If we do this, we'll provide the ability to create user interfaces that behave intuitively. Advanced editors will learn the underlying model, and will be able to intuitively grasp the nature of the inevitable problems we'll have with the systems we build. They will also understand how to explain those problems to less advanced editors.

However, the more we try to hide the underlying storage format, the less the most active editors will trust the systems we produce. Let's make sure that we come up with a system where it is easy to explain what a revision is at the byte level.

Tgr added a comment. Sep 27 2016, 4:05 AM

I think little of that complexity should be exposed to users. We probably don't want editors to freely mix and match slots - rather, we want an integrated experience for editing and display. Ideally editors should neither know nor care about slots.

That probably works for editors but not for patrollers. I.e. we can keep the editing interface as it is (there would have to be a non-JS fallback with a text field for each slot, but it does not have to be the default, even for non-JS users), but history will need some changes (it has to expose edits which do not change the main content, and probably add some filtering tools to handle that), and the diff view will have to expose the slots. That might be worth a discussion.

The risk: the more that our data formats become a complex mystery that is only understood by a handful of people, the fewer people that will trust the systems we produce.

Ah, yes, I agree. The structure of our content should be clearly defined and easy to grasp for interested people. That structure will become slightly more complex with MCR, since we add a level of indirection. On the plus side, the data formats used to represent things like categories or page assessments or license information will become a lot clearer and easier to understand and re-use.

However, the more we try to hide the underlying storage format, the less the most active editors will trust the systems we produce. Let's make sure that we come up with a system where it is easy to explain what a revision is at the byte level.

Yes, right - the system needs to remain transparent, and that's how MCR is designed. My point was that it should not be necessary to know about this in order to edit. People add tags on Flickr without having to think about the underlying storage structure, or learn arcane syntax. It should be the same with MediaWiki.

That probably works for editors but not for patrollers. I.e. we can keep the editing interface as it is (there would have to be a non-JS fallback with a text field for each slot, but it does not have to be the default, even for non-JS users), but history will need some changes (it has to expose edits which do not change the main content, and probably add some filtering tools to handle that), and the diff view will have to expose the slots. That might be worth a discussion.

Yes, at least in diffs, slots will be a visible concept. For history, watchlist, recentchanges, etc, filtering by slot may be useful, but otherwise I don't think it's necessary to expose the concept of slots there.

You are right that this aspect could use some more thought and discussion. The best place for this is the talk page of https://www.mediawiki.org/wiki/Multi-Content_Revisions/Views I think.

The risk: the more that our data formats become a complex mystery that is only understood by a handful of people, the fewer people that will trust the systems we produce.

Ah, yes, I agree. The structure of our content should be clearly defined and easy to grasp for interested people. That structure will become slightly more complex with MCR, since we add a level of indirection. On the plus side, the data formats used to represent things like categories or page assessments or license information will become a lot clearer and easier to understand and re-use.

Well, the "lot clearer" assertion remains to be seen. I think the current proposal still seems like an enormous change. I'm starting to wrap my head around it, but I can't fault many skeptics for questioning whether this represents a "minimum viable product". I realize there are many use cases, but what single use case would you consider your single must-have use case for an MVP?

Pppery removed a subscriber: Pppery. Sep 28 2016, 4:38 PM
daniel added a comment. Edited Sep 28 2016, 5:16 PM

Well, the "lot clearer" assertion remains to be seen. I think the current proposal still seems like an enormous change. I'm starting to wrap my head around it, but I can't fault many skeptics for questioning whether this represents a "minimum viable product". I realize there are many use cases, but what single use case would you consider your single must-have use case for an MVP?

You are right that it is a big change, both conceptually and technically. I'm doing my best to minimize the cost, but it's not trivial.

To me it seems like the cost is justified because MCR would address the needs of several use cases. For a single use case, it would perhaps not be justified, and a more specialized solution would be sufficient. But a specialized solution for each use case would be a lot more expensive, and would introduce a lot more complexity. The idea is that adding a layer of abstraction - MCR - will allow such use cases to be implemented with a minimum of extra code.

It's about scalability of the platform when adding features. Compare: TCP isn't great because it serves a specific use case particularly well, but because it serves a large number of use cases reasonably well, by adding a generalized abstraction layer for flow control on top of IP. Similarly, MCR aims to add a degree of freedom to MediaWiki's page model, which should serve a number of use cases quite well, in that it lowers the complexity of their implementation significantly.

To allow the supposed benefit of MCR to be assessed and verified, we should define the requirements for the MVP for each must-have use case. If we find significant overlap in the platform needs of several use cases, a generalized solution like MCR is justified. The requirements for that generalized solution can then be derived directly from the platform needs of MVPs.

I have done the above informally in conversations with WMF product owners and developers over the last year, but I admit that this is not documented sufficiently. We (Lydia and I) are in the process of reaching out to WMF product owners, asking them to provide more detailed requirements, rationales, and priorities for their use cases, and we plan to document them on a subpage of https://www.mediawiki.org/wiki/Multi-Content_Revisions.

To me as a Wikidata developer, the "killer use case" is structured media info, but e.g. James, Mark, or Kaldari may have other priorities. The Wikidata team will provide a brief summary of the requirement and rationale for structured media info soon, but to get it right, we want to coordinate with the WMF multimedia team first.

(Please note that I'm out of office until October 24; I'll be working some of the time, but I will be traveling and attending a conference)

To me as a Wikidata developer, the "killer use case" is structured media info, but e.g. James, Mark, or Kaldari may have other priorities. The Wikidata team will provide a brief summary of the requirement and rationale for structured media info soon, but to get it right, we want to coordinate with the WMF multimedia team first.

I may update the description of this task and of the RFC on mediawiki.org to say this. This answer isn't etched in stone, but when someone asks me "what is the MVP for Multi-Content Revisions", I'll say "structured media info". I'm not sure which URL I'll point them to, but I'm sure I'll find something.

(Please note that I'm out of office until October 24; I'll be working some of the time, but I will be traveling and attending a conference)

Thanks for reminding us of this. You're obviously the primary contact from WMDE for this, but who is the product manager from WMDE whose work would be blocked if this is delayed? Is that @Lydia_Pintscher or someone else?

Tgr added a comment. Sep 28 2016, 7:29 PM

It might be helpful to split the use cases into ones where MCR is nice to have and those which need it. As I understand it, there are roughly three groups:

  • data that would otherwise be stored on separate pages but could be bundled into a single page for better UX: media info, doc subpages, {{/header}} and similar templates, maps JSON blobs etc. This is mostly "nice to have" territory although in the case of media info (some of which will have to be manually migrated from description page templates) the UX degradation would be pretty jarring so that might be closer to must have.
  • data that is currently stored on multiple pages but needs atomic updates to ensure consistency (gadget CSS/JS, template styles, template/module test pages). MCR is needed to make those behave correctly.
  • supplementary data that is used by some tool (editor, mobile app etc) and not really intended for direct manual editing: lead image focus, structured categories, page assessments, maps. These would have to be stored somewhere else, which would be a major loss of efficiency for developers as they would have to rebuild fundamental infrastructure from scratch for each one.

Thanks for reminding us of this. You're obviously the primary contact from WMDE for this, but who is the product manager from WMDE whose work would be blocked if this is delayed? Is that @Lydia_Pintscher or someone else?

Yes, it's me.

It might be helpful to split the use cases into ones where MCR is nice to have and those which need it. As I understand it, there are roughly three groups:

  • data that would otherwise be stored on separate pages but could be bundled into a single page for better UX: media info, doc subpages, {{/header}} and similar templates, maps JSON blobs etc. This is mostly "nice to have" territory although in the case of media info (some of which will have to be manually migrated from description page templates) the UX degradation would be pretty jarring so that might be closer to must have.

I would argue it is a must-have. We can technically do it across several pages, but the chance of getting it accepted by the community with the degraded usability and features is close to 0.

  • data that is currently stored on multiple pages but needs atomic updates to ensure consistency (gadget CSS/JS, template styles, template/module test pages). MCR is needed to make those behave correctly.
  • supplementary data that is used by some tool (editor, mobile app etc) and not really intended for direct manual editing: lead image focus, structured categories, page assessments, maps. These would have to be stored somewhere else, which would be a major loss of efficiency for developers as they would have to rebuild fundamental infrastructure from scratch for each one.

I may update the description of this task and of the RFC on mediawiki.org to say this. This answer isn't etched in stone, but when someone asks me "what is the MVP for Multi-Content Revisions", I'll say "structured media info". I'm not sure which URL I'll point them to, but I'm sure I'll find something.

https://commons.wikimedia.org/wiki/Commons:Structured_data is the best we have atm.

daniel added a comment. Edited Sep 28 2016, 7:34 PM

It might be helpful to split the use cases into ones where MCR is nice to have and those which need it. As I understand it, there are roughly three groups:

I'm missing the group "currently embedded in wikitext and would benefit from separate storage, editing, diffing, etc", e.g. page assessment, media info, categories, template schema, translation tables, ...

Did anyone consider that it might be a bad idea to start building a radical change to the editing environment without investigating whether the editing community wants this?

Each of the use cases has had quite a bit of discussion, and quite a bit of investigation by the people proposing it.

So the answer is no, no thought of investigating whether the editing community wants this.

What I take away from @Alsee's comment is that we should provide a more comprehensive and detailed overview of the use cases.

So the answer is no, no thought of investigating whether the editing community wants this.

You've got two editors who stumbled across (*) this project, both waving red flags that there may be a problem here.

The WMF has been working on a Technical Collaboration Guideline as part of the Software Development Process - in part, "establishing best practices for inviting community involvement in the product development and deployment cycle". Most development goes smoothly and everyone is happy with a lot of what the WMF develops, but there is a long history of occasional projects that result in conflict. There have been cases where the WMF believed something was obviously a good idea, but where editors had a very different perspective. The editing community may weigh the pros and cons very differently than you have.

The idea of pulling categories, templates, and other things out of the wikitext is a pretty radical change. I understand you have use-case proposals and the reasons you think they're good ideas. I'm not here to directly debate that. I'm here to alert you to the fact that this is a Big Deal. I am here to alert you that the community may have a very different perspective, and that this may be highly controversial. The proposed use cases may start evaporating if the community considers them unwanted or disruptive.

I'm saying it would be a good idea to post the template-use-case and/or category-use-case and/or others at EnWiki Village Pump to find out how it will be received. (EnWiki is nearly half the global community, you can certainly post elsewhere as well if you feel broader input is needed.)

The response could range from "we love it", to identifying must-have design requirements to support various workflows, to "hell no". Whichever way it goes, the time to get that information is before something is built.

It's not true that we have not asked the community. Structured data for Commons has been asked for many, many times. People are very happy with the progress we have made so far, as can be seen for example here: https://commons.wikimedia.org/wiki/Commons_talk:Structured_data#It.27s_alive.21 Or here: https://blog.wikimedia.org/2016/08/23/wikidata-glam/
For the Wikidata team, Multi-Content Revisions is an essential part of making structured data on Commons happen. All the other use cases are potential ones at this point. Their teams will be responsible for doing the community consultations on these as they start working on them. Whether or not they go ahead on those is independent of our need to have it for structured data on Commons. It is however important to bring them up now, to make the case for why Multi-Content Revisions are important to have in the long term.

What I take away from @Alsee's comment is that we should provide a more comprehensive and detailed overview of the use cases.

So the answer is no, no thought of investigating whether the editing community wants this.

That's not what I said. To the contrary, I know that such investigation was done at least for the use case that it is my job to take care of, namely structured media info; I also know that separating categories out of the wikitext has been requested and discussed numerous times. What I said is that the status of these investigations and discussions needs to be better documented, and linked from the technical proposal.

The idea of pulling categories, templates, and other things out of the wikitext is a pretty radical change. I understand you have use-case proposals and the reasons you think they're good ideas. I'm not here to directly debate that. I'm here to alert you to the fact that this is a Big Deal. I am here to alert you that the community may have a very different perspective, and that this may be highly controversial. The proposed use cases may start evaporating if the community considers them unwanted or disruptive.

I agree that it would be a Big Deal to e.g. move Wikipedia infoboxes out of the wikitext. But please note that this RFC does not propose doing that. It proposes a change to the platform that would allow us to do that -- and more importantly, it would allow other sites to manage infoboxes outside the wikitext.

Of course, if none of the use cases were endorsed by the community (which community?), the proposed change to the platform would be pointless. And you are correct that we need to take care to keep the community in the loop when discussing use cases and requirements.

I fear I missed an important point when listing the use cases: I did not make a clear distinction between use cases which we have consensus to implement, and use cases for which we see potential, or have had repeated requests, but which have not yet been fully investigated or discussed broadly. That's why I said that we need a more comprehensive and detailed overview of the use cases.

PS: One side note about discussing changes to the editing interface with the community of editors: the editors who are active on the site today are the ones who like (or at least got used to) the current interface - the ones who find the current way of editing unusable have given up after a few tries. We would like to change this, and open the editing experience to people who do not want to fiddle with complex syntax; this may mean changes that some people who have become experts at fiddling with wikitext won't like. We'll need to find the right balance, but we cannot find it if we listen only to the people who are active editors now. But that has nothing to do with the MCR proposal; it's just a general observation about discussing new features with "the" community.

Pppery added a comment. Oct 8 2016, 9:10 PM

It might be helpful to split the use cases into ones where MCR is nice to have and those which need it. As I understand it, there are roughly three groups:

I'm missing the group "currently embedded in wikitext and would benefit from separate storage, editing, diffing, etc", e.g. page assessment, media info, categories, template schema, translation tables, ...

Why? What is wrong with page assessments being stored in wikitext? Separate editors for TemplateData and categories already exist, and I see no need to split these out further. This would probably also make it harder to do the semi-common thing of removing an {{uncategorized}} template and replacing it with categories, which would require going through two editors under this proposal.

daniel added a subscriber: Pppery. Edited Oct 12 2016, 4:36 AM

Why? What is wrong with page assessments being stored in wikitext? Separate editors for TemplateData and categories already exist, and I see no need to split these out further.

The people who write specialized editors like that are exactly the ones who want MCR most. Because it's really hard to get this right. For instance, how can you find all category links on a page? You need to know all local aliases for the category namespace, you have to know which tag extensions accept wikitext (<poem> does, but <source> doesn't), and if the category link is in a template parameter, you have no idea whether it is *actually* a category link, or just looks like one.

And if you add a category, we save (and re-render!) a new copy of the entire page, instead of just the bit that changed.

Yes, these tools exist, but they are unreliable, hard to maintain, and inefficient. That's exactly how the idea for MCR was born.
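
Daniel's point about category links can be demonstrated with a naive scraper; the hypothetical regex below (a sketch, not any real tool's code) misfires on most of the tricky cases he lists:

```python
import re

# Knows only the English namespace alias, and nothing about tag extensions
# or template expansion - exactly the knowledge wikitext scraping requires.
CATEGORY_RE = re.compile(r"\[\[\s*Category\s*:\s*([^\]|]+)", re.IGNORECASE)

samples = [
    "[[Category:Physics]]",                # a genuine category link
    "<source>[[Category:Code]]</source>",  # false positive: <source> is verbatim
    "<poem>[[Category:Poetry]]</poem>",    # genuine: <poem> accepts wikitext
    "[[Kategorie:Physik]]",                # genuine on a German wiki, but missed
    "{{box|x=[[Category:Maybe]]}}",        # depends on what the template does
]
for text in samples:
    print(CATEGORY_RE.findall(text))  # wrong or ambiguous for three of the five
```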

Izno added a subscriber: Izno. Oct 12 2016, 1:20 PM

which would require going through two editors under this proposal

You seem to be visualizing a particular implementation. That's usually bad design.

When I see a potential implementation (for a wikitext solution--never mind VE for right now), I see multiple <textarea>s, each with their own storage of elements. That doesn't require a new window. Or autocomplete-enabled category selection ("here, add tags for this article!") without ever exposing the wikitext syntax to the user. (Right now, I need a gadget for that.) Or forms for page assessment (regarding which, I have no doubt T120219: PageAssessments deployment to WMF wikis would be enabled by MCR).

daniel added a comment. Nov 1 2016, 5:03 PM

I have proposed T149532: Why Multi-Content-Revisions? Use cases and requirements. as a session for the #Wikimedia-Developer-Summit_(2017). If you are interested in such a discussion at the summit, please comment on the ticket.

Question: History of old articles

If I understand correctly, this feature will potentially allow viewing an article with the versions of the templates that existed at the time the wikitext was edited. Two questions then arise:

  • will that also work for deleted templates?
  • will we be able to restore revisions from before the Multi-Content Revisions deployment, say a 2005 revision of some article?

@TomT0m No, Multi-Content-Revisions does not help with consistent display of old template revisions. Well, it does in cases where the use of templates is replaced by the use of slots - if e.g. template documentation was stored in a slot instead of a subpage, you would always see the correct version of the documentation for old versions of the template. But that would be because it would no longer use the template mechanism.

@TomT0m No, Multi-Content-Revisions does not help with consistent display of old template revisions. Well, it does in cases where the use of templates is replaced by the use of slots - if e.g. template documentation was stored in a slot instead of a subpage, you would always see the correct version of the documentation for old versions of the template. But that would be because it would no longer use the template mechanism.

OK, I got confused. Does that mean that the documentation will not have its own wiki page address anymore?

Would it then be possible to have a special type of "reference" slot, which would hold a pointer to another page revision? I guess the parser could be modified to maintain those reference slots when pages are saved.

For example, the parser computes a new version of the page when its content is modified, and when it expands a template, a hook triggers the slot manager to store the revision number of the template in those "reference" slots. I guess this kind of hook, or something similar, exists already, since we get a list of the used templates when previewing a page.

Tgr added a comment. Nov 17 2016, 8:09 AM

If I understand correctly, this feature will potentially allow viewing an article with the versions of the templates that existed at the time the wikitext was edited.

You might be thinking of Memento (which is not related to this in any way).

daniel added a comment. Edited Nov 19 2016, 3:23 PM

OK, I got confused. Does that mean that the documentation will not have its own wiki page address anymore?

Yes, the documentation would be part of the template page proper, and would not have a separate title.

Would it then be possible to have a special type of "reference" slot, which would hold a pointer to another page revision? I guess the parser could be modified to maintain those reference slots when pages are saved.

That would theoretically be possible, but there are currently no plans to do this. I'm also not sure this would be the best way to tie a page revision to template revisions. So far, slots are intended to be editable, not derived. I have been thinking about derived slots, but the use cases for that idea all seem a bit contrived, and would perhaps be better served by a more specialized solution, like a dedicated database table.

For example, the parser computes a new version of the page when its content is modified, and when it expands a template, a hook triggers the slot manager to store the revision number of the template in those "reference" slots. I guess this kind of hook, or something similar, exists already, since we get a list of the used templates when previewing a page.

This could be done with a DB table that associates a revision ID of the "transcluder" with a revision ID of the "transcluded" in each row. Simple enough to do, and it would be stable against the template being moved or renamed, etc. It's going to be a big table, though. And quite a change in how things work. As Tgr pointed out, there is the Memento extension, which does this with some limitations. It's a feature that has been discussed time and time again, but never gained enough traction to be properly implemented.
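
A rough sketch of such a table, using SQLite for brevity (the table and column names are invented for illustration; MediaWiki has no such table today):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE revision_transclusion (
           rt_revision INTEGER NOT NULL,  -- revision doing the transcluding
           rt_target   INTEGER NOT NULL,  -- revision of the transcluded template
           PRIMARY KEY (rt_revision, rt_target)
       )"""
)
# Recorded at parse time: revision 1001 of an article used revision 2002 of a
# template. Rendering the old article later can look up exactly which one.
conn.execute("INSERT INTO revision_transclusion VALUES (1001, 2002)")
print(conn.execute(
    "SELECT rt_target FROM revision_transclusion WHERE rt_revision = 1001"
).fetchall())  # [(2002,)]
```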

Just for clarity, as I've worked on this task but not actually commented, we in Editing see MCR as very important to our long-term plans. The use cases laid out at Multi-Content Revisions#Use Cases cover a lot, but I'll just pull out the four that we see as most vital:

  • The structured media info work, as almost goes without saying;
  • Rejigging templates to have dedicated template, styling, data, and documentation slots, with UI to match;
  • Rejigging files to have a fused history for the blob and the description, removing UI confusion; and
  • Moving to a structured-data approach for categories.

Lots of others are also important, but those are the most useful.

daniel moved this task from Inbox to Project on the User-Daniel board. Jan 5 2017, 7:03 PM
cscott added a comment. Edited Jan 11 2017, 8:13 PM

If we use MCR for annotation storage, it would be useful to have a canonical URL for the contents of a specific slot. That might be an API URL, like https://en.wikipedia.org/api/rest_v1/page/html/Main_Page/749836961/<slot number> or else a user-visible URL like https://en.wikipedia.org/wiki/Main_Page/<slot name> or https://en.wikipedia.org/wiki/<Slot>:Main_Page or even a quasi-API URL like https://en.wikipedia.org/wiki/Special:redirect/slot/<revision>/<slotname>. Thoughts?

(cc @MarkTraceur)
