Make relevant API modules aware of MCR
Open, NormalPublic

Description

Any API module that outputs revision (meta-)data needs to become aware of MCR, and must support multiple content objects per revision in the output.

See https://www.mediawiki.org/wiki/Multi-Content_Revisions/Web_API for some initial thoughts

daniel created this task.Aug 24 2017, 2:40 PM
thiemowmde triaged this task as Normal priority.Dec 5 2017, 6:45 PM
Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board.Dec 18 2017, 3:12 PM
tstarling assigned this task to Anomie.Feb 8 2018, 9:37 PM

Most core API modules are simple enough, they just use the RevDel constants and Revision::userCanBitfield() which are easily enough replaced by the same constants and method on RevisionRecord.

ApiQueryRevisionsBase and its subclasses need a "slots" parameter, both to select which slots should have content-related metadata returned and to indicate that the existing, main-only content format should no longer be used. A new prop to list the available roles per revision is also useful.

For ApiComparePages, we need to first figure out T174036: Allow the diffs to show multiple slots [MCR].

For ApiEditPage and ApiStashEdit, we need to first figure out what editing looks like. See also T174033: Refactor EditPage to allow multiple slots to be edited atomically [MCR].

For ApiParse, we need to first figure out T174035: Allow the view action to show multiple slots [MCR].

For ApiComparePages, we need to first figure out T174036: Allow the diffs to show multiple slots [MCR].

Diffs will be slot-by-slot. For T174036 we need to figure out how in what order they should be shown, with what headings, where the code for composing the HTML should live, etc. None of that matters for API output. So I'd think this can be handled just like the ApiQueryRevisionsBase case: add a slots parameter, and return diffs for multiple/all slots if requested, be introducing another level in the result structure.

For ApiParse, we need to first figure out T174035: Allow the view action to show multiple slots [MCR].

ApiParse has several modes, and MCR only matters when the page parameter is specified. In that case, it should be possible to again handle this just like the ApiQueryRevisionsBase case: add a slots parameter, and return output for multiple/all slots if requested, be introducing another level in the result structure.

That leaves the question what exactly should happen if no slots parameter was specified. Should the result be a "combined" output, or just the main slot? In my mind, the result of ApiParse is a representation of a ParserOutput object; and with no slots parameter, it should be the ParserOutput that we would put into the parser cache - that is, whatever PreparedEdit::getCombinedParserOutput() returns. For now, that is just the main slot, but we are free to change this.

For ApiComparePages, ApiEditPage, ApiStashEdit, and ApiParse, I need to figure out how exactly the API should accept being passed data for multiple slots to replace the existing text parameters. This will probably take the form of parameter-templates that turn into parameters based on the value of some other parameter.

One possibility is to have the template-parameters reference their parent parameter:

'slots' => [
    ApiBase::PARAM_TYPE => $slotRoles,
    ApiBase::PARAM_ISMULTI => true,
    ApiBase::PARAM_ALL => true,
],
'text-{slot}' => [
    ApiBase::PARAM_IS_TEMPLATE => [ 'slot' => 'slots' ],
    ApiBase::PARAM_TYPE => 'text',
    ApiBase::PARAM_REQUIRED => true,
],
'contentmodel-{slot}' => [
    ApiBase::PARAM_IS_TEMPLATE => [ 'slot' => 'slots' ],
    ApiBase::PARAM_TYPE => ContentHandler::getContentModels(),
],

That would mean you'd pass something like slots=foo|bar&text-foo=...&contentmodel-foo=...&text-bar=...&contentmodel-bar=... when making the query, and the API would recognize all those parameters and validate them properly.

Some notes on other interactions:

  • ApiHelp would add a note about template parameters to the documentation.
  • ApiParamInfo would (regardless of whether these wind up being returned by getAllowedParams() or a new method) separate them out into a 'tempateparameters' array separate from the existing 'parameters' array.
  • ApiSandbox would add or remove fields as appropriate when the value of the named template-for parameter changes.
  • ApiStructureTest might make sure none of the template parameters result in collisions with other parameters for the module (template or non-template).

Change 413223 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] API: Update query modules for MCR

https://gerrit.wikimedia.org/r/413223

For ApiComparePages, we need to first figure out T174036: Allow the diffs to show multiple slots [MCR].

Diffs will be slot-by-slot. For T174036 we need to figure out how in what order they should be shown, with what headings, where the code for composing the HTML should live, etc. None of that matters for API output.

What does matter is what changes if any get made in DifferenceEngine.

ApiParse has several modes, and MCR only matters when the page parameter is specified.

Wrong. It obviously also matters if pageid or oldid are used. And it should also be possible to specify multiple slots explicitly with text parameters.

In that case, it should be possible to again handle this just like the ApiQueryRevisionsBase case: add a slots parameter, and return output for multiple/all slots if requested, be introducing another level in the result structure.

I can't think of any reasonable method of doing page views where that would make sense for any existing use of action=parse. Maybe a method that blindly concatenates them in alphabetical order by slot name?

Neither your proposal nor mine works with parsing the slots individually and returning separate HTML blobs for each one. Yours has another layer on top of parsing individual slots that somehow combines them all, while mine only directly parses the main slot and lets other slots either be transcluded or implicitly add themselves like Cite's automatic adding of references if no <references/> was present.

That leaves the question what exactly should happen if no slots parameter was specified.

The output should be the same HTML blob that is inserted into the skin chrome when viewing a page or is shown inside the <div id="wikiPreview"> when previewing an edit.

If a "slots" parameter affects the output at all, it should most likely behave the same as viewing a non-main slot if we add that ability to the web UI.

What does matter is what changes if any get made in DifferenceEngine.

I don't expect any, since DifferenceEngine is per content model. You'd just have one DifferenceEngine for each slot.

ApiParse has several modes, and MCR only matters when the page parameter is specified.

Wrong. It obviously also matters if pageid or oldid are used.

True. My point was that it doesn't matter if text is given.

And it should also be possible to specify multiple slots explicitly with text parameters.

That would be nice (for ApiParse as well as ApiComparePages), but I'm not sure it's really needed. We do need it for ApiEdit, of course.

In that case, it should be possible to again handle this just like the ApiQueryRevisionsBase case: add a slots parameter, and return output for multiple/all slots if requested, be introducing another level in the result structure.

I can't think of any reasonable method of doing page views where that would make sense for any existing use of action=parse. Maybe a method that blindly concatenates them in alphabetical order by slot name?

I'm confused - why would you concatenate the output if the slots param is given? I can only

Neither your proposal nor mine works with parsing the slots individually and returning separate HTML blobs for each one. Yours has another layer on top of parsing individual slots that somehow combines them all, while mine only directly parses the main slot and lets other slots either be transcluded or implicitly add themselves like Cite's automatic adding of references if no <references/> was present.

That's both "combining", the question is just what does the combining. But I'D only do any kind of combining if the slots param is not given.

But being able to render non-main slots separately is a requirement in my mind. And I also see no reason not to support that. It seems straight forward.

That leaves the question what exactly should happen if no slots parameter was specified.

The output should be the same HTML blob that is inserted into the skin chrome when viewing a page or is shown inside the <div id="wikiPreview"> when previewing an edit.

Yes - which is "whatever PreparedEdit::getCombinedParserOutput() returns".

If a "slots" parameter affects the output at all, it should most likely behave the same as viewing a non-main slot if we add that ability to the web UI.

Yes, indeed. As I said, I think this is a requirement. This would bypass any "combining" logic external to the ContentHandler, but would not prevent cross-slot access (transclusion) by the ContentHandler (or whatever it uses to generate the ParserOutput, e.g. Parser).

What does matter is what changes if any get made in DifferenceEngine.

I don't expect any, since DifferenceEngine is per content model. You'd just have one DifferenceEngine for each slot.

Something needs to change, because the way it's currently called gets "the" content handler, passes revision IDs without any indication of desired slots, and relies on DifferenceEngine to extract "the" Content object for each revision.

If the only change is that we get a content handler per slot and pass the slot role when creating each DifferenceEngine, that's fine, but it's still a change in the signature of ContentHandler::createDifferenceEngine() and DifferenceEngine::__construct().

And it should also be possible to specify multiple slots explicitly with text parameters.

That would be nice (for ApiParse as well as ApiComparePages), but I'm not sure it's really needed.

It's needed for stuff like the "live preview" feature.

Yes - which is "whatever PreparedEdit::getCombinedParserOutput() returns".

ApiParse doesn't actually use WikiPage::prepareContentForEdit(). I don't think it even can, since ApiParse would need to pass in ParserOptions and that method is designed for preparing a ParserOutput for canonical options.

I note that EditPage previews don't use WikiPage::prepareContentForEdit() either, likely because that too wants to use ParserOptions specific to the user.

If the only change is that we get a content handler per slot and pass the slot role when creating each DifferenceEngine, that's fine, but it's still a change in the signature of ContentHandler::createDifferenceEngine() and DifferenceEngine::__construct().

Ah, you are right - as long as the DifferenceEngine calls getContent on the revision, it needs to know the slot. DifferenceEngine is a bit of a tangle, perhaps it should be split. But adding a constructor argument is the easiest approach for now.

And it should also be possible to specify multiple slots explicitly with text parameters.

It's needed for stuff like the "live preview" feature.

Ok, I see.

Yes - which is "whatever PreparedEdit::getCombinedParserOutput() returns".

ApiParse doesn't actually use WikiPage::prepareContentForEdit(). I don't think it even can, since ApiParse would need to pass in ParserOptions and that method is designed for preparing a ParserOutput for canonical options.

I note that EditPage previews don't use WikiPage::prepareContentForEdit() either, likely because that too wants to use ParserOptions specific to the user.

prepareContentForEdit() actually constructs two ParserOptions: user-specific options for PST, and canonical options for parsing.

But anyway, you are right: the user should see the per-user rendering of the page in preview, since that is what they see after the page has been saved.

So, let me re-phrase: the mechanism for combining output of multiple slots (whatever that mechanism may be) should be the same in EditPage, ApiParse, and prepareContentForEdit. Perhaps we want prepareContentForPreview? Something like that.

So, let me re-phrase: the mechanism for combining output of multiple slots (whatever that mechanism may be) should be the same in EditPage, ApiParse, and prepareContentForEdit. Perhaps we want prepareContentForPreview? Something like that.

Yes, the mechanism should be the same. I don't know that "prepareContentForPreview" is a good name if we actually need a function to abstract it, though, particularly since we'd probably want to use much the same code path for actual page views when they're not already cached (the major difference being that previews will want to override the "current revision" of the page being previewed, and perhaps supply a speculative revid callback).

"prepareContentForDisplay", then?

That name could work.