Update extension node renderings without building XML/wikitext in the client
Open, Normal · Public · 0 Story Points

Description

Currently when an extension is modified in VE, we generate XML in the client and ask Parsoid to convert it to HTML so we can get an updated version of the view.

However this means we are generating wikitext in the client:

wikitext = mw.html.element( tagName, attrs, new mw.html.Raw( extsrc ) );

which really shouldn't be our job. We should instead have an API to which we pass model HTML, and get back model HTML hydrated with view HTML:

in:
<div typeof="mw:Extension/Math" data-mw="...{body:'x+y'}..."></div>
out:
<div typeof="mw:Extension/Math" data-mw="...{body:'x+y'}..."><img src="..../x%20y.png"></div>
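To make the proposed contract concrete, here is a minimal sketch. The helper names (`buildModelHtml`, `hydrateModelHtml`) and the `renderView` callback are hypothetical; `renderView` merely stands in for whatever rendering the server would perform, and the `{body: 'x+y'}` payload mirrors the abbreviated example above rather than the full data-mw structure.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Build "model HTML" for an extension node: typeof + data-mw, no rendering.
// (Hypothetical helper, not an existing VE or Parsoid function.)
function buildModelHtml(extName, dataMw) {
    return '<div typeof="mw:Extension/' + escapeAttr(extName) + '"' +
        ' data-mw="' + escapeAttr(JSON.stringify(dataMw)) + '"></div>';
}

// Stand-in for the proposed html2html call: model HTML goes in, the same
// wrapper comes back with view HTML filled in. `renderView` is an assumption
// representing the server-side extension rendering.
function hydrateModelHtml(modelHtml, renderView) {
    var emptyBody = '></div>';
    if (modelHtml.slice(-emptyBody.length) !== emptyBody) {
        throw new Error('Expected an empty extension wrapper');
    }
    return modelHtml.slice(0, -'</div>'.length) + renderView() + '</div>';
}
```

The point of the sketch is that the client only ever touches attributes on the wrapper; it never assembles `<math>x+y</math>` style source itself.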

This is going to be a blocker to using HTML attributes in Parsoid.

Esanders created this task. · Nov 27 2016, 2:41 PM
Restricted Application added a subscriber: Aklapper. · Nov 27 2016, 2:41 PM

This is blocking our new gallery implementation as we will want to get an updated rendering of the gallery from Parsoid (we don't want to re-implement all the gallery rendering logic), but we won't be able to build the wikitext for it as we now have HTML captions.

ssastry added a subscriber: ssastry. · Edited · Nov 28 2016, 6:31 PM

I don't understand this last bit. We are providing specialized HTML (with data-mw, etc.) so that you can provide customized (HTML) editing support for an extension. We then take the edited HTML (captions and all) and generate wikitext for it. So, if we are missing any attributes or information from the gallery, we should fix that. But otherwise, I don't yet understand where the intermediate HTML -> HTML part of the API comes in.

For example we have a "show filenames" checkbox. How do you propose we update our view once that property is added to data-mw?

Sending back <div typeof="mw:Extension/Math" data-mw="...{body:'x+y'}..."></div> wouldn't be sufficient though, because some information is only going to be in the HTML now (i.e., captions).

You'd need to html2html the entire contents.

For example we have a "show filenames" checkbox. How do you propose we update our view once that property is added to data-mw?

Here are two possibilities I see (speaking generally):

  • When someone writes an extension (to parse HTML), they also write a custom VE plugin that provides rich editing support: it updates the rendering (without going through wikitext) and also lets users edit the extension output visually. So, for example, you wouldn't let the user add arbitrary content to a gallery (because that isn't supported). [Here, you = extension author in the general case; but in this case, for commonly used extensions like Cite and Gallery, it is us.] So, I see the question you pose above as part of the customized editing experience for the extension. In this model, on save, you give Parsoid the edited HTML and Parsoid then serializes that to wikitext (because the extension author has also provided the hook to convert HTML to wikitext; in this case it is Arlo who has written that code).
  • Alternatively, if you want to provide somewhat limited editing support where editors can only directly manipulate the extension via text fields, then presumably, you can construct the extension source from that easily, because you know what you are editing (either attributes, or extension source). In that case, you can use the existing wt -> html endpoint to update the rendering.

So, TL;DR: basically I see either (a) "plain-text" editing support in VE (=> you use the wt -> html endpoint to update rendering); or (b) "customized-HTML" editing support in VE (=> rendering is updated in-place and Parsoid takes care of serializing the edited HTML to wt).

Extension authors provide wt -> html in the case of (a); or provide wt -> html, html -> wt, and VE plugin in the case of (b).
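The two models can be summarized as a small dispatch sketch. Every name here (`updateRendering`, `wt2html`, `renderInPlace`, and the `editingMode` values) is a hypothetical label for this discussion, not an existing VE or Parsoid API.

```javascript
// Decide how a node's rendering is refreshed after an edit.
//   'plain-text'      -> (a) build the extension source, ask wt -> html
//   'customized-html' -> (b) the VE plugin updates the rendering itself
// `services` carries the two strategies so they can be swapped or stubbed.
function updateRendering(node, services) {
    if (node.editingMode === 'plain-text') {
        return services.wt2html(node.wikitext);
    }
    if (node.editingMode === 'customized-html') {
        return services.renderInPlace(node.html);
    }
    throw new Error('Unknown editing mode: ' + node.editingMode);
}
```

What this task asks for does not fit either branch cleanly: it would be a third strategy in which edited HTML is sent out and rendered HTML comes back.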

It seems you see this differently. Maybe we should chat and figure this out.

cscott added a subscriber: cscott. · Nov 28 2016, 8:11 PM

I think an html2html mechanism would work fine for the general case, although I also support subbu's contention that an advanced editor plugin would probably do a dynamic preview without requiring an html2html step.

For instance, in the language converter work I'm doing, the content of the data-mw and typeof attributes is the only thing we pay attention to in the html2wt conversion; the actual contents of the <span> are ignored:

<span typeof="mw:LanguageVariant" data-mw='{"text":{"": "blog, WEBJOURNAL, WEBLOG"},"target":["zh","zh-hans","zh-hant"]}'>anything here is ignored</span>

So if you ask parsoid to do a standard html2html transformation of this fragment (going through wikitext internally) it will fill in the contents of the <span> with the correct variant text. (Well, maybe: the API for getting a specific variant out of Parsoid is still in flux. But for purposes of discussion, let's say that it does.)
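As an illustration of why the span contents do not matter, the sketch below reads only data-mw from two fragments that differ in their contents. `readDataMw` is a hypothetical helper, and the regex is a shortcut for this example; real code would use a DOM parser.

```javascript
// Read the JSON held in a single-quoted data-mw attribute of an HTML fragment.
// Hypothetical helper for illustration only.
function readDataMw(html) {
    var match = /data-mw='([^']*)'/.exec(html);
    if (!match) {
        throw new Error('No data-mw attribute found');
    }
    return JSON.parse(match[1]);
}

var variantDataMw = '{"text":{"": "blog, WEBJOURNAL, WEBLOG"},"target":["zh","zh-hans","zh-hant"]}';
var staleSpan = '<span typeof="mw:LanguageVariant" data-mw=\'' + variantDataMw + '\'>anything here is ignored</span>';
var emptySpan = '<span typeof="mw:LanguageVariant" data-mw=\'' + variantDataMw + '\'></span>';

// Both fragments carry the same model, so a serializer that reads only
// data-mw would treat them identically.
var sameModel = JSON.stringify(readDataMw(staleSpan)) === JSON.stringify(readDataMw(emptySpan));
```

This is the property that makes an html2html round trip usable for re-hydration: the client can send stale or empty contents and rely on data-mw alone.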

I think @Esanders's request for <gallery> and other extensions is reasonable: there should be a standard html2html API that you could use to "normalize" a proposed HTML fragment, which would also "rehydrate" the rendered HTML inside an extension. We already do something similar for various HTML cleanups, which would also be reasonable to do in this API endpoint. In fact, I'm surprised we don't already provide this?

This is also how template editing ought to work: you provide the arguments in data-mw corresponding to https://www.mediawiki.org/wiki/Specs/HTML/1.2.1#Transclusion_content with "html" properties instead of "wt", and we'll html->wt->html it and provide you with an appropriate rendered representation of the template, as a DOM fragment (with potentially multiple nodes). I take it right now you are generating wikitext for the template instance and using the wt2html API to generate the preview rendering?

ssastry added a comment. · Edited · Nov 28 2016, 8:37 PM

Other related thoughts

It would also be helpful to think of the general problem of content-model constraints. All HTML5 tags have content model constraints that specify what content can be nested within them. Extensions (via extension tags) have similar content model constraints.

For example, in HTML5, you cannot nest A-tags inside A-tags. VE would have to enforce that. Extensions provide custom tags and also provide their own content-model constraints. For example, <gallery> specifies that only certain kinds of HTML is valid, and ideally, the VE plugin for that extension would enforce it as part of HTML-editing support. So, in this HTML-editing model, rendering is dynamic without having to ask Parsoid about it.
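A client-side content-model check of the kind described could look roughly like the sketch below. The table and the `canNest` helper are hypothetical; only the `a`-inside-`a` rule is taken from the text above, and an extension would contribute its own entries.

```javascript
// Hypothetical content-model table: for each ancestor tag, the tags that
// must not appear anywhere inside it.
var forbiddenDescendants = {
    a: ['a'] // HTML5: links cannot be nested inside links
};

// Return true if `childTag` may be inserted under the given chain of
// ancestor tags (outermost first).
function canNest(ancestorTags, childTag) {
    return !ancestorTags.some(function (tag) {
        var hasRule = Object.prototype.hasOwnProperty.call(forbiddenDescendants, tag);
        return hasRule && forbiddenDescendants[tag].indexOf(childTag) !== -1;
    });
}
```

For instance, `canNest(['p', 'a', 'span'], 'a')` is rejected because an `a` ancestor is present, while `canNest(['p'], 'a')` is allowed.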

But, for text-editing, I see the need to ask Parsoid to update rendering (which exists in the form of the wt -> html endpoint).

All that said, this model breaks down for templates because you cannot really edit the rendering of a template. You can only edit parameters => you need to ask Parsoid for an updated rendering always. So, in that context, I understand the request to provide a short-cut for templates where the args are provided in HTML form. Given that, I can see the argument to provide the same feature for extensions for scenarios where extension-authors/VE-maintainers can only provide partial HTML-editing support for them. For example, you might be able to provide HTML editing of gallery captions, but nothing else. So, for these scenarios, I can see the need to ask Parsoid for updated rendering.

Separately, there is another orthogonal problem that has always existed for templates: how do you compose a template's output DOM fragment into the page? I am not sure how VE does it right now (These notes assume it is broken). In the general case, we need to rely on a DOM fragment composition spec (1, 2) that VE (and other editing tools) use for updating rendering and that Parsoid uses for incremental parsing and other high-performance updates. This applies to extension output as well. But, anyway, this is orthogonal to the immediate problem of how rendering is updated after an extension (or template arg) is edited.

Longer term, we should also start thinking of templates and extensions as being similar in functionality (which is what I propose in this wikitext 2.0 doc).

Anyway, I am mentioning everything in one place so we can think this through while taking in all considerations.

I'm specifically talking about doing a live mid-edit update of the view HTML. The new spec for galleries is view and model HTML rolled into one, but I'd rather not implement all the logic for generating a gallery view when properties change. This was OK to do for images, but galleries are a lot more complex and we'd be duplicating a lot of logic.

In my specific example, we provide a checkbox to enable "show filenames". After this is checked and the gallery dialog is closed, we'd need to duplicate the logic that inserts a span into every caption containing the filename. I'd rather just send off a version of the HTML to Parsoid which has "showFilenames:true" in the data-mw, and have it re-hydrate the rendering with the actual filenames in place.
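On the client, that flow would amount to flipping one property in data-mw and handing the result off, roughly as sketched here. `withDataMwProperty` is a hypothetical helper, and the surrounding data-mw shape in the usage note is an assumption; only the `showFilenames` property name comes from the example above.

```javascript
// Return a copy of a data-mw object with one property changed. The original
// is left untouched so the dialog can still be cancelled without side effects.
// Hypothetical helper, not an existing VE function.
function withDataMwProperty(dataMw, key, value) {
    var copy = JSON.parse(JSON.stringify(dataMw)); // deep copy of plain JSON data
    copy[key] = value;
    return copy;
}
```

After the dialog closes, something like `withDataMwProperty(currentDataMw, 'showFilenames', true)` would be written back to the gallery's data-mw attribute before the HTML is sent for re-hydration; inserting the filename spans into each caption would then be left to the server.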

Removing gallery block as we may take a native rendering approach to support in-line caption editing.

Jdforrester-WMF set the point value for this task to 0. · Feb 9 2017, 6:15 PM
ssastry triaged this task as Normal priority. · Apr 9 2017, 10:12 PM
ssastry added a project: Parsoid-Web-API.