Page MenuHomePhabricator

Deprecate and remove Parsoid's body-only mode
Open, NormalPublic

Description

Everyone who starts out using this eventually realizes that they need the metainformation that is in the <head> after all. It's just unnecessary complexity.

Event Timeline

cscott created this task.Nov 29 2017, 5:38 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptNov 29 2017, 5:38 PM

Everyone

VE are using it without needing the metainformation. If you're using it to render a whole document then you're going to have a Bad Time, but for re-rendering a fragment it is useful. Perhaps it is poorly named?

Some of our use cases could be replaced with T151720 which proposes an API to re-hydrate HTML, but that hasn't been looked at in a while.

cscott added a comment.Dec 6 2017, 7:48 PM

I think the idea is to add a separate interface for fetching an arbitrary subtree of the DOM (including the subtree of data-mw information), which would serve the "render a fragment" use case.

(And note that the <section> tag stuff serves as a handy DOM subtree to fetch.)

VE actually "needs" the metainformation, it just gets most of it from the PHP server-side code instead of from RESTbase/Parsoid. But we'd really prefer not to maintain two separate APIs here, and certainly not three (once the subtree loading lands).

cscott added a comment.Dec 6 2017, 7:56 PM

Oh, wait, I think I see what's going on here -- in ve.ce.MWTransclusionNode you're using the wt2html POST interface to (re)render just a single template transclusion, and using body_only there because the meta-info is actually irrelevant for fragment rendering. I should have paid more attention when you mentioned T151720: Update extension node renderings without building XML/wikitext in the client.

We do intend for extension translcusions to be simple subtree replacements (once we enforce well-formedness on extension html, etc). I don't think body_only is the appropriate hammer for that particular nail, but we'll have to think about a better solution.

cscott added a comment.Dec 6 2017, 8:44 PM

Also, note that transclusions *can* define their own required resources, and the only way you learn about new dependencies of that sort is via the <head> information. So VE *should* be using the full document API for transclusions as well, so that they can check the resources returned and update the resources required for the whole document if necessary. Otherwise this creates bugs where "VE doesn't display the extension correctly" because VE isn't including the necessary page resources.

Pchelolo added a subscriber: Pchelolo.

The body_only parameter is a part of the public API documented in the RESTBase docs, so we need to go though a deprecation procedure here. We at least need a deprecation notice email for that.

I can also run a query on Hadoop to make sure nobody's actually using the parameter. Will report on the results.

Well, we know that VE and CX are using the parameter right now. This is actually causing bugs in VE, because they don't get the resource-usage info from the <head> when they do a fragment parse of a template or extension, so they don't update the resources, and so sometimes the rendering looks off because it is missing necessary CSS/JS. I haven't looked into CX yet.

So the first step is to try to convince VE and CX to stop using body_only; then we can remove it from parsoid (since RestBase doesn't use parsoid's implementation, it does the body stripping itself)... and *then* we'd go through the RESTBase deprecation procedure.

since RestBase doesn't use parsoid's implementation, it does the body stripping itself

Actually RB does use Parsoid for that as well for transform endpoints: https://github.com/wikimedia/restbase/blob/master/sys/parsoid.js#L681

Arlolra triaged this task as Normal priority.Jan 4 2018, 10:57 PM
Krinkle moved this task from Unsorted to Migrate / Replace on the Technical-Debt board.