Google's services should load data-mw from a separate API call to RESTBase
Open, Medium, Public, 0 Estimated Story Points

Description

Work is ongoing in Parsoid and RESTBase to stop inlining data-mw as an HTML attribute. Accordingly, Google's services should be updated to fetch data-mw in a separate API call to RESTBase. Once this change is live on the RESTBase end, update the version string in the Accept: header, based on the changes in T130638: Add data-mw as a separate JSON blob in the pagebundle output of Parsoid's API, to get the updated HTML without inlined data-mw.
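
As a rough illustration of the Accept: header change, here is a minimal Python sketch that pins a request to one HTML version. The header format is copied from the curl example later in this thread. The function names are hypothetical, the /page/html/ URL is an assumption used only as a sample target, and the version that drops inlined data-mw is not stated in this task, so it is left as a parameter.

```python
from urllib.request import Request

# Profile prefix copied from the Accept header in the curl example in this thread.
PROFILE_PREFIX = "mediawiki.org/specs/html/"


def accept_header(version: str) -> str:
    """Build an Accept value that requests a specific Parsoid HTML version."""
    return f'text/html; charset=utf-8; profile="{PROFILE_PREFIX}{version}"'


def build_html_request(url: str, version: str) -> Request:
    """Prepare (but do not send) a GET request pinned to one HTML version."""
    return Request(url, headers={"Accept": accept_header(version)})


# "1.2.0" is the version that appears in this thread. The URL is an assumed
# sample target, not something this task prescribes.
req = build_html_request(
    "https://en.wikipedia.org/api/rest_v1/page/html/Main_Page", "1.2.0"
)
print(req.get_header("Accept"))
```

Keeping the version in one place means a client could switch to the newer HTML by changing a single argument once the new version string is announced.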

Since Google is a high-traffic API client of Parsoid HTML, please try to prioritize this work so that you can switch over shortly after the change goes live on the RESTBase end. This will minimize the performance impact on the Parsoid production cluster from having to convert the latest version to the previous version.

Event Timeline

Ack. When will this happen? And will you keep inlined data-mw in HTML returned by /transform/wikitext/to/html endpoint?

About 3-4 weeks before the code is ready for deploy on the Parsoid and RESTBase end. However, this will be enabled in production once we have a sense of the performance impact of supporting the older version, and of whether there are any blockers from clients.

And will you keep inlined data-mw in HTML returned by /transform/wikitext/to/html endpoint?

Good question regarding /transform/wikitext/to/html endpoint. I'll let @mobrovac and @GWicke answer that question.

Looking at the current behavior for data-parsoid, which is already being stripped out by Parsoid:

[subbu@earth notes] curl -X POST --header 'Content-Type: application/x-www-form-urlencoded' --header 'Accept: text/html; charset=utf-8; profile="mediawiki.org/specs/html/1.2.0"' -d 'wikitext=\u0027\u0027foo\u0027\u0027%20and%20%0A*bar%0Aand%20%7B%7Becho%7Cboo%7D%7D' 'https://en.wikipedia.org/api/rest_v1/transform/wikitext/to/html'
<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head prefix="mwr: http://en.wikipedia.org/wiki/Special:Redirect/"><meta property="mw:articleNamespace" content="0"/><meta property="mw:html-content-type" content='text/html; charset=utf-8; profile="mediawiki.org/specs/html/1.2.0"'/><link rel="dc:isVersionOf" href="//en.wikipedia.org/wiki/Main_Page"/><title></title><base href="//en.wikipedia.org/wiki/"/><link rel="stylesheet" href="//en.wikipedia.org/w/load.php?modules=mediawiki.legacy.commonPrint,shared|mediawiki.skinning.elements|mediawiki.skinning.content|mediawiki.skinning.interface|skins.vector.styles|site|mediawiki.skinning.content.parsoid|ext.cite.style&amp;only=styles&amp;skin=vector"/></head><body id="mwAA" lang="en" class="mw-content-ltr sitedir-ltr ltr mw-body mw-body-content mediawiki" dir="ltr"><p id="mwAQ">\u0027\u0027foo\u0027\u0027 and </p>
<ul id="mwAg"><li id="mwAw">bar</li></ul>
<p id="mwBA">and <span about="#mwt1" typeof="mw:Transclusion" id="mwBQ" data-mw='{"parts":[{"template":{"target":{"wt":"echo","href":"./Template:Echo"},"params":{"1":{"wt":"boo"}},"i":0}}]}'

So, it looks like RESTBase is querying the http://<parsoid>/<domain>/v3/transform/wikitext/to/pagebundle endpoint (and returning only the html bundle) instead of querying the http://<parsoid>/<domain>/v3/transform/wikitext/to/html endpoint. I think RESTBase should be using the latter endpoint for its /transform/wikitext/to/html requests, in which case data-parsoid and data-mw will continue to be inlined.
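
To make the difference between inlined and separate data-mw concrete, here is a minimal Python sketch that collects inlined data-mw attributes into a dictionary keyed by element id. The id-keyed layout is only an assumption about what a separate blob might look like, since this thread does not specify the format. The class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser
from typing import Dict


class DataMwCollector(HTMLParser):
    """Collect inlined data-mw attributes into a dict keyed by element id."""

    def __init__(self) -> None:
        super().__init__()
        self.by_id: Dict[str, dict] = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        element_id = attr_map.get("id")
        raw = attr_map.get("data-mw")
        # Only elements that carry both an id and a data-mw value are recorded.
        if element_id and raw:
            self.by_id[element_id] = json.loads(raw)


def collect_data_mw(html: str) -> Dict[str, dict]:
    """Return {element id: parsed data-mw} for every element that inlines it."""
    parser = DataMwCollector()
    parser.feed(html)
    parser.close()
    return parser.by_id


# Fragment adapted from the curl output above. The end of the tag and the
# "boo" text content are filled in for illustration only.
fragment = (
    '<span about="#mwt1" typeof="mw:Transclusion" id="mwBQ" '
    'data-mw=\'{"parts":[{"template":{"target":{"wt":"echo",'
    '"href":"./Template:Echo"},"params":{"1":{"wt":"boo"}},"i":0}}]}\'>'
    "boo</span>"
)
blob = collect_data_mw(fragment)
print(blob["mwBQ"]["parts"][0]["template"]["target"]["wt"])
```

A client that receives HTML without inlined data-mw would do the reverse lookup: find an element's id in the HTML, then read its entry from the separately fetched blob.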

Thanks. For our use case it will be easier if /transform/wikitext/to/html keeps data-mw inlined in result HTML.

GWicke edited projects, added Services (later); removed Services.
GWicke moved this task from later to watching on the Services board.
GWicke edited projects, added Services (watching); removed Services (later).

@Renxiaoyi: Hi! You assigned this task to yourself a while ago. Could you maybe share a status update? Are you still working (or still plan to work) on this issue? Or is there anything that others could help with? (If you do not plan to work on this issue anymore, please remove yourself as assignee (via Add Action...Assign / Claim in the dropdown menu) so others could work on it.) Thanks a lot!

@Andre, this is blocked on RESTBase + Parsoid. We haven't turned on this feature yet for a bunch of reasons. So, nothing for any of the clients to do yet.

Aklapper changed the task status from Open to Stalled. Nov 25 2019, 12:48 AM

Setting task status to stalled as per ssastry's last comment.

Aklapper edited subscribers, added: Renxiaoyi; removed: Andre, GWicke, mobrovac.

@ssastry: Two years later, is there a ticket about turning on this feature in RESTBase + Parsoid that this task should depend on?

Boldly removing assignee as this task does not look actionable currently.

Aklapper changed the task status from Stalled to Open. Dec 29 2023, 12:57 PM

@ssastry: No reply; resetting status.