OCG should load data-mw from a separate API call alongside the body content
Open, Normal | Public | 0 Story Points


OCG should be updated to fetch data-mw from RESTBase in a separate API call. Whether OCG inlines the data-mw into the HTML or handles it separately is for OCG to decide, but inlining it into the HTML would be a straightforward way to handle this change. Parsoid (and RESTBase) will accept either inlined data-mw or a separate data-mw blob in the html -> wt API endpoints.
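As a rough illustration of the inlining approach, here is a minimal Python sketch (OCG itself is not written in Python, so this is only meant to show the idea). It assumes the separately fetched data-mw blob is a JSON object of the form `{"ids": {<element id>: <data-mw object>}}`, keyed by the `id` attributes in the HTML; that shape, and the names `DataMwInliner` and `inline_data_mw`, are assumptions made for this example and not something this task specifies.

```python
import html
import json
from html.parser import HTMLParser


class DataMwInliner(HTMLParser):
    """Re-emits HTML, adding a data-mw attribute to elements whose id is in the blob."""

    def __init__(self, data_mw_by_id):
        # Keep character references untouched so text content round-trips as-is.
        super().__init__(convert_charrefs=False)
        self.data_mw_by_id = data_mw_by_id
        self.out = []

    def _emit_start(self, tag, attrs, self_closing):
        attrs = list(attrs)
        elem_id = dict(attrs).get("id")
        if elem_id in self.data_mw_by_id:
            # Serialize the data-mw object back to JSON for the attribute value.
            value = json.dumps(self.data_mw_by_id[elem_id], sort_keys=True)
            attrs.append(("data-mw", value))
        parts = [tag]
        for name, value in attrs:
            if value is None:
                parts.append(name)  # valueless attribute
            else:
                # The parser unescapes attribute values, so re-escape them here.
                parts.append('%s="%s"' % (name, html.escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + ("/>" if self_closing else ">"))

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs, False)

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, True)

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def inline_data_mw(body_html, data_mw_blob):
    """Return body_html with data-mw attributes inlined from the separate blob."""
    inliner = DataMwInliner(data_mw_blob.get("ids", {}))
    inliner.feed(body_html)
    inliner.close()
    return "".join(inliner.out)
```

A client would call `inline_data_mw(body, blob)` on the two responses and then pass the result to its existing HTML processing unchanged, so nothing downstream needs to know that data-mw arrived separately.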

To minimize the performance impact on the Parsoid production cluster from having to return the old version, RESTBase and Parsoid won't bump the HTML version number in production until OCG and other Parsoid HTML clients are ready to consume the newer version. So it is good to start work on this sooner rather than later.

When this change is deployed to production, OCG will also have to update the version string in the Accept: header based on the updates in T130638: Add data-mw as a separate JSON blob in the pagebundle output of Parsoid's API.
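For illustration only, a client-side version bump could be isolated in one place, as in the Python sketch below. The profile URL prefix, the exact header layout, and the helper names `build_accept_header` and `build_html_request` are assumptions made for this example; the actual version string to request is whatever T130638 ends up defining and is deliberately left as a parameter here.

```python
import urllib.request

# Assumed profile URL prefix for the Parsoid HTML spec; verify against the API docs.
PROFILE_BASE = "https://www.mediawiki.org/wiki/Specs/HTML/"


def build_accept_header(html_version):
    """Build an Accept header value asking for a specific HTML content version."""
    return 'text/html; charset=utf-8; profile="%s%s"' % (PROFILE_BASE, html_version)


def build_html_request(url, html_version):
    """Create (but do not send) a request that asks for the given HTML version."""
    headers = {"Accept": build_accept_header(html_version)}
    return urllib.request.Request(url, headers=headers)
```

With this shape, switching a client to the new HTML version is a one-argument change at the call site, which makes it easy to coordinate with the production deployment.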

ssastry created this task. Mar 23 2016, 12:37 AM

Note that we are at least 3-4 weeks away from both Parsoid and RESTBase being ready with code, storage, and API changes. At that point, we will also have to evaluate which of the clients are ready to switch over and, for those that are not, what the performance impact on the Parsoid cluster would be (to convert from the new to the old version), and whether any of the clients have blockers on this deployment. But I am creating new tasks for all Parsoid HTML clients now, to start surfacing any issues that need to be resolved for T78676: Store & load data-mw separately to be enabled in production.

As already announced in Tech News, OfflineContentGenerator (OCG) will not be used anymore after October 1st, 2017 on Wikimedia sites. OCG will be replaced by Electron. You can read more on mediawiki.org.