Parse images synchronously without making imageinfo requests and use a final postprocessing pass to fix up image HTML
Closed, Resolved, Public

Description

Right now, images trigger an imageinfo request to the MediaWiki API, and the generated HTML depends on the output of that request. This adds an unnecessary async dependency to parsing (even if the requests are batched and overlapped with other activity).

It should be possible to generate a "normalized" HTML output during the regular parse using only information from the wikitext, and then postprocess that output at the end based on a bulk API request (images, redlinks, disambiguation links, and whatever else). This is hinted at in this Wikitext 2.0 note.

The general push here is to make wikitext as self-sufficient as possible at parse time, and to use post-processing to transform the output based on database state. Our current redlink and disambiguation link parsing strategies are two steps towards that goal. This image parsing strategy is another step towards that goal.
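
A minimal Python sketch of the two-pass idea described above, not Parsoid's actual implementation. The function names (`parse_sync`, `add_media_info`), the `fetch_info` callable, the node dictionaries, and the simplified `[[File:Name|NNNpx]]` syntax are all hypothetical and chosen only for illustration. Pass 1 uses wikitext alone; pass 2 makes one bulk lookup and patches the image nodes.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical, heavily simplified image syntax: [[File:Name.png|200px]]
IMAGE_RE = re.compile(r"\[\[File:([^|\]]+)(?:\|(\d+)px)?\]\]")


def parse_sync(wikitext: str) -> List[dict]:
    """Pass 1: build "normalized" nodes from wikitext only; no API calls."""
    nodes: List[dict] = []
    pos = 0
    for m in IMAGE_RE.finditer(wikitext):
        if m.start() > pos:
            nodes.append({"type": "text", "value": wikitext[pos:m.start()]})
        nodes.append({
            "type": "image",
            "title": m.group(1).strip(),
            "requested_width": int(m.group(2)) if m.group(2) else None,
            # Filled in later by the post-processing pass.
            "src": None, "width": None, "height": None, "missing": None,
        })
        pos = m.end()
    if pos < len(wikitext):
        nodes.append({"type": "text", "value": wikitext[pos:]})
    return nodes


def add_media_info(
    nodes: List[dict],
    fetch_info: Callable[[List[str]], Dict[str, Optional[dict]]],
) -> List[dict]:
    """Pass 2: one bulk lookup for all images, then patch nodes in place."""
    titles = sorted({n["title"] for n in nodes if n["type"] == "image"})
    if not titles:
        return nodes
    info = fetch_info(titles)  # assumed to return {title: {...} or None}
    for n in nodes:
        if n["type"] != "image":
            continue
        data = info.get(n["title"])
        if data is None:
            n["missing"] = True  # file does not exist; mark it as such
            continue
        n["missing"] = False
        width = n["requested_width"] or data["width"]
        n["width"] = width
        # Scale height to keep the original aspect ratio (a simplification).
        n["height"] = round(data["height"] * width / data["width"])
        n["src"] = data["url"]
    return nodes
```

In this sketch the parse itself never waits on the network; the only database-dependent step is the single `fetch_info` call at the end, which could in principle be shared with redlink and disambiguation lookups.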

Event Timeline

ssastry triaged this task as Medium priority. Dec 15 2016, 10:25 PM
ssastry added subscribers: Legoktm, GWicke, tstarling.
ssastry added a subscriber: cscott.

Change 463125 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] [WIP] Add media info in a post-processing pass

https://gerrit.wikimedia.org/r/463125

Change 463125 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Add media info in a post-processing pass

https://gerrit.wikimedia.org/r/463125