As mentioned in our [[ https://www.mediawiki.org/wiki/Parsoid/Roadmap#Use_Parsoid_HTML_for_all_page_views_.5Bhard.2C_stretch_goal.2C_Q2_2014.5D | roadmap ]], we would like to use Parsoid HTML+RDFa for regular page views too.
This has several potential advantages:
1) It can speed up visual editing by eliminating the need to re-load content for editing.
2) With [stable element IDs](https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec/Element_IDs) / T116350 in Parsoid HTML we can explore new light-weight editing tools. The ability to associate metadata with elements using the ID also lets us implement new features such as per-paragraph comments and blame maps.
3) It lets us preserve rich metadata associated with content across copy & paste. See T54091.
4) It moves us towards a uniform content representation with a well-defined [DOM spec](https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec) and improves rendering consistency between view and edit mode.
5) In the longer term, we can stop maintaining two wikitext parsers.
Before we can do this, we need to address several issues:
[x] The size of compressed HTML delivered on view needs to be close to that of the PHP parser. With all metadata in attributes, it is currently about 100% larger (roughly twice the size): T54936, T1228, T78676
[ ] Site-specific CSS rules for the content need to be adjusted to work on Parsoid's more semantic HTML structures: T53245
[x] Media support, esp. for audio + video: {T64270}, {T56844}
[x] Red link annotations
[ ] User preferences affecting the content rendering need to be implemented client-side: T39902, T53698 & T78046 (math)
- thumb sizes
- [red] link styling
- header numbering
- hidden category display
- math
- [perhaps] stub thresholds
[ ] Client-side scripts (mobile front-end, gadgets, browser extensions) can modify the DOM before editing. To avoid dirty diffs, we'll need to either stash away a pristine copy of the content, or detect modifications (for example by comparing a hash) and re-load the content from the server.
[ ] Improve page metadata returned by Parsoid, as well as markup of common content elements: {T105845}, T105845#1650013
[x] We need a way to store HTML and metadata for each page, so that views and edits can rely on the HTML being quickly available: T1228
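
The hash-based option for detecting client-side DOM modifications could look roughly like the sketch below. It is only an illustration of the logic, written in Python for brevity; a real implementation would run in the browser. The function names `content_fingerprint` and `needs_reload`, the choice of SHA-256, and the sample HTML are all assumptions, not part of Parsoid or any existing API.

```python
import hashlib


def content_fingerprint(html: str) -> str:
    """Return a SHA-256 hex digest of the serialized HTML (hypothetical helper)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def needs_reload(served_fingerprint: str, current_html: str) -> bool:
    """True if the current DOM serialization no longer matches what was served.

    In that case the editor would re-fetch pristine content from the server
    instead of editing the modified DOM, which avoids dirty diffs.
    """
    return content_fingerprint(current_html) != served_fingerprint


# Fingerprint recorded when the page HTML is first delivered (sample markup).
served_html = '<p id="mwAQ">Hello <a rel="mw:WikiLink" href="./World">world</a></p>'
fingerprint = content_fingerprint(served_html)

# Simulate a gadget injecting extra markup before the user starts editing.
modified_html = served_html.replace("</p>", '<span class="gadget-note">!</span></p>')

print(needs_reload(fingerprint, served_html))    # False: safe to edit in place
print(needs_reload(fingerprint, modified_html))  # True: re-load pristine content
```

Compared with stashing a pristine copy, this approach stores only a short digest on the client, at the cost of an extra request to the server whenever a modification is detected.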