As mentioned in our roadmap, we would like to use Parsoid HTML+RDFa for regular page views too.
This has several potential advantages:
- It can speed up visual editing by eliminating the need to re-load content for editing.
- With stable element IDs (T116350) in Parsoid HTML, we can explore new light-weight editing tools. The ability to associate metadata with elements using the ID also lets us implement new features like per-paragraph comments, blame maps, etc.
- It lets us preserve rich metadata associated with content across copy & paste. See T54091.
- It moves us towards a uniform content representation with a well-defined DOM spec, and improves rendering consistency between view and edit mode.
- In the longer term, we can stop maintaining two wikitext parsers.
Before we can do this, we need to address several issues:
- The size of compressed HTML delivered on view needs to be close to that of the PHP parser's output. With all metadata in attributes, it is currently about 100% larger. See T54936, T1228, T78676.
- Site-specific CSS rules for the content need to be adjusted to work on Parsoid's more semantic HTML structures: T53245
- Media support, esp. for audio + video: T64270: Support video and audio content, T56844: Image / media handling (tracking)
- Red link information (which link targets do not exist) needs to be annotated on links.
- User preferences affecting the content rendering need to be implemented client-side: T39902, T53698 & T78046 (math)
  - thumb sizes
  - [red] link styling
  - header numbering
  - hidden category display
  - [perhaps] stub thresholds
- Client-side scripts (mobile front-end, gadgets, browser extensions) can modify the DOM before editing. To keep these modifications from creating dirty diffs, we'll need to either stash away a pristine copy of the content, or detect modifications (based on a hash or the like) and re-load the content from the server.
- Page metadata returned by Parsoid, as well as the markup of common content elements, needs to be improved. See T105845 (RFC: Page components / content widgets) and T105845#1650013.
- We need a way to store HTML and metadata for each page, so that views and edits can rely on the HTML being quickly available: T1228
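
The hash-based option for detecting client-side DOM modifications could look roughly like the sketch below. It is only an illustration under stated assumptions: the names `contentHash`, `takeSnapshot`, `isPristine` and `htmlForEditing` are hypothetical and not part of Parsoid or VisualEditor, the hash is a simple non-cryptographic FNV-1a variant chosen for brevity, and the re-load step is modelled as a plain callback rather than a real server request.

```typescript
// Hypothetical helpers; none of these names exist in Parsoid or VisualEditor.

/**
 * 32-bit FNV-1a style hash over the UTF-16 code units of a string,
 * returned as 8 lowercase hex digits. Not cryptographic; it is only
 * meant to detect accidental changes to the content.
 */
function contentHash(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  const hex = (hash >>> 0).toString(16);
  return ("0000000" + hex).slice(-8);
}

/** What we remember about the content as delivered by the server. */
interface ContentSnapshot {
  hash: string;
}

/** Record the hash of the pristine HTML before any script touches it. */
function takeSnapshot(pristineHtml: string): ContentSnapshot {
  return { hash: contentHash(pristineHtml) };
}

/** True if the current serialization still matches the snapshot. */
function isPristine(snapshot: ContentSnapshot, currentHtml: string): boolean {
  return contentHash(currentHtml) === snapshot.hash;
}

/**
 * Return HTML that is safe to hand to the editor: the current content if
 * it is unmodified, otherwise whatever the (assumed) reload callback returns.
 */
function htmlForEditing(
  snapshot: ContentSnapshot,
  currentHtml: string,
  reload: () => string
): string {
  return isPristine(snapshot, currentHtml) ? currentHtml : reload();
}

// Usage: a gadget rewrites the paragraph text after page load.
const served = '<p id="mwAQ">Hello</p>';
const snapshot = takeSnapshot(served);
const afterGadget = '<p id="mwAQ">Hallo</p>';
const editable = htmlForEditing(snapshot, afterGadget, () => served);
```

Compared with stashing a full pristine copy, this keeps only a few bytes per page, at the cost of an extra request whenever a script has touched the content.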