
[EPIC] Use Parsoid HTML for all page views
Open, Medium, Public


As mentioned in our roadmap, we would like to use Parsoid HTML+RDFa for regular page views too.

This has several potential advantages:

  1. It can speed up visual editing by eliminating the need to re-load content for editing.
  2. With stable element IDs / T116350 in Parsoid HTML we can explore new light-weight editing tools. The ability to associate metadata with elements using the ID also lets us implement new features such as per-paragraph comments, blame maps, etc.
  3. It lets us preserve rich metadata associated with content across copy & paste. See T54091.
  4. It moves us towards a uniform content representation with a well-defined DOM spec and improves rendering consistency between view and edit mode.
  5. In the longer term, we can stop maintaining two wikitext parsers.

Before we can do this, we need to address several issues:

  • The size of compressed HTML delivered on view needs to be close to that of the PHP parser's output. With all metadata in attributes it is currently about 100% larger. T54936, T1228, T78676
  • Site-specific CSS rules for the content need to be adjusted to work on Parsoid's more semantic HTML structures: T53245
  • Media support, esp. for audio + video: T64270: Support video and audio content, T56844: Image / media handling (tracking)
  • Red link annotation information
  • User preferences affecting the content rendering need to be implemented client-side: T39902, T53698 & T78046 (math)
    • thumb sizes
    • [red] link styling
    • header numbering
    • hidden category display
    • math
    • [perhaps] stub thresholds
  • Client-side scripts (mobile front-end, gadgets, browser extensions) can modify the DOM before editing. We'll need to either stash away a pristine copy of the content, or detect modifications based on a hash or the like & re-load content from the server to avoid this creating dirty diffs.
  • Improve page metadata returned by Parsoid, as well as markup of common content elements: T105845: RFC: Page components / content widgets, T105845#1650013
  • We need a way to store HTML and metadata for each page, so that views and edits can rely on the HTML being quickly available: T1228
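
The hash-based detection of client-side DOM modifications mentioned above could look roughly like the following. This is only a minimal sketch, not a proposed implementation: the function names (`content_hash`, `html_for_editing`) and the `reload_from_server` callback are hypothetical, and it assumes the page's current HTML can be serialized to the same bytes that were originally served, which real browsers do not guarantee. In practice this logic would run client-side; Python is used here only for illustration.

```python
import hashlib
from typing import Callable


def content_hash(html: str) -> str:
    """Return a SHA-256 hex digest of the serialized HTML."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def html_for_editing(
    current_html: str,
    pristine_hash: str,
    reload_from_server: Callable[[], str],  # hypothetical fetch callback
) -> str:
    """Return HTML that is safe to hand to the editor.

    If the current HTML still matches the hash recorded when the page was
    served, reuse it. Otherwise assume a script modified the DOM and fetch
    a fresh copy, so the modification cannot leak into a dirty diff.
    """
    if content_hash(current_html) == pristine_hash:
        return current_html
    return reload_from_server()


# Example: a gadget appends markup after the page was served.
served = '<p id="mwAg">Hello</p>'
pristine = content_hash(served)
modified = served.replace("Hello", "Hello <span class='gadget'>!</span>")

unchanged_result = html_for_editing(served, pristine, lambda: served)
modified_result = html_for_editing(modified, pristine, lambda: served)
```

The alternative named in the same bullet, stashing a pristine copy of the content, trades extra client memory for avoiding the round trip to the server.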



Related Objects

This task is connected to more than 200 other tasks.

Event Timeline

gpaumier moved this task from Backlog to Triaged on the Notice board. Apr 2 2015, 7:00 PM
gpaumier moved this task from Triaged to Archive on the Notice board. Apr 9 2015, 5:44 PM
Restricted Application added a subscriber: Aklapper. Aug 3 2015, 10:40 PM
GWicke renamed this task from Use Parsoid HTML for read-only views too to Use Parsoid HTML for all page views. Sep 28 2015, 8:49 PM
GWicke updated the task description.
GWicke updated the task description. Sep 28 2015, 8:54 PM
cscott added a subscriber: cscott. Sep 28 2015, 9:04 PM

FWIW, I did get some informal agreement that using Parsoid HTML for "Printable page" views was a reasonable thing to deploy on wikitech, as a first step. I may try to write a standalone extension to do this.

Qgil added a subscriber: Qgil. Oct 3 2015, 8:57 PM

Congratulations! This is one of the 52 proposals that made it through the first deadline of the Wikimedia-Developer-Summit-2016 selection process. Please pay attention to the next one: "By 6 Nov 2015, all Summit proposals must have active discussions and a Summit plan documented in the description. Proposals not reaching this critical mass can continue at their own path out of the Summit."

GWicke updated the task description. Oct 23 2015, 10:43 PM
Qgil added a comment. Oct 28 2015, 11:24 AM

With so many blocking/blocked tasks, no specific Summit plans specified in the description, and no assignee, it is difficult to evaluate whether this Summit proposal is On Track with ongoing discussion. Can someone step in as driver of the discussion and confirm the interest in getting a slot in the Summit schedule, please?

@Qgil, the reason is that this requires a wide cross-org consensus (and later work), so I'd say the ArchCom should drive this one.

@mobrovac: if you really think this should happen, you should drive it rather than delegate it to TechCom. @GWicke might volunteer to take this on, but I'm not going to speak for him.

Qgil added a comment. Nov 6 2015, 10:49 AM

Today is November 6, and this proposal is basically not on track. Unless the situation suddenly changes and/or @RobLa-WMF and the Architecture Committee really want to schedule it, it will be removed as a Wikimedia-Developer-Summit-2016 proposal.

Qgil removed a subscriber: Qgil. Dec 28 2015, 8:31 PM

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!

MaxSem removed a subscriber: MaxSem. Jan 20 2016, 6:17 PM
GWicke moved this task from later to designing on the Services board. Jul 18 2017, 9:25 PM
GWicke edited projects, added Services (designing); removed Services (later).
flame_qi moved this task from Future to Blocked / others on the RESTBase board. Jan 6 2018, 3:41 PM
flame_qi moved this task from Blocked / others to Future on the RESTBase board.
ssastry moved this task from Needs Triage to Read Views on the Parsoid board. Jan 11 2018, 9:45 PM
Nirmos added a subscriber: Nirmos. Apr 4 2018, 1:47 AM

Tagging this for the Web and Infrastructure teams.

This ultra-epic is a perfect example of a task that belongs in the category of "general MediaWiki tasks".

Reedy edited projects, added Parsoid-Read-Views; removed Parsoid. Sep 17 2018, 7:25 PM
Aklapper edited projects, added Parsoid; removed Parsoid-Read-Views. Feb 29 2020, 5:14 PM
ssastry moved this task from Missing Functionality to Tech Debt / Big changes on the Parsoid board.
Alsee added a subscriber: Alsee. Mar 8 2020, 4:08 AM
cscott renamed this task from Use Parsoid HTML for all page views to EPIC] Use Parsoid HTML for all page views. Wed, May 13, 9:32 PM
cscott renamed this task from EPIC] Use Parsoid HTML for all page views to [EPIC] Use Parsoid HTML for all page views.