Parsoid Roadmap April - June 2015 (Q4 2014/2015)
Closed, Resolved. Public.

Description

Ongoing focus areas

  1. Client support (VE, Flow, etc.)
  2. Parsoid HTML read views
  3. New directions (stable ids, incremental parse, etc.)

Top Goal
Identify and fix the most prominent remaining semantic roundtripping diffs.

Specifically, at least 99.95% of the 160K test pages should roundtrip (wikitext -> HTML -> wikitext) without semantic errors in full roundtrip testing (which translates to significantly higher accuracy with the selective serializer used in production). We want to nail down our semantic roundtripping functionality and provide a reasonable metric that indicates its accuracy. Because of wikitext markup errors and edge cases, 100% is not a realistic goal.
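
To make the target concrete, here is a minimal sketch of the arithmetic behind it. The function names are hypothetical and are not part of Parsoid's test tooling; the only inputs taken from this task are the 160K page count and the 99.95% threshold. Integer arithmetic is used so that the boundary case is not affected by floating-point rounding.

```python
# Hypothetical helper names; only the two constants come from the task description.
TOTAL_PAGES = 160_000      # "160K test pages"
TARGET_PER_10K = 9_995     # 99.95% expressed as pages per 10,000


def max_failing_pages(total: int, target_per_10k: int) -> int:
    """Largest number of pages that may show semantic errors while meeting the target."""
    return total * (10_000 - target_per_10k) // 10_000


def meets_target(passing: int, total: int, target_per_10k: int) -> bool:
    """True if passing/total is at least the target rate (compared without floats)."""
    return passing * 10_000 >= total * target_per_10k


print(max_failing_pages(TOTAL_PAGES, TARGET_PER_10K))      # 80
print(meets_target(159_920, TOTAL_PAGES, TARGET_PER_10K))  # True
print(meets_target(159_919, TOTAL_PAGES, TARGET_PER_10K))  # False
```

In other words, if the test set is exactly 160,000 pages, the goal tolerates semantic errors on at most 80 of them.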

Getting to this point not only builds confidence in Parsoid, but also enables us to advocate for other fundamental work improving wikitext.

Doing this will require categorizing semantic diffs (WIP @ https://www.mediawiki.org/wiki/Parsoid/Round-trip_testing/Diffs) and improving our testing infrastructure to eliminate false positive semantic diffs (see T95258, T94861, T89628). This will help identify the real errors that need fixing.
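
The distinction between syntactic and semantic diffs can be illustrated with a toy sketch. This is not Parsoid's actual classifier: `toy_render`, `normalize_html`, and `classify_roundtrip` are hypothetical names, and the renderer only understands bold markup. The assumption is that a diff counts as syntactic when the original and roundtripped wikitext still render to equivalent HTML after normalization, and as semantic otherwise.

```python
import re
from typing import Callable


def normalize_html(html: str) -> str:
    """Toy normalization: collapse whitespace runs so layout-only changes vanish."""
    return re.sub(r"\s+", " ", html).strip()


def classify_roundtrip(original: str, roundtripped: str,
                       render: Callable[[str], str]) -> str:
    """Classify a roundtrip result as 'clean', 'syntactic', or 'semantic'."""
    if original == roundtripped:
        return "clean"  # no diff at all
    if normalize_html(render(original)) == normalize_html(render(roundtripped)):
        return "syntactic"  # wikitext differs, rendered output is equivalent
    return "semantic"  # rendered output differs


def toy_render(wikitext: str) -> str:
    """Hypothetical stand-in renderer: only converts '''bold''' to <b>bold</b>."""
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", wikitext)


print(classify_roundtrip("'''a''' b", "'''a''' b", toy_render))   # clean
print(classify_roundtrip("'''a'''  b", "'''a''' b", toy_render))  # syntactic
print(classify_roundtrip("'''a''' b", "a b", toy_render))         # semantic
```

In a setup like this, a false positive is a diff reported as semantic only because the normalization step is too weak, which is why improving the test infrastructure helps separate real errors from noise.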

Other goals based on VE/RESTBase priorities

  1. Move inlined data-mw to its own attribute (T78676)
  2. Provide API endpoints for section editing (T94890 has initial ideas based on discussions between RESTBase, VE, and Parsoid)
  3. Support switching between HTML and wikitext editing
  4. Handle large / pathological pages on which Parsoid is currently timing out (T75412, T88915)

Event Timeline

ssastry created this task. Mar 13 2015, 5:43 PM
ssastry raised the priority of this task to Normal.
ssastry updated the task description.
ssastry added a project: Parsoid.
Restricted Application added a subscriber: Aklapper. Mar 13 2015, 5:43 PM
ssastry set Security to None.
ssastry updated the task description. Mar 13 2015, 5:49 PM
Eloquence added a subscriber: Eloquence.

(Removing from master roadmap for now -- we can add specific major deliverables as tasks per roadmap policy)

ssastry renamed this task from [draft] Parsoid Roadmap April - June 2015 (Q4 2014/2015) to Parsoid Roadmap April - June 2015 (Q4 2014/2015). Apr 13 2015, 5:33 PM
ssastry updated the task description.
ssastry updated the task description. Apr 13 2015, 5:35 PM
ssastry moved this task from Backlog to In Progress on the Parsoid board. Apr 21 2015, 4:07 AM
ssastry updated the task description. Jun 24 2015, 4:28 PM
ssastry added a comment. Edited Jun 24 2015, 4:51 PM

As of yesterday, we hit the 99.95% mark in RT-testing and consider the main objective met. Getting there involved fixing our roundtrip testing infrastructure to reduce misclassification of syntactic diffs as semantic diffs, classifying the sources of semantic diffs, fixing several bugs to eliminate some of those diffs, and editing some pages to fix wikitext errors (ones we know Parsoid will not support) that were causing semantic diffs.

As for the other goals,

  1. data-mw separation is not on VE's roadmap, and so we've postponed tackling it on the Parsoid end as well.
  2. We have done some initial work towards providing section offsets that RESTBase can use to support section editing. That said, fully supporting section editing requires additional work in multiple projects and will take shape in upcoming quarters.
  3. Nothing has been done on wikitext <-> HTML switching in VE since it didn't surface to the top of the priority list.
  4. Only very minimal investigation so far into the pathological scenarios that cause Parsoid to time out.

Other work that the team got done:

  1. Maintained a regular deployment schedule.
  2. Did a lot of code cleanup and addressed technical debt in the codebase.
  3. Started looking at Parsoid performance more generally, beginning with the PEG tokenizer. We have some promising results already. This could help with the pathological parsing scenarios in some cases.

Elitre added a subscriber: Elitre. Jun 29 2015, 12:38 PM
ssastry closed this task as Resolved. Jul 1 2015, 5:11 PM