Suppress spurious autoInsertedEnd flags from native wikitext tags
Closed, Resolved · Public

Description

For native wikitext constructs that don't have explicit closing sequences (* # : ; | || ! !! |-), there is really no reason for Parsoid to set the autoInsertedEnd flag.

Right now, Parsoid skips that flag for list constructs but emits it for table constructs. This is just a side effect of how lists and tables are built up in Parsoid.

As part of the cleanup DOM pass, we should get rid of these spurious autoInsertedEnd flags -- this will have the benefit of vastly cutting down the size of data-parsoid blobs for pages that are table-heavy.

It is important to be aware of a gotcha -- where the entire construct (like a tbody, or an implicit tr) is auto-generated, be careful not to remove the autoInsertedEnd flag.
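A minimal sketch of the rule described above, not the actual Parsoid cleanup pass. It models nodes as plain Python dicts rather than a DOM. The helper name `strip_spurious_end_flags`, the `NO_END_TAG_ELEMENTS` set, and the use of `autoInsertedStart` and `stx` to detect wholly auto-generated or HTML-syntax nodes are assumptions made for illustration.

```python
# Hypothetical, simplified model: each node is a dict with "name",
# "data_parsoid" and "children". Real Parsoid walks a DOM instead.

# Elements produced by the native constructs listed above
# (* # -> li, ; -> dt, : -> dd, |- -> tr, | || -> td, ! !! -> th).
# This mapping is an assumption for the sketch.
NO_END_TAG_ELEMENTS = {"li", "dt", "dd", "tr", "td", "th"}


def strip_spurious_end_flags(node):
    """Remove autoInsertedEnd where wikitext has no closing sequence anyway."""
    dp = node.get("data_parsoid", {})
    # Assumption: HTML-syntax tags (e.g. a literal <td>) are marked stx == "html"
    # and must keep the flag, since a missing </td> is real information there.
    is_native = dp.get("stx") != "html"
    # The gotcha: a wholly auto-generated node (tbody, implicit tr) also has
    # autoInsertedStart set, and its flags are left untouched.
    wholly_auto = dp.get("autoInsertedStart", False)
    if node["name"] in NO_END_TAG_ELEMENTS and is_native and not wholly_auto:
        dp.pop("autoInsertedEnd", None)
    for child in node.get("children", []):
        strip_spurious_end_flags(child)
    return node


# Example tree: a table whose first row is implicit and whose second row
# was opened with an explicit |- in wikitext.
cell_a = {"name": "td", "data_parsoid": {"autoInsertedEnd": True}, "children": []}
cell_b = {"name": "td", "data_parsoid": {"autoInsertedEnd": True}, "children": []}
implicit_tr = {
    "name": "tr",
    "data_parsoid": {"autoInsertedStart": True, "autoInsertedEnd": True},
    "children": [cell_a],
}
explicit_tr = {"name": "tr", "data_parsoid": {"autoInsertedEnd": True}, "children": [cell_b]}
tbody = {
    "name": "tbody",
    "data_parsoid": {"autoInsertedStart": True, "autoInsertedEnd": True},
    "children": [implicit_tr, explicit_tr],
}
table = {"name": "table", "data_parsoid": {}, "children": [tbody]}

strip_spurious_end_flags(table)
```

After the call, both cells and the explicit row no longer carry autoInsertedEnd, while the tbody and the implicit row keep both of their flags.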

Event Timeline

ssastry created this task. Feb 23 2018, 5:24 PM
Restricted Application added a subscriber: Aklapper. Feb 23 2018, 5:24 PM
ssastry triaged this task as Medium priority. Feb 23 2018, 6:14 PM
ssastry assigned this task to Sbailey. Apr 20 2018, 4:33 PM

Change 428838 had a related patch set uploaded (by Sbailey; owner: Sbailey):
[mediawiki/services/parsoid@master] Suppress spurious autoInsertedEnd data parsoid flags from native wikitext tags

https://gerrit.wikimedia.org/r/428838

ssastry updated the task description. Apr 24 2018, 10:16 PM
ssastry added a subscriber: RESTBase.

Services folks: note that once this is merged and deployed, this will lead to a serious churn of a large percentage of data-parsoid blobs since tables are common.

mobrovac edited subscribers, added: mobrovac; removed: RESTBase.

> Services folks: note that once this is merged and deployed, this will lead to a serious churn of a large percentage of data-parsoid blobs since tables are common.

As in, the data-parsoid blobs for pages containing tables will become invalid once this is deployed? That's fine as long as Parsoid can handle being provided autoInsertedEnd flags. In RESTBase, we only check the difference in HTML between different renders when deciding whether to store the new render or not.

Change 428838 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Suppress autoInsertedEnd flags where not required

https://gerrit.wikimedia.org/r/428838

Arlolra closed this task as Resolved. May 9 2018, 8:30 PM
Arlolra added a subscriber: Arlolra.

Presumably.