Deduplicate template styles in Parsoid
Open, NormalPublic

Description

T168333: CSS is duplicated on each template usage is fixed in the PHP parser using a stateless transform; Parsoid needs something equivalent.

The mechanism, as implemented in core for rMW9b2b375fcef1: ParserOutput: Add 'deduplicateStyles' post-cache transformation, is to find <style> tags with a data-mw-deduplicate attribute, and replace all but the first of each deduplication value with <link rel="mw-deduplicated-inline-style" href="mw-data:..."/>; the ellipsis there is a percent-encoded representation version of the deduplication value.

Related Objects

StatusAssignedTask
ResolvedTgr
OpenNone
Tgr created this task.Feb 13 2018, 12:22 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptFeb 13 2018, 12:22 AM
Anomie updated the task description. (Show Details)Feb 13 2018, 2:40 AM

@Tgr Could you please clarify how this task relates to TemplateStyles deployment ruwiki? Thanks!

Tgr added a comment.Feb 13 2018, 8:50 PM

That's up to the ruwiki community to decide, there never was much concern about duplication on the WMF side. Parsoid-generated HTML is only used in the Android app and in VisualEditor so this should have relatively small impact compared to T187141: Deduplicate template styles in mobileviews API.

Anomie added a subscriber: Anomie.Feb 13 2018, 9:06 PM

VE may not actually want deduplication anyway. If you delete the first transclusion of a template, the styles still need to stay if there are other transclusions. VE could handle this by either not having deduplicated content in the first place or by detecting when a <style data-mw-deduplicate> is being removed and re-duplicate the styles to the first matching <link rel="mw-deduplicated-inline-style"> marker.

ggellerman triaged this task as Low priority.Feb 27 2018, 6:43 PM

Nice to have, but not a blocker for any deployments per T133410#4055687.

Deskana moved this task from Up next to Backlog on the TemplateStyles board.Mar 16 2018, 9:51 AM
DLynch added a subscriber: DLynch.Aug 21 2018, 11:12 PM

Just to confirm, for VE it'd make our lives simpler if this doesn't happen.

Just to confirm, for VE it'd make our lives simpler if this doesn't happen.

I know, but this is one more instance where reading and editing needs don't always align up perfectly.

We could temporarily delay this, but note that this is not a viable long-term solution since Parsoid HTML will become the default output and at that time, read view performance constraints such as this will take precedence and VE code will have to adapt to that.

That said, this problem is part of a generic issue with extensions that we hope to tackle systematically. (See line 764 of https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/441151/7/lib/ext/Cite/index.js ) .... which is where Parsoid adds an extension-specific handler for dealing with subtrees pre-edit and post-edit. That way, Parsoid deals with global state (that Cite has & TemplateStyles has introduced here for perf reasons) on the server side.

Anyway, https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/452964/ is the parsoid patch to dedupe template styles, but we won't merge that till we chat with you all and see what is needed on the Parsoid end in the short term.

Tgr added a comment.Oct 15 2018, 4:35 PM

Note that deduplication should only happen after Cite did its own duplication checking of <ref> tags. In the PHP parser that caused T205803: Duplicate reference name errors in English Wikipedia caused by MediaWiki templatestyle handling? (although there the deduplication happens late so the exact cause was different).

Note that deduplication should only happen after Cite did its own duplication checking of <ref> tags. In the PHP parser that caused T205803: Duplicate reference name errors in English Wikipedia caused by MediaWiki templatestyle handling? (although there the deduplication happens late so the exact cause was different).

Thanks for the heads up! Turns out Parsoid isn't affected by that bug. Only true dupes are flagged since we don't use strip markers in Parsoid.

@Esanders and I discussed this from a VE point of view, and Ed seems to have ideas of how to deal with the deletion case when template styles are deduplicated. We'll go ahead and implement this in Parsoid.

ssastry raised the priority of this task from Low to Normal.Oct 16 2018, 6:35 PM

Change 452964 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Dedupe template style occurrences

https://gerrit.wikimedia.org/r/452964

Change 452964 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Dedupe template style occurrences

https://gerrit.wikimedia.org/r/452964