Page MenuHomePhabricator

Deduplicate template styles in Parsoid
Closed, ResolvedPublic

Description

T168333: CSS is duplicated on each template usage is fixed in the PHP parser using a stateless transform; Parsoid needs something equivalent.

The mechanism, as implemented in core for rMW9b2b375fcef1: ParserOutput: Add 'deduplicateStyles' post-cache transformation, is to find <style> tags with a data-mw-deduplicate attribute, and replace all but the first of each deduplication value with <link rel="mw-deduplicated-inline-style" href="mw-data:..."/>; the ellipsis there is a percent-encoded representation version of the deduplication value.

Related Objects

StatusSubtypeAssignedTask
DeclinedNone
ResolvedJdlrobson
DeclinedNone
DuplicateNone
ResolvedJdlrobson
DuplicateNone
DuplicateNone
DeclinedNone
ResolvedJdlrobson
DuplicateNone
DuplicateNone
Openbwang
Duplicateovasileva
ResolvedNone
OpenNone
OpenNone
ResolvedTheDJ
DeclinedNone
InvalidNone
OpenFeatureNone
InvalidNone
ResolvedTheDJ
ResolvedTheDJ
InvalidNone
ResolvedIzno
ResolvedTheDJ
OpenNone
ResolvedJdlrobson
DeclinedNone
ResolvedTgr
ResolvedTgr
Resolved ssastry

Event Timeline

@Tgr Could you please clarify how this task relates to TemplateStyles deployment ruwiki? Thanks!

That's up to the ruwiki community to decide, there never was much concern about duplication on the WMF side. Parsoid-generated HTML is only used in the Android app and in VisualEditor so this should have relatively small impact compared to T187141: Deduplicate template styles in mobileviews API.

VE may not actually want deduplication anyway. If you delete the first transclusion of a template, the styles still need to stay if there are other transclusions. VE could handle this by either not having deduplicated content in the first place or by detecting when a <style data-mw-deduplicate> is being removed and re-duplicate the styles to the first matching <link rel="mw-deduplicated-inline-style"> marker.

Nice to have, but not a blocker for any deployments per T133410#4055687.

Just to confirm, for VE it'd make our lives simpler if this doesn't happen.

Just to confirm, for VE it'd make our lives simpler if this doesn't happen.

I know, but this is one more instance where reading and editing needs don't always align up perfectly.

We could temporarily delay this, but note that this is not a viable long-term solution since Parsoid HTML will become the default output and at that time, read view performance constraints such as this will take precedence and VE code will have to adapt to that.

That said, this problem is part of a generic issue with extensions that we hope to tackle systematically. (See line 764 of https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/441151/7/lib/ext/Cite/index.js ) .... which is where Parsoid adds an extension-specific handler for dealing with subtrees pre-edit and post-edit. That way, Parsoid deals with global state (that Cite has & TemplateStyles has introduced here for perf reasons) on the server side.

Anyway, https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/452964/ is the parsoid patch to dedupe template styles, but we won't merge that till we chat with you all and see what is needed on the Parsoid end in the short term.

Note that deduplication should only happen after Cite did its own duplication checking of <ref> tags. In the PHP parser that caused T205803: Duplicate reference name errors in English Wikipedia caused by MediaWiki templatestyle handling? (although there the deduplication happens late so the exact cause was different).

Note that deduplication should only happen after Cite did its own duplication checking of <ref> tags. In the PHP parser that caused T205803: Duplicate reference name errors in English Wikipedia caused by MediaWiki templatestyle handling? (although there the deduplication happens late so the exact cause was different).

Thanks for the heads up! Turns out Parsoid isn't affected by that bug. Only true dupes are flagged since we don't use strip markers in Parsoid.

@Esanders and I discussed this from a VE point of view, and Ed seems to have ideas of how to deal with the deletion case when template styles are deduplicated. We'll go ahead and implement this in Parsoid.

ssastry raised the priority of this task from Low to Medium.Oct 16 2018, 6:35 PM

Change 452964 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Dedupe template style occurrences

https://gerrit.wikimedia.org/r/452964

Change 452964 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Dedupe template style occurrences

https://gerrit.wikimedia.org/r/452964

ssastry claimed this task.