Page MenuHomePhabricator

Decide on storage and delivery method for TemplateStyles CSS
Closed, ResolvedPublic0 Estimated Story Points

Description

The TemplateStyles extension allows complex CSS (including selectors and media queries) to be used in templates. The Wikimedia Reading Infrastructure team plans to deploy it to the Wikimedia cluster to fulfill our Q3 goal: to empower editors to create mobile-friendly templates. That requires some changes to the extension; it is not entirely clear what those changes should be, though.

When considering how that CSS will be stored, processed and delivered to the client, there are three design choices to be made:

  1. Where should the CSS for a given template be stored?
  2. When rendering a page which includes some template (possibly indirectly), how should the parser find out that the CSS for that template needs to be included?
  3. How should the CSS be delivered to the client?

The first question has big UX and workflow impact for template maintainers; the second affects how the extension can integrate with various things that use the parser(s); the third has important performance implications. They are not fully independent; most combinations are possible but some are more natural than others. At least for the first choice, changing our mind later will be pretty hard.

The goal of this RfC is to find out what the hard constraints are (do user expectations, performance requirements, Parsoid limitations etc. force our hand in any of the questions?) and then agree on what seems like the most promising combination of choices within those limitations. (There is a separate consultation to get input from editors.)


Some initial thoughts on what the choices are and what benefits/drawbacks they have:

  1. Where should the CSS be stored?
    1. On the template page, via a parser tag (this is what the current implementation of the extension does - you just put the CSS code together with the template code, inside a <templatestyles>...</templatestyles> block).
      • Pro: atomic changes
      • Mixed: flexible - template authors can chose to include CSS from a subpage, or use template parameters in the CSS code, or even generate dynamic CSS via #tag or a Lua module. That enables all kinds of interesting use cases; OTOH in the past we have sometimes regretted making templates too flexible, which resulted in problems that should have been handled in software being handled in template language (e.g. we probably don't want users to reimplement LESS in Lua).
      • Con: template wikitext becomes even more cluttered than it already is
      • Con: the wikitext editor is unlikely to offer CSS syntax highlighting and other feature-specific support.
    2. Use a separate page (e.g. a /styles.css subpage); migrate to a dedicated revision slot when multi-content revisions (MCR) become available.
      • Pro: controlling who can edit the template and who can edit the styles can be detached (especially important if we plan to extend this mechanism to template-specific JS eventually) - can be protected separately, or we can require a specific user right for one but not the other
      • Pro: more granular history (possible to subscribe for CSS changes only)
      • Pro: easy to provide a better UI (e.g. use CodeEditor to edit the CSS, or highlight syntax errors).
      • Pro: easier to search templates by style (e.g. to find templates with non-mobile-friendly styles).
      • Con: until MCR gets implemented, atomic changes to both template code and CSS will not be possible.
      • Con: until MCR gets implemented, won't work so well with watchlists and other change tracking mechanisms (e.g. won't show up on RecentChangesLinked), or requires extra work to make sure it does.
      • Con: initially for protected templates the (not yet created) subpage unprotected and an easy vandalism target; might need extra code to inherit protection status of the template page
      • Con: might need a way to turn it on for templates not in the Template namespace
    3. Use a <templatestyles>...</templatestyles> parser tag, like in the A option, but save the contents of the tag in the database when the template is saved (initially as a custom table; with MCR probably as a persistent virtual slot).
      • Pro/Con: largely the same as the A option, except for the flexibility.
      • Con: problematic with extensions which require an older revision or an unsaved edited version of the template to be used (FlaggedRevs, TemplateSandbox)
  2. How should the parser fetch the CSS styles that need to be included when rendering a page?
    1. Hook into the template transclusion mechanism, let the parser pull in the source code and only process it once all templates have been transcluded.
      • This requires 1A; with 1B and 1C, there is no straightforward way to do it (in the PHP parser it could be done via ParserFetchTemplate hook but there doesn't seem to be any equivalent in Parsoid)
      • Pro: Takes almost no effort to implement - it would just happen naturally.
      • Pro: Since transclusion is understood by a lot of tools, all kinds of things would also just work - Parsoid (since it relies on the PHP parser to process parser tags), TemplateSandbox, maybe scary transclusion.
      • Con: all included CSS would probably need to be parsed/minified again every time a page is edited (at least until the parser starts supporting partial invalidation for DOM subtree replacements), or cached manually
      • Con: when rendering and article, not really possible to tell where a piece of CSS came from. (Although for debugging it could just be included as a CSS or HTML comment.)
    2. Hook into the end of parsing (e.g. OutputPageParserOutput), get list of included templates, get the corresponding CSS.
      • Pro: probably easier to support debugging/bundling/other clever behavior as we know where each piece of CSS came from.
      • Con: hard to make it interact properly with ParserFetchTemplate/BeforeParserFetchTemplateAndtitle extensions (incl. FlaggedRevs) - again, MCR would probably help
      • Con: even if 1A is chosen in the previous question, the flexibility advantage would be lost.
      • Con: Parsoid would probably require its own plugin. Making it work with TemplateSandbox would probably require extra code.
    3. Require the location of the CSS styles to be included as a parameter to the parser tag (this would mean an empty parser tag would be required even if we go with 1B, i.e. something like <templatestyles src="Template:Foo/styles.css" /> would have to be appended to the template), rely on transclusion to find out which CSS pages can be included.
      • This makes no sense for 1A (just use 2A instead).
      • Pro: relies on transclusion so a lot of things would just work (much like with 2A)
      • Pro: probably easier to support debugging/bundling/other clever behavior as we know where each piece of CSS came from.
      • Pro: with 1B, could play nicely with ParserFetchTemplate/BeforeParserFetchTemplateAndtitle
      • Con: a bit more cumbersome/verbose wikisyntax (although the pre-save transform could be used to automatically convert <templatestyles /> into <templatestyles src="[pagename]/styles.css" />)
  3. How should the CSS be delivered to the client?
    1. A <style> tag in the HTML page, within the DOM subtree of the template.
      • Pro: no extra effort to add/remove from the page when using VisualEditor or live preview
      • Pro: CSS for templates lower in the body do not block rendering of the first paragraph
      • Con: no deduplication; some templates (flags etc.) might be used hundreds of times on the same page (not a bandwidth concern due to compression but might affect browser memory usage / rendering speed?)
      • Con: FOUC issues if the template is at the end of the page and the styles affect content outside it.
      • Con: if templates have conflicting styles, the end result might depend on the order in which the templates are used in the page.
      • Con: could cause extra repaints/reflows?
    2. <style> tags in the document head
      • Pro: deduplicated.
      • Pro: can be ordered (although supporting that in various clients would require extra work)
      • Mixed: CSS loaded before content (no FOUC, but blocks rendering)
      • Con: VisualEditor and other applications that do "piecewise composing" of previews will need some sort of API to tell which stylesheets need to be added/removed (and possibly deal with ordering them).
      • Con: applications which fetch page content and then discard parts of it (e.g. Mobile Content Service stripping out collapsed sections and certain templates) have no way of discarding unneeded styles
    3. External stylesheets loaded via ResourceLoader (see T155813#3047635 for more details)
      • Pro: deduplicated.
      • Pro: can be ordered (although supporting that in various clients would require extra work)
      • Pro: can be properly cached in the client.
      • Pro: if we want to allow templates to depend on RL modules to share code, that would be easier to do this way.
      • Mixed: CSS loaded before content (no FOUC, but blocks rendering)
      • Con: on non-HTTP/2-aware clients (do those still exist?) less performant than inlining
      • Con: some browsers (Chrome?) are bad at prioritizing HTTP/2 multiplexing so end up downloading CSS slower this way
      • Con: if RL bundles modules, cache fragmentation; if it does not, compression gets less effective
      • Con: VisualEditor and other applications that do "piecewise composing" of previews will need some sort of API to tell which stylesheets need to be added/removed (and possibly deal with ordering them).
      • Con: applications which fetch page content and then discard parts of it (e.g. Mobile Content Service stripping out collapsed sections and certain templates) have no way of discarding unneeded styles
    4. Use ResourceLoader modules to tell OutputPage what styles are needed, but make it inline the styles in the head instead of linking to them. This would have the frontend characteristics of 3B and the backend characteristics of 3C.
    5. Like 3A, but somehow remove the duplicate style blocks.
      • The logic for this should probably live in core, since the reasons Performance likes it apply to several other extensions too. There's a straw proposal for that in T155813#3095844.
      • Any tool manipulating the HTML would have to be aware of 3E so it could avoid removing styles when removing the HTML that contains the <style> tag.

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

We want to provide the ability to specify stylesheets that get added to the page when a template is transcluded (in large part so @media can be used), and these stylesheets must be sanitized to prevent the styles leaking outside of the content area, to prevent use of dangerous features such as the expression() function, to prevent tracking via external url()s, etc.

That's… a major digression from "only affecting the relevant template", which was the original point.

"only affecting the relevant template" does not seem to ever have been the original point, at least if the original RFC as it was when it was approved by ArchCom is any indication of the original point.

In the "Challenges" section, it was specifically considering the use of <style scoped> (before that was removed from the draft HTML standard) but leaving it up to the on-wiki template authors to add any containing elements that would be necessary to truly scope the styles to only the output of the template. I note that sort of scoping can as easily be applied by using appropriate classes on the containing elements and using them in the selectors.

Even that much isn't part of the actual "Proposal" section that ArchCom approved. In fact it seems directly contradicted by "Where multiple versions of the same template are added on the same page ensure only one instance of the style tag is included" which would be impossible to do when using <style scoped> since <style scoped> needs to be repeated in every different containing element. "Templates containing simply style tags can be transcluded and shared between commonly used templates" is also directly contrary to enforcing style scoping to the template defining the styles, since Template:Foo can't transclude styles from Template:Bar if those styles are scoped to the output of Template:Bar.

Scoping to the content div is almost valueless; some minor security benefits

True, it isn't a major security benefit, although even a minor security benefit is probably worthwhile if there's not a reason not to do it.

at a major content usability loss.

What loss would that be, exactly? It can't be vandalism since people already do that with the existing inline style support, and I can't think of what else you might be referring to.

I'm not sure I can endorse deployment if this is going to make the template problem worse.

What "template problem", exactly, are you referring to? The only things I remember reading about that are described as a "template problem" don't seem to be at all applicable to your objection here.

When was that decision made?

2014-10-15T21:50:58Z, it looks like. Although to be fair it doesn't seem like the extent of scoping (content versus template) was given a whole lot of discussion there.

BTW. 3A <style> tag in the HTML page, within the DOM subtree of the template. states. Con: no deduplication; some templates (flags etc.) might be used hundreds of times on the same page (not a bandwidth concern due to compression but might affect browser memory usage / rendering speed?) Is this a fact ? Why can't we add a fragment only once (first time it is added to the page). We do similar things for ExtensionData and several RL parameters right ? Just checking.

It introduces global state, which Parsoid and VE really hate. With RL modules (3C), every call to the extension tag usually re-specifies the module (and Parsoid/VE could track that easily enough) and they get deduplicated automatically. I suppose Parsoid could do 3B in a similar manner, although it'd be a lot more data to collect and combine.

I don't think Parsoid deals with ExtensionData stuff at all, chances are each call to the extension tag winds up thinking it's the only one on an otherwise-empty page.

I see that there are specific implementation questions and proposals (with pros/cons) being made, but I don't see a listing of what properties you wish for the solutions to have. i.e. instead of having some properties emerge as side-effects of specific solutions, it would be better to be explicit about what the design goals are. Ex: do you want wikitext clutter or not, do you want CSS highlighting or not, what kind of performance do you want, what kind of edit requirements are being considered, do you want editors to do crazy things or not, do you want to provide LESS support or not, and so on. Right now, it seems we can pick a medley of options for the different implement options and we get some subset of properties by default.

Can I request an explicit articulation of what the design goals are?

Identifying (or prioritizing) secondary design goals was pretty much the goal of this RFC. (The primary design goal was to enable inclusion of CSS into articles in a safe manner, in a way that's compatible with the template system; but the choices presented here don't affect that.) I was hoping that feedback here in or the editor consultation can cut down the number of options before we move on more subjective grounds, but looks like that's not happening, so here is my attempt at design goals:

  • Do not negatively impact performance. The (not yet fully verified) Reading metrics review for 2016 suggested that performance improvements were the only thing visibly moving the needle on reader attrition; we don't want to move it back. (As far as I can see, this would mean frontend performance; parsing happens rarely and is already fairly slow so the choices for questions 1-2 probably won't matter much. I am unsure about the exact effect of 3A/B/C though; I was hoping the Performance team can weigh in on this.)
  • Make CSS editing a good experience, so that template maintainers actually take advantage of this feature. I'd argue that this means a dedicated editor and the ability to pinpoint errors, which would require CSS to be stored on separate pages (1B). (It would also be nice if TemplateSandbox kept working, but I don't think people rely on it so heavily as to make that critical.)
  • Make the system flexible enough that editors are willing to replace most existing templates. IMO this means some sort of template parameter passing - there are way too many complex templates that take arguments for color, width, float direction etc. I guess colors could be kept as inline styles (they aren't really device-dependent), and the rest can be handled by classnames (and specifying width is a bad idea in the first place)? But it would be nice if there was an explicit way to pass settings to the CSS code. Since 1A/C has been excluded in the previous bullet point, that would imply 2C in which case the parser tag could be used with {{#tag}} to take parameters. How would it work with the CSS parsing though? (We want to parse when the page is saved, to have sane error handling.) So I guess this goal is infeasible no matter what.
  • Make the system modular, for the same reason (ie. make it sure that a stylesheet can reference another stylesheet). With 1A/C excluded, this would mean using LESS (or rolling our own syntax, which seems like a bad idea). Or we could just leave it to be handled at the template level (ie. make sure when you include A.CSS you also include B.CSS, and in the right order) - but that would kind of suck. LESS is a bit of a scope creep; OTOH it would not be needed for the MVP, just make sure nothing prevents adding it later.

if scoped styles are really required, one possibility would be to assign auto-generated (based on template name) class names that will be part of the HTML and would be automatically added to CSS style rules that the template provides.

My understanding was that this would only be doable in Parsoid, ie. would block deployment until the Great Parser Unification. Having seen how well blocking projects until the Great User Unification worked out, I'd be very skeptical of that idea :-)

Scoping to the content div is almost valueless; some minor security benefits at a major content usability loss. I'm not sure I can endorse deployment if this is going to make the template problem worse. When was that decision made?

Scoping to the content div is better than not scoping at all, and easy to do. Not scoping to the template was always the default; scoped styles was the pie in the sky option for the glorious future when all browsers start supporting it. That didn't work out.

What's the concern about not scoping styles to the template? That editors would unintentionally cause CSS conflicts or that they would intentionally abuse the possibilities in a way that would cause chaos? I'd argue that the first is not a big problem (it's very straightforward to limit the styles to your template, just add a class around it), and it's not obvious to me why the second one would be bad. There are some use cases where it would make sense (if not super compelling): styling full pages which are expected to be used with section=new (this is now commonly done with styled unclosed divs for user pages; that will be killed when Tidy gets replaced), and modular page-level CSS (move main page styles etc. from Common.css; you can enclose the whole page in a div instead and use scoped styling, but it seems a but pointless).

Also, if we go with automatic scoping (and find a way to do it in the PHP parser / wait until it goes away), that would make it impossible to deduplicate CSS. You can use something like .template1 .template2 .template3 { ...rules... } but dynamically updating that in VE as templates get added/removed seems nightmarish. So it would be good to hear confirmation first from Performance that duplication is not a problem.

For example, it seems scoping of styles to the template content might be a desirable goal because of the higher level goal of composition of documents (for performance reasons, or editability reasons in VE and maybe other tools even?).

There are two slightly different concepts being lumped together here.

  • Replacing template transclusions without reparsing the whole page, as VE editing does and has long been a suggestion for Parsoid performance improvement, requires that the styles be associated with the rest of the template output but not that those styles don't affect anything else in the page. This seems to be a desirable property. 3A and 3C can achieve this fairly easily, while with 3B it would be difficult (but 2B+3B could do it by regenerating the whole CSS from the list of templates).
  • Scoping the effects of the styles themselves to the output of the template that includes them. I'm not at all sure that this is a desirable property, particularly if we choose 2B since it would seem to preclude sharing of styles between templates which very much is a desirable property. Since templates do not currently return self-contained DOM nodes or document fragments, this sort of scoping would likely also be blocked on waiting for that to actually happen.

Valid observation. So, the two potential secondary design goals (to use the term that @Tgr used) are:

  • Do not introduce page-global state that blocks composing a page from individual fragments -- affects VE edits and future incremental parsing approaches
  • Prevent a template's styles from affecting anything else on the page

I think you are saying the first one is desirable If we all agree about that, we should consider removing 3B from the list of available implementation choices. If we are hesitant to remove it, then that means we aren't agreed on that goal and we should discuss that (or we should figure out how we can support VE and Parsoid goals while retaining it).

As for the second one above, you are saying that it is not necessarily a reasonable design goal. You seem to be saying that we shouldn't complicate the architecture and implementation to prevent template authors from doing undesirable things. So, template authors who want to strictly scope CSS rules to just the template content could write stylesheets that qualify it based on element ids. There could also be linting tools that flag bad usage that could leak styles outside the template. Since I am not a UI designer or someone who develops user-facing UI products, I will say I don't have a strong opinion about it. But, I can buy the reasoning. That said, one thing that would make the case stronger is to add language to the documentation that indicates that the current implementation will not preclude future design / implementation changes that might enforce stronger scoping, i.e. template authors should not develop CSS styles expecting that it would leak out and apply elsewhere. That is, we treat CSS leaking as undefined behavior which might change in the future. This will encourage good styling practices as well as give us the flexibility of not having to preserve potentially broken behavior in the future.

if scoped styles are really required, one possibility would be to assign auto-generated (based on template name) class names that will be part of the HTML and would be automatically added to CSS style rules that the template provides.

My understanding was that this would only be doable in Parsoid, ie. would block deployment until the Great Parser Unification. Having seen how well blocking projects until the Great User Unification worked out, I'd be very skeptical of that idea :-)

I agree with you about not blocking on parser unification. However, I would also add that, as a general principle, we should also not introduce features that would make that job even more harder than it already is. That said, more specifically, based on my remarks earlier in this comment in response to @Anomie, it doesn't seem that there is any such thing you are considering right now that would introduce global state on the page. So seems like you should get rid of option 3B (unless there is a way to support VE and Parsoid centric goals given that).

BTW. 3A <style> tag in the HTML page, within the DOM subtree of the template. states. Con: no deduplication; some templates (flags etc.) might be used hundreds of times on the same page (not a bandwidth concern due to compression but might affect browser memory usage / rendering speed?) Is this a fact ? Why can't we add a fragment only once (first time it is added to the page). We do similar things for ExtensionData and several RL parameters right ? Just checking.

It introduces global state, which Parsoid and VE really hate. With RL modules (3C), every call to the extension tag usually re-specifies the module (and Parsoid/VE could track that easily enough) and they get deduplicated automatically. I suppose Parsoid could do 3B in a similar manner, although it'd be a lot more data to collect and combine.

I don't think Parsoid deals with ExtensionData stuff at all, chances are each call to the extension tag winds up thinking it's the only one on an otherwise-empty page.

Yes, the only global state Parsoid currently supports is that required by Cite and if options surface that can let us remove that, we will pursue that option. Global state gets in the way of updating a page based on local changes. This most directly affects VE right now, but more broadly also gets in the way of page composition from individual fragments. This is not a problem limited to Parsoid and VE, but also affects wikitext editing. It is just that editors accept that for section edits, previews might be broken (because of unclosed tags, missing citations, etc.) and that bots might have to clean up citations after the fact (if a section edit deletes a named <ref> tag). This wikitext-edit problem only gets worse if you start considering edits at a finer granularity below sections. So, my contention is that eliminating global state is good in general, not just for VE and Parsoid. Global state and page-composition / fine-grained-edit-support are at odds with each other.

  • Make the system modular, for the same reason (ie. make it sure that a stylesheet can reference another stylesheet). With 1A/C excluded, this would mean using LESS (or rolling our own syntax, which seems like a bad idea). Or we could just leave it to be handled at the template level (ie. make sure when you include A.CSS you also include B.CSS, and in the right order) - but that would kind of suck.

CSS has @import. The trick would be that we'd need to provide an endpoint for fetching sanitized CSS and then whitelist only that endpoint as the argument to the at-rule. Or we could make our own @-mw-import that TemplateStyles would transform somehow. But I think I'd rather leave either of these for a second iteration.

If we all agree about [page-global state being undesirable], we should consider removing 3B from the list of available implementation choices. If we are hesitant to remove it, then that means we aren't agreed on that goal and we should discuss that (or we should figure out how we can support VE and Parsoid goals while retaining it).

I'm hesitant to decide on duplicated vs. deduplicated CSS until we are confident we understand the performance implications (and would really like to hear from the Performance team on that). I don't understand what's the difference between 3B / 3C in terms of globalness - it seems trivial to transform one into the other. Deduplication seems inherently global to me.

You seem to be saying that we shouldn't complicate the architecture and implementation to prevent template authors from doing undesirable things.

Depends on how much we need to complicate it. I think it's desirable (there are a few reasonable use cases for unscoped CSS but they are not important and probably can be achieved differently) but it seems very hard to achieve and not desirable enough to offset that.

Yes, the only global state Parsoid currently supports is that required by Cite and if options surface that can let us remove that, we will pursue that option.

Cite seems to require a different level of global state though. Content-dependent stylesheets / RL modules require the ability to aggregate (and maybe deduplicate) content metadata. Cite requires one piece of content to depend on other pieces in an application-specific ways. The first seems a lot easier to handle, and very unlikely to become completely unnecessary (the same kind of globalness is needed for extension-specific RL modules, LinksUpdate dependency aggregation, parser limits, outputting error messages etc).

I don't understand what's the difference between 3B / 3C in terms of globalness - it seems trivial to transform one into the other. Deduplication seems inherently global to me.

IMO the difference is data versus metadata.

For 3C, every use of <templatestyles> results in a ResourceLoader module being added to the output. That's a short string that's basically metadata. The existing PHP code handles deduplicating it for us, and Parsoid and VE should already be doing that too (T156414 is because they aren't).

For 3B, every use of <templatestyles> results in some internal metadata, but what any external client sees is a potentially-large blob of data being added to the output. Parsoid/VE would have to take the CSS data, metadataize it somehow, deduplicate it, and then at some point turn it back into data in a way that VE can still associate that data with the source template(s) correctly while editing.

Whether the CSS dependencies of the page can be represented as a list of RL modules is a different question from whether they are included by adding <link> tags pointing to an API which converts a list of module names to concatenated+minified CSS, or by inlining said CSS.

Passing big CSS blobs around is (while possible) awkward and does not seem to have much advantage over 3A, yes. But possing module names around and then RL looking them up and inlining them into the document head seems like a reasonable approach to me if it's found that inline CSS has better performance but deduplicating is still important.

One security aspect that we didn't consider (probably not related to the questions of this RfC, but mentioning just in case): MediaWiki disables user-defined JS/CSS on some pages (e.g. login). It would make sense to extend that to CSS in templates in system messages used on those pages. Is that something that's easier/harder with some of the parsing strategies?

One security aspect that we didn't consider (probably not related to the questions of this RfC, but mentioning just in case): MediaWiki disables user-defined JS/CSS on some pages (e.g. login). It would make sense to extend that to CSS in templates in system messages used on those pages. Is that something that's easier/harder with some of the parsing strategies?

For 3C RL should handle it, I believe. For 3A and 3B, we'd have to pass that flag into the parser. OTOH, I wonder how much it matters since we'd be sanitizing the CSS anyway.

I discussed scoping CSS to templates with @Jdforrester-WMF. As I understand it, the main reason it's attractive is that it would prevent problems with piecewise composition / partial presentation of pages. E.g. visual section editing, mobile lazy-loading of sections. Also it wold prevent FOUC problems with the 3A strategy. And it's a lot easier to scope and then change our minds later than the other way around (that could mean dealing with the fallout of template editors finding some creative uses for page-global CSS.) IMO if it's not hard to scope, it's worth doing.

Wikimedia wikis are in the process of switching to a PHP-based Tidy-replacement (a PHP port of Html5Depurate, so there is a chance that could be used to add unique IDs or classes to templates, and communicate those IDs to TemplateStyles (or get them from TemplateStyles). I'm hoping @ssastry or @tstarling can tell whether that's plausible.
(The drawback is that it would only work with same tidy config WMF plans to use, which likely requires HHVM to be fast enough, ie. only usable on large wikis. IMO that's not a problem - the extension would still work on other wikis, only reusing Wikipedia templates would become harder, and that is already a hopeless endeavor on small wikis.)

The other option for scoping is using Parsoid (which has a concept of the DOM, unlike the PHP parser), and somehow dealing with the extension not working (or not working the same way) on the PHP parser. IMO that's not realistic. Maybe it will become realistic in the future, if Wikimedia become able to use the same parser everywhere, but that's not something we want to wait for.

OTOH, I wonder how much it matters since we'd be sanitizing the CSS anyway.

I'm mainly thinking of position: absolute, which is hard to do without for template styling but could be abused on special pages for clickjacking.

Wikimedia wikis are in the process of switching to a PHP-based Tidy-replacement (a PHP port of Html5Depurate, so there is a chance that could be used to add unique IDs or classes to templates, and communicate those IDs to TemplateStyles (or get them from TemplateStyles). I'm hoping @ssastry or @tstarling can tell whether that's plausible.

Unless I'm entirely mistaken about what that tool does, by the time it runs all templates have been transcluded and turned into HTML long before.

only reusing Wikipedia templates would become harder, and that is already a hopeless endeavor on small wikis.)

Not that hopeless. Often you just need ParserFunctions and Scribunto, and sometimes you need to grab some styling from Common.css too. TemplateStyles should improve that situation. Tidy is sometimes an issue, but AFAIK not that often.

Unless I'm entirely mistaken about what that tool does, by the time it runs all templates have been transcluded and turned into HTML long before.

I was thinking of using it to find the next sibling of the style block and scope it to that. (E.g. TemplateStyles leaves something like <style data-mw-prefix="sk5d25kd">.sk5d25kd [original selector] {...}</style> and then the depurator adds that class to the next DOM node.) Granted, it would only work with balanced templates (those which are enclosed in a single HTML tag), but most templates are like that or could be converted without too much pain.

Not that hopeless. Often you just need ParserFunctions and Scribunto, and sometimes you need to grab some styling from Common.css too. TemplateStyles should improve that situation. Tidy is sometimes an issue, but AFAIK not that often.

(In my experience Wikipedia templates are pretty useless without Tidy. Which is fine for regular Tidy, any wiki can run that. OTOH most templates are just too complex to be able to compile on non-dedicated machines.)

I discussed scoping CSS to templates with @Jdforrester-WMF. As I understand it, the main reason it's attractive is that it would prevent problems with piecewise composition / partial presentation of pages. E.g. visual section editing, mobile lazy-loading of sections. Also it wold prevent FOUC problems with the 3A strategy. And it's a lot easier to scope and then change our minds later than the other way around (that could mean dealing with the fallout of template editors finding some creative uses for page-global CSS.) IMO if it's not hard to scope, it's worth doing.

[ Hello from a train in South India! ]

Given that we have an ordering an dependency problem, here is one way to break that dependency. I don't think we need to guarantee perfect scoping right now in the PHP parser implementation. As indicated in T155813#3021202, all we need to do is declare what the specified behavior is thereby making everything else undefined and subject to change when we are able to turn on the stricter constraints of scoped CSS. However, by itself, such declarations won't help editors. They need tools that help them detect when they are relying on such undefined behavior. Parsoid could be such a feedback tool. i.e. Parsoid's rendering of template styles can be declared to be the authoritative rendering. So, if a page looks broken in Parsoid because of scoped CSS enforcement, then editors will have to fix their styles even if looks "fine" in the PHP parser rendering.

I don't know how useful it would be to declare as "authoritative" a rendering that isn't used by or even easily available to the majority of users, including me when I'm trying to develop the thing, and that is known to be non-authoritative in other cases.

I understand where you are coming from. That was a proposal to figure out a way forward without having to solve all problems at once. Open to other possibilities.

Some additional specific remarks below.

I don't know how useful it would be to declare as "authoritative" a rendering that isn't used by or even easily available to the majority of users,

It is available via RESTBase at a well-established URL so it isn't true that it isn't easily available. But, if you are talking about local development, yes, it does require dev installs of Parsoid.

including me when I'm trying to develop the thing, and that is known to be non-authoritative in other cases.

The differences between Parsoid and PHP parser rendering are not that significant -- one of the biggest sources of diffs were because of HTML5 and HTML4. So, once Tidy is replaced, Parsoid and PHP parser rendering differences are reduced further. So, Parsoid rendering not being non-authoritative is not a significant blocker. If we decide to go with this approach, we could even prioritize identifying the biggest sources of differences via visual diff testing and addressing them.

But, as I said, this is a proposal to figure out a way forward without having to solve all problems upfront. But, the broader idea is to figure out a way to leverage Parsoid's DOM semantics to prevent templates / editors relying on leaky CSS behavior in the time frame when Parsoid's HTML is not used for read views.

<thinking-aloud>The ParserMigration extension that Tim developed for aiding editors during Tidy replacement can be used to compare renderings from two different sources. So, that extension could be leveraged for comparing renderings between PHP parser and Parsoid for previewing edits. So, editors don't need to anything special beyond enabling it. And, if we are talking about template editors, it is a small subset of all editors.</thinking-aloud>

So, if a page looks broken in Parsoid because of scoped CSS enforcement, then editors will have to fix their styles even if looks "fine" in the PHP parser rendering.

IMO making it work differently in the two parsers would be a bad idea and would just confuse everyone. Instead of having to think about their templates looking good on both mobile and desktop (and tablets, and desktop using the mobile site - so lot's of complexity in that alone), they would now have to test with desktop+PHP, desktop+Parsoid (Flow, VE), mobile+PHP, mobile+Parsoid. It's hard to even keep track of what is used when.

We could just tell very sternly to not use unscoped CSS and then give tools to detect it (shouldn't be that hard with a gadget using querySelector that template maintainers could turn on).

So, if a page looks broken in Parsoid because of scoped CSS enforcement, then editors will have to fix their styles even if looks "fine" in the PHP parser rendering.

IMO making it work differently in the two parsers would be a bad idea and would just confuse everyone. Instead of having to think about their templates looking good on both mobile and desktop (and tablets, and desktop using the mobile site - so lot's of complexity in that alone), they would now have to test with desktop+PHP, desktop+Parsoid (Flow, VE), mobile+PHP, mobile+Parsoid. It's hard to even keep track of what is used when.

We could just tell very sternly to not use unscoped CSS and then give tools to detect it (shouldn't be that hard with a gadget using querySelector that template maintainers could turn on).

Works for me. The intent of using Parsoid was to figure out a way of ensuring that scoped CSS is used, but any other tool to do the same should do the trick as well.

At this point, I think actually forcing the scoping of styles to the template including them simply isn't possible. As much as some people might wish Parsoid's HTML were used for everything, that's not the reality.

If we want to keep the option open for Editing to add restrictive scoping in the future when we have a DOM-based parser in core, by documenting "Yes, you can do this, but don't because we reserve the right to break it in the future. You have been warned!" and giving people tools to determine that styles are leaking, I don't have a problem with that as long as it still allows the same stylesheet to be used by multiple templates (e.g. option 1B+2C, possibly 1A+2A by (ab)using {{#tag:}}).

Although I've yet to see really compelling arguments in favor of doing it at all, most of the people arguing in favor seem to have been assuming it as a given. The actual arguments presented so far seem to be:

  1. It may help avoid FOUC with 3A.
    • But 3A still has the duplication problem. I'd rather pick a solution that doesn't have that issue.
  2. It makes section editing previews more like the actual rendering.
    • Although we've had a similar problem with named references ever since those were added, and a similar problem whenever multiple sections are wrapped in a container (e.g. enwiki's various discussion-closing templates), and a similar problem with any other extension that conditionally adds RL modules (e.g. Babel).
  3. It means mobile's collapsed sections could also avoid loading the styles required inside a collapsed section until that section is actually opened.
    • Although that seems like it would have the same "multiple sections are wrapped in a container" and "any other extension adding RL modules" issues as #2.
    • I also note it could be handled by having the service that does the chopping of the article into pieces analyze the styles too, which would have additional benefits since it could also handle unnecessary style rules from MediaWiki:Common.css and the like.

In any event, I would recommend against a situation where all templates with styles register a module in ResourceLoader as this would significantly increase the startup manifest which would negatively affect performance and imho does not scale.

Obviously not, that'd probably be thousands of modules! If we were to go the 1B+3C route, my thinking is to somehow register a handler for anything with the prefix "ext.TemplateStyles.". A normal parse would output a module of "ext.TemplateStyles.$revid" which the handler would be able to fetch by $revid, and something like a preview would probably wind up doing the one stylesheet inline like we do now for previewing user JS and CSS edits.

I explained this plan a bit better in an email to Gergő, so I'm copying the explanation here too:

ResourceLoader has two parts, the PHP and the in-page JS, each with a manifest of known modules. For the PHP, currently every individual module gets added in the big array in resources/Resources.php, or the equivalent through other mechanisms. For the JS, the PHP generates the JS-manifest from the PHP-manifest and serves it like https://en.wikipedia.org/w/load.php?debug=true&lang=en&modules=startup&only=scripts&skin=monobook (see the mw.loader.register call).

My plan is to extend these manifests so RL is able to know that any module with a prefix should use a copy of the same entry in those two places. So instead of having to list out 1e6 copies of exactly the same thing in the manifest for every possible stylesheet, RL would be able to generate the needed manifest entry on demand.

Possibly something like this, Resources.php contains

'ext.templatestyles.*' => [ 'class' => 'TemplateStylesResourceLoaderModule' ]

When RL gets asked to load 'ext.templatestyles.r12345', it'd look that up first and not find it. Then it'd chop off the last piece (".r12345") and look up 'ext.templatestyles.*'. It does find that, so it uses it just as if Resources.php had contained 'ext.templatestyles.r12345' with the same definition-array. TemplateStylesResourceLoaderModule would call the getName() method it inherits from ResourceLoaderModule and serve up the styles from revision 12345 (or serve an error response in whatever manner other RL modules do if their content is missing).

Nothing else would need to know anything about this change. From the perspective of OutputPage generating its <link> tags or JS calling mw.loader.load(), it just continues to hand module names to RL and RL handles the details.

At this point, I think actually forcing the scoping of styles to the template including them simply isn't possible.

I have come to the same conclusion. As I understand it, the only potential implementation path currently (pending the Great Parser Unification) were to use the Tidy replacement (MediaWiki\Tidy\Balancer) for the PHP parser, and have Parsoid rely on its (already existing) template-parent-identification powers. But that would mean two parallel implementations which need to be functionally equivalent but are not even conceptually similar (Parsoid operates on some kind of MediaWiki syntax tree and turns it into a HTML DOM, Balancer takes a HTML string which has already lost all information about wikitext semantics), which is a straight road to maintenance hell. Also, Balancer has a narrow purpose (transform an arbitrary HTML document into a valid one), which is generally how we want our components, and is a candidate for librarization, so trying to force this unrelated functionality into it would be very awkward.

So a stern warning that we will break your stuff if it relies on lack of scoping (and maybe some tool with which editors can detect scope leaks) is the best we can do ATM.

I'm going to summarize the Performance Team's position on this RFC, which was brought up in a private email thread.

Given the uncertainty about the content size (we really won't know until this feature is in the wild for a while), we're going to assume that it will be small when compared to the gzipped size of an article and that having it alongside the content is more efficient than the various scenarios that entail extra requests. For that reason, we recommend that template CSS is included inline, with the template. This assumes the encouraged best practice of template CSS striving to only affect template contents.

Regarding "keeping track of state", which I'm rather going to call what it really is: not re-adding CSS to the page that's already on it. This is an optimization that will obviously save CPU, an important factor on low-powered device, to ensure better scroll smoothness, etc. This is an optimisation that the various techniques to compose pages (existing and upcoming) should implement at some point. But if some cases like saving a section edit re-adds some template CSS that was already on the page, it's really not a big deal. Common sense should apply to deciding which situations warrant extra engineering work to make that optimisation happen. @Krinkle believes that Mediawiki can fairly easily do that right now, whereby only the first occurrence of a template on a page would include the template CSS.

As for enforcing scoping constraints to the CSS so that it can only affect template contents, that would be nice to have, but in my opinion should probably be treated as a possible follow-up and not a hard dependency. Encouraging best practices and monitoring repaints should be sufficient at first. And monitoring will be a must-have, since articles could be quite uneven in the amount of template CSS they carry.

And monitoring will be a must-have, since articles could be quite uneven in the amount of template CSS they carry.

For the record, I have absolutely no idea how to even begin doing that monitoring. If there's some code that would need to be added by TemplateStyles so you can do this monitoring, please supply it.

Given the uncertainty about the content size (we really won't know until this feature is in the wild for a while), we're going to assume that it will be small when compared to the gzipped size of an article and that having it alongside the content is more efficient than the various scenarios that entail extra requests. For that reason, we recommend that template CSS is included inline, with the template. This assumes the encouraged best practice of template CSS striving to only affect template contents.

By "inline, with the template" you mean at the same location as the template-generated HTML (as opposed to the document head), right? ie. 3A? If that makes deduplication impossible (it seems at least very hard to me; I would be curious what Krinkle's idea is), would you still favor it over 3D (styles in head)?

And monitoring will be a must-have, since articles could be quite uneven in the amount of template CSS they carry.

What does that mean specifically? Just looking long and hard on firstPaint, or manual testing or some kind of automation like browser-perf?

Given the uncertainty about the content size (we really won't know until this feature is in the wild for a while), we're going to assume that it will be small when compared to the gzipped size of an article and that having it alongside the content is more efficient than the various scenarios that entail extra requests. For that reason, we recommend that template CSS is included inline, with the template. This assumes the encouraged best practice of template CSS striving to only affect template contents.

By "inline, with the template" you mean at the same location as the template-generated HTML

Yeah, right before the main parsed HTML as returned from the template transclusion.

I know that we allow for some level of recursion, and after a certain point it's hard to know what has or hasn't been included before the "current" point (considering multiple passes of substitution, and what further expansion of previous portions may end up adding). In a native synchronous depth-first recursion, this would work fine. But I imagine we don't want to rely on that, especially if leveraging asynchronicity.

However I figured that 1) it seems we do actually go depth-first right now, so perhaps this isn't an issue. 2) If nothing else, perhaps this can be done by using two separate passes. (Either parse/post-process, or pre-process/parse), meaning that the initial pass leaves placeholders for the second pass to only consider the first occurrence of any template's css would be expanded, the others becoming no-ops. I briefly looked through Parser.php for examples similar to this. Couldn't find anything, though I do remember us having something like this already for other cases (maybe mw:editsection?).

The placeholder would carry the template revision id (or maybe template name/page id).

And monitoring will be a must-have, since articles could be quite uneven in the amount of template CSS they carry.

What does that mean specifically? Just looking long and hard on firstPaint, or manual testing or some kind of automation like browser-perf?

Synthetic testing seems like the way to go, I doubt we'll have information exposed to the client soon about paints beyond the first one. Counting the paint events and looking for spikes is the idea. This is probably something worth doing in general, not just for template styles. I don't have a specific tool to suggest, as I haven't checked if the usual suspects have that feature. For WPT at least, it seems to only be an idea at the moment: https://github.com/WPO-Foundation/webpagetest/issues/278

We'll only know if we can make it useful by trying, as natural variations in content might create too much noise to set meaningful alerting thresholds. The challenge will be picking a corpus of articles that's meaningful to watch. We won't be able to be comprehensive, but maybe if we focus on articles that contain the most used templates, we have a chance of catching that sort of issue that would impact a lot of our traffic.

For the coming fiscal year we're going to work on synthetic testing tools that would allow us to check for performance regressions on commit (that's the dream anyway), I'll keep paint tracking in mind when we work on that, as it might be possible in the context of the tool we build and could have a variety of uses.

Anyway, this is all still quite pie-in-the-sky, but necessary in the long term, I think, because reflows are subtle enough that they won't get reported, but will definitely negatively impact people's experience.

And monitoring will be a must-have, since articles could be quite uneven in the amount of template CSS they carry.

For the record, I have absolutely no idea how to even begin doing that monitoring. If there's some code that would need to be added by TemplateStyles so you can do this monitoring, please supply it.

Beyond the very hypothetical stuff I've just described, I think we could in very concrete terms track the size of CSS in templates, to be able to identify outliers. And even better (I don't know how difficult that would be), cross-reference template usage with actual traffic. If a template has a ton of CSS but is only used on very obscure pages, it's probably nothing to worry about, but the combination of huge CSS and very common template could be deadly. This is the sort of tracking I had in mind in the short term. It would also allow us to identify the average and extreme uses of the feature in the wild to reassess if the delivery strategy we went for was the right choice.

Yeah, right before the main parsed HTML as returned from the template transclusion.

We can easily do "Wherever the <templatestyles src="..."/> tag appears in the wikitext", but pulling it to the top of the template's output seems unlikely to be workable without some really weird and invasive hooking deep in the internals of the existing parser.

However I figured that 1) it seems we do actually go depth-first right now, so perhaps this isn't an issue. 2) If nothing else, perhaps this can be done by using two separate passes. (Either parse/post-process, or pre-process/parse), meaning that the initial pass leaves placeholders for the second pass to only consider the first occurrence of any template's css would be expanded, the others becoming no-ops. I briefly looked through Parser.php for examples similar to this. Couldn't find anything, though I do remember us having something like this already for other cases (maybe mw:editsection?).

I suspect we don't have general multiple passes going on because it quickly leads to sitautions where extension A and extension B both need an extra pass and each would want to run its pass after the other's pass. So you wind up with either weird ordering behavior or an unbounded number of passes as you keep rerunning them all until a full set of passes results in nothing changing.

For example, say that we have Cite and some hypothetical "DataTable" extension that transcludes a floated table of data once near its first use in the page and generates in-text links (like "See table 1") to refer to it from running text, with automatic numbering of the tables and so on. If Cite runs first and DataTable runs second, things blow up because the DataTable output could contain a <ref> placeholder. But if we have DataTable run first and Cite run second, that might fail if a use of a table appears inside a <ref>.

Anyway, I'm still waiting for Editing to reply here, since for years they've been very opposed to the sort of state tracking that you're advocating for here.

At this point, I think actually forcing the scoping of styles to the template including them simply isn't possible.

I have come to the same conclusion. As I understand it, the only potential implementation path currently (pending the Great Parser Unification) were to use the Tidy replacement (...) for the PHP parser, and have Parsoid rely on its (already existing) template-parent-identification powers. But that would mean two parallel implementations which need to be functionally equivalent but are not even conceptually similar
...... snip .....
So a stern warning that we will break your stuff if it relies on lack of scoping (and maybe some tool with which editors can detect scope leaks) is the best we can do ATM.

It doesn't make sense to implement this in the PHP parser. Scoping makes more sense in Parsoid which already has the necessary information. And, I think we seem to be agreed that warnings / documentation is how we are going to prevent editors relying on css leaking out of scopes.

Cite seems to require a different level of global state though. Content-dependent stylesheets / RL modules require the ability to aggregate (and maybe deduplicate) content metadata. Cite requires one piece of content to depend on other pieces in an application-specific ways. The first seems a lot easier to handle, and very unlikely to become completely unnecessary (the same kind of globalness is needed for extension-specific RL modules, LinksUpdate dependency aggregation, parser limits, outputting error messages etc).

This is a good observation. That leads me to the following.

As long as we agree that scoped CSS is necessary (independent of whether it is enforced or not), functionally, it won't matter where the CSS appears (inline or in <head>), and whether it is duplicated or de-duplicated. The deduplication then truly becomes an optimization, and as @Gilles observed in T155813#3075569 and I did in T155813#3012231, if we don't require that all CSS be de-duplicated fully all the time, VE and incremental updates can both introduce some duplicated CSS in the HTML without affecting functionality (because scoped CSS ensures that). So, the goals of incremental updates and page composition in VE and elsewhere are both supported. An incremental update implementation can decide at what frequency it decides to compact duplicated CSS on a page.

All this is doable in Parsoid. I think a lot of the recent discussion has been focused on what to do with the current PHP parser implementation. I don't have an easy answer for that. I don't think it makes sense to do heroic efforts there. On the Parsing team, we are firmly committed to adopting Parsoid technology as the primary parsing solution for MediaWiki. It is still at least an year off as we work through functionality and rendering gaps. So, my recommendation is to go with a solution that is compatible with performance and functionality goals, and doesn't require a DOM-based solution to be implemented in the PHP parser, or require CSS scoping to be implemented / available right away.

I haven't thought through @Krinkle's and @Anomie's discussion to comment on what the two are thinking about. I'll come back to it again later this week, if required.

if we don't require that all CSS be de-duplicated fully all the time, VE and incremental updates can both introduce some duplicated CSS in the HTML without affecting functionality (because scoped CSS ensures that).

The problem with VE and incremental updates isn't in introducing duplicated CSS, it's in accidentally removing the only copy. For example, consider a situation where Template:Foo and Template:Bar both use Template:Something/styles.css, a page SomePage uses both of those templates, and it so happens that the deduplication resulted in the <style> tag being produced by the transclusion of Foo.

If someone uses VE to remove the transclusion of Foo from SomePage, VE is going to have to somehow know that it needs to add the styles to the transclusion of Bar.

If someone edits Foo so that it no longer uses the shared stylesheet, the incremental updater is going to have to somehow know that it has to add the styles that were formerly attached to Foo to the transclusion of Bar which does still use them.

if we don't require that all CSS be de-duplicated fully all the time, VE and incremental updates can both introduce some duplicated CSS in the HTML without affecting functionality (because scoped CSS ensures that).

The problem with VE and incremental updates isn't in introducing duplicated CSS, it's in accidentally removing the only copy.

I address that in T155813#3012231. ( ... updating <head> styles in VE after an edit is simple (at worst, it retains unneeded styles when the last use of a template is deleted). Incremental updates would also work for the most part (you could either choose to retain unneeded styles if the edit removed the last use of a template OR choose to rebuild the styles for the page on certain edits OR do the rebuilding every so often) ... ) So, on deletion of a transclusion, you don't delete the CSS associated with the template. So, you are retaining useless CSS / duplicate CSS which is harmless (because of scoped CSS requirement).

[ PS: Looks like I had the wrong comment link in my comment earlier today. Fixed it to point to T155813#3012231 as intended. ]

So your plan is to special-case <style> tags in VE, incremental updater, and whatever other tools are involved so they're not attached to the wikitext that originally generated them like everything else is, and let the cruft accumulate until something else comes along and cleans it all up?

So your plan is to special-case <style> tags in VE, incremental updater, and whatever other tools are involved so they're not attached to the wikitext that originally generated them like everything else is, and let the cruft accumulate until something else comes along and cleans it all up?

I don't understand what you mean by special-casing, but, yes, this is an option that VE and incremental updater can consider. For example, VE could exercise this option of not removing CSS on edits since this simplifies the implementation inside VE - it won't have to do anything more than drop in new CSS whenever a translusion is added to a page. But, the incremental HTML updater could choose to always compact CSS .. or could do so on "major" edits (according to some knob of what major constitutes) .. or could do it every N edits .. or some much metric. Maybe it won't be as expensive to always force CSS recomputation on all edits. Given that we don't have this implementation right now, this is all speculation as to how this would work. But, I was mostly indicating that scoped CSS makes available implementation options open for the future while also potentially simplifying VE's own implementation. Am I missing something?

I don't understand what you mean by special-casing,

Unless I'm mistaken, when Parsoid processes a transclusion of a template it marks every bit of resulting HTML as coming from that transclusion (with some special magic to handle "unbalanced" templates, but let's not worry about that here). When VE wants to delete the transclusion, it deletes every bit of HTML that is marked as coming from that transclusion.

By "special-casing" I mean changing "every bit of the resulting HTML" in at least one of the two places to "every bit of the resulting HTML, except <style> tags". It'll probably even have to specifically extract <style> tags deeper in the tree to avoid deleting them.

I don't understand what you mean by special-casing,

Unless I'm mistaken, when Parsoid processes a transclusion of a template it marks every bit of resulting HTML as coming from that transclusion (with some special magic to handle "unbalanced" templates, but let's not worry about that here). When VE wants to delete the transclusion, it deletes every bit of HTML that is marked as coming from that transclusion.

By "special-casing" I mean changing "every bit of the resulting HTML" in at least one of the two places to "every bit of the resulting HTML, except <style> tags". It'll probably even have to specifically extract <style> tags deeper in the tree to avoid deleting them.

Ah, got it.

If compact / de-duplicated CSS is a requirement (which I gather from the discussion so far, seems to be a requirement from the Performance team), then VE / incremental-updater will have to do more than simply inserting / deleting HTML blobs. In that context, recomputing CSS by examining the whole doc is more work (and more complex) than simply examining the HTML blob that is being deleted. Unless there are other options, this seems like necessary complexity if de-duplicated CSS is a goal (Ref: T155813#3012231 The reason I ask is because the goals might dictate the implementation choices even if it might be a bit more work for TemplateStyles, Parsoid, PHP Parser, or VE.)

Unless there are other options, this seems like necessary complexity if de-duplicated CSS is a goal

Well, we seem to have four options under consideration at the moment:

  • 3A: <style> tags in the body, no deduplication.
  • 3C: One <link rel="stylesheet"> in the head, produced by ResourceLoader.
  • 3D: One <style> tag in the head, produced by ResourceLoader.
  • 3E: <style> tags in the body, somehow deduplicated.

Only 3A lacks deduplication.

3C would need T156414 fixed in VE but otherwise shouldn't need any additional complexity. Incremental updater would need to avoid the same bug with respect to tracking ResourceLoader module usage.

3D would need the fix for T156414 to cover embedded modules too, and/or would need conversion to 3C style for editing (i.e. swap the <style> for <link rel="stylesheet">). For incremental updater it would probably work exactly like 3C, since if I guess the plan correctly it's dealing with just the page content HTML (not the skin or page header) and at that point there's no difference between 3C and 3D.

Only 3E requires the complex special handling of all <style> tags in VE and incremental updater. Also, only 3E currently lacks a workable plan for implementation.

The fact that one requires more engineering work than others isn't good justification to go for an easier one with known performance degradation for users, such as adding inline styles to the head like 3D. As for lack of implementation details, 3E is only being brought up now, while the others have been investigated in depth already. I can't really help with figuring out implementation details, though, being completely unfamiliar with our parsing.

And you have to consider that 3E is 3A optimized. It can start off as 3A. What matters is not making engineering choices early on that make 3E subsequently impossible/very hard to achieve.

The fact that one requires more engineering work than others isn't good justification to go for an easier one with known performance degradation for users,

It'd be nice if you'd deign to quantify the performance degradations you refer to.

while the others have been investigated in depth already.

Not really.

3A is trivial, you just output stuff every time without caring about any global state.

3B (although we've dropped 3B from consideration at this point) is reasonably straightforward too, you output nothing but accumulate some state internally and dump it all into the ParserOutput in ParserAfterParse.

3C and 3D need work on the ResourceLoader side, true, but it didn't take any in-depth investigation for me to figure out how to do that work. And on the Parser side they're both as trivial as 3A.

3E, though, is not so simple. One option is to output content only once, which seems simple until you realize that you also have to make sure that it doesn't wind up with zero instances of the content if some enclosing thing winds up discarding or mangling where you tried to output it. Another is to output a placeholder every time and then add a second pass to the parser to expand the placeholders, but I've already pointed out the drawbacks of adding parser passes in T155813#3080271. Or someone could suggest a third option.

And you have to consider that 3E is 3A optimized. It can start off as 3A.

Unless someone has a brainstorm, it's probably going to finish up as 3A as well.

What matters is not making engineering choices early on that make 3E subsequently impossible/very hard to achieve.

That seems extremely unlikely.

No matter what, the tag hook has to determine what stylesheet is being included. By the time it has done that, it has $rev that's a Revision object and $content that's a TemplateStylesContent (which is a subclass of CssContent).

Here's what the relevant bit of the code for the tag hook to implement 3A would look like:

$style = $content->getSanitizedString();

// Hide the CSS from Parser::doBlockLevels
$marker = Parser::MARKER_PREFIX . '-templatestyles-' .
    sprintf( '%08X', $parser->mMarkerIndex++ ) . Parser::MARKER_SUFFIX;
$parser->mStripState->addNoWiki( $marker, $style );

// Return the inline <style>, which the Parser will wrap in a 'general'
// strip marker.
return Html::inlineStyle( $marker );

Seriously, that's basically it. If someone comes up with something to make 3E actually work, you start with that and do whatever it is you do for 3E so it only happens once.

Here's what the relevant bit for 3C and 3D would probably look like:

// Add the ResourceLoader module. Note we should only ever get a non-truthy $rev->getId() during page preview.
$id = $rev->getId() ? 'r' . $rev->getId() : TemplateStylesResourceLoaderModule::stashContent( $content );
$parser->getOutput()->addModuleStyles( "ext.templatestyles.$id" );

return '';

There'd also be a ResourceLoaderModule subclass that would extract the $id from the module name, load the corresponding TemplateStylesContent object, and call $content->getSanitizedString() just like 3A does to get the actual stylesheet for RL to use. Switching from 3C or 3D to 3E would involve throwing away these few lines of code, and eventually throwing away the ResourceLoaderModule too once we don't need it anymore for existing cached pages. But nothing there would make it very hard or impossible to switch to 3E.

It'd be nice if you'd deign to quantify the performance degradations you refer to.

I thought it was obvious that styles in the head are blocking and delay first paint. Which gets even worse when a request is required, like in 3C. Styles in the head should be kept to a vital minimum (think site header, logo, chrome, space placeholders to avoid things jumping around as the page loads). Styling for the article content, which is the case with templates, shouldn't be in the head.

Now you might argue that this content is already in the head at the moment because it might be added to a large degree via Common.css (I haven't checked, I'm just assuming that's one of the ways Common.css is used) and that we might bundle the new request with that existing one. But I think that TemplateStyles would actually encourage people to add more custom CSS than there is currently in Common.css, which would definitely worsen first paint. Which, when dealing with requests, might mean staring at a completely blank page for extra seconds in some parts of the world.

Doing 3A/E is actually an opportunity to get Common.css to slim down to what it should be if there's anything in there that belongs in TemplateStyles , while making sure that as people leverage TemplateStyles they don't make the pages slower across the board for content that might be way down inside the article.

I would like to better understand your conclusions.

I thought it was obvious that styles in the head are blocking and delay first paint.

By "blocking and delay first paint" are you referring to one thing or two? If two, what does it block besides "delay first paint"?

Does a <style> in the head "block" or "delay" anything beyond however many extra milliseconds it takes to transfer the bytes? About how much extra time are you estimating for that when considering 3D compared to the time we already use for bytes in the head? Considering that some of the most likely things people will be using TemplateStyles for (maintenance banners and infoboxes) already come before the page content, are you excluding those from your estimated additional time before first paint when considering 3D versus 3E?

Which gets even worse when a request is required, like in 3C.

Considering this depends heavily on guesses as to how communities will use styles, which we can only guess at. I'm sure you're making some assumptions here, but what are those assumptions?

Will most pages wind up only loading the same few sheets (e.g. most pages only use the generic infobox sheet, the generic sheet for message-box templates, and the generic sheet for navbox templates)? In that case it seems like it would be very similar to the three existing <link rel="stylesheet"> tags output by ResourceLoader in that we'd expect the browser to load the merged TemplateStyles sheet once and have it cached for subsequent accesses. And in that case, would the 70K sheet for RL modules block more than a probably-much-smaller sheet for TemplateStyles? Do browsers request the sheets in parallel and so would it be likely that the TemplateStyles sheet would finish loading before the big RL sheet?

Or will the needed sheets be extremely varied on different pages, more so than however much RL modules already vary? In that case then the network request to load the probably-not-cached merged TemplateStyles sheet would be of concern.

Styles in the head should be kept to a vital minimum (think site header, logo, chrome, space placeholders to avoid things jumping around as the page loads).

The stylesheet I see currently being loaded (via an extra request) in the head for ResourceLoader modules other than site.styles on enwiki's main page for anons currently weighs in at 48021 bytes (10969 compressed). Does this represent the "vital minimum"?

Styling for the article content, which is the case with templates, shouldn't be in the head.

Should someone (i.e. Performance) look into making a generic "3E" mechanism for MediaWiki so other extensions that add styles for article content can avoid adding to the ResourceLoader request in the page header? At a glance, it looks like HTMLForm in core, Babel, Calendar, CategoryTree, CharInsert, Cite, InputBox, Kartographer, Math, SyntaxHighlight_GeSHi, TemplateData, Timeline, Translate, WikiHeiro, and WikimediaIncubator could certainly use it for their tag hooks, and maybe also FlaggedRevs, Graph, PageForms, ProofreadPage, Wikidata, and possibly others could use it for their Content objects or other hooks that seem content-related.

@Anomie, see T124966 for data showing the impact of linked head styles on first paint times, especially on 2G. tl;dr: First paint with linked styles on a really slow 2G connection is ~90s, while inlined styles cut that down to 11.3s (!). No styles at all in the head further reduce this to 8.2s, so the size of inlined head styles does matter.

tl;dr: First paint with linked styles on a really slow 2G connection is ~90s, while inlined styles cut that down to 11.3s (!).

I note your 90s number is with images and external styles, while 11.3s is without images and inline styles. A more accurate comparison would be the 52s without images and external styles compared to the 11.3s is without images and inline styles. Which is still a significant decrease.

How much does any of this matter for TemplateStyles, though, when other (larger) external stylesheets are not being inlined? I see no test there for mixed external/inline to be able to draw conclusions.

@Anomie, I was off by 0.1s. To be precise, numbers with images are 11.4s with inlined styles compared to 92.5s with linked styles.

Ah, you're comparing "images and external styles" with "images and inline styles and deferred loading of images below the fold". Still not quite an exact comparison.

At any rate, the realization that 3E should be something that's added to MediaWiki core instead of being done in the TemplateStyles extension IMO makes its implementation more tractable, since it's much more sane to add an "expand style placeholders" pass into the Parser (or into OutputPage?) by editing core code rather than trying to use a hook to do it from an extension where other extensions might be trying to do something similar-but-incompatible. At this point I'm seeing two ways forward here:

  1. Someone writes the 3E-in-core code in the next week or two. Then TemplateStyles uses it from the start, and someone updates all those other extensions I mentioned to use it too. If we can come up with a decent plan soon enough, I might wind up doing the writing.
  2. We pick 3A, 3C, or 3D for TemplateStyles for now, and convert it to 3E-in-core along with all those other extensions I mentioned when someone finally does implement 3E-in-core.

As for actually implementing 3E-in-core, there would probably be an addInlineStyles( $something ) method somewhere (Parser? ParserOutput? StripState? ResourceLoader?) that returns a token (strip-marker style? HTML-like, maybe even a <link rel="stylesheet"> tag with an extra data property, that Parsoid could request be returned by the API so it can do its own dependency tracking and deduplication?) that the tag hook would include in its output. While the obvious thing would be to make $something be the CSS text, it might make conversion simpler to have it be a ResourceLoader module name exactly as is currently being passed to addModuleStyles(). Then at some point (just before the ParserBeforeTidy hook?) something would regex out all the markers and do the first-instance-only insertion of the corresponding style tag. Thoughts, anyone? Should we split this discussion out to a separate task?

Ah, you're comparing "images and external styles" with "images and inline styles and deferred loading of images below the fold". Still not quite an exact comparison.

No, the only technical difference between those measurements is linked vs. inline styles, literally by inserting the linked stylesheet into <style> tags in a text editor. I mentioned the observed Chrome image loading prioritization, but there was no deferred image loading or JavaScript involved here, just plain image tags, in both cases.

Okay, so if I understand correctly:

3E may be too hard right now (or is it?), and could come as an optimization later.
3D for now is ruled out based on the high likelihood that, unlike our generalized above-the-fold styles (where we could practically speaking drive down time to first content-ful paint if they were embedded, potentially at the expense of more non-cached bytes over the course of deep sessions), pages would likely eventually suffer bloat in the <head><style that negatively impacts user perceived load time (because it's not just above the fold templates that would end up in <head><style>, it's many if not all of the template styles in a page).
3C is a no-go as it's in all likelihood even more render blocking than 3D.
3B for now is ruled out on the same grounds as 3D.
3A thus seems the only remaining option (unless 3E really is straightforward; although a word of caution that this may be absorption into core something that needs time to evolve). As @ssastry notes, enforced style scoping isn't strictly necessary, but for the purposes of our modeling of the system is a required property/assumption; this allows for flexibility down the road in where the styles might be placed.

Are we agreed that 3A (styles inlined in the body, not de-duplicated) is a minimally viable option?

I updated my previous comment. My sense is 3A seems the most pragmatic approach with the fewest dependencies, but what do you think?

3E may be too hard right now (or is it?), and could come as an optimization later.

Not necessarily too hard, but IMO it needs doing in core rather than in TemplateStyles.

If people want it done by me as part of this project then I'd like some feedback so I don't wind up writing a bunch of code only for people to complain that I should have done it some other way. My straw proposal: Add a static method on ResourceLoaderClientHtml to return tokens (looking like <link rel="stylesheet"> tags) that will be returned by the tag hook function, and have Parser call a static method on ResourceLoaderClientHtml (just before the ParserBeforeTidy hook and only when $isMain is set) to replace those tokens with <style> tags with the styles from the relevant RL module(s).

Or we can skip it for now and someone can update TemplateStyles when they update Babel, Calendar, CategoryTree, and so on after they implement 3E-in-core.

3B for now is ruled out on the same grounds as 3D.

3B we ruled out because the frontend behavior is identical to 3D while it will probably work better on the back end.

3D [...]
3C [...]
3A [...]

As for 3D, 3C, and 3A, Performance has yet to say which of those they'd actually prefer if 3E isn't being done immediately, especially since my clarification that neither 3C nor 3D would make 3E any harder to do in the future.[1] I suspect they'll go for 3A despite the potential duplication,[2] but that's just a guess on my part.

[1]: In fact, in my straw proposal 3A would be slightly harder to switch to 3E since 3A won't already have a RL module to embed while 3C and 3D would. But I don't think it'll turn out to be a particularly complex RL module.
[2]: Especially in things like enwiki's maintenance banner templates where there's sometimes several above the fold, although usually not on the popular articles

Thanks, @Anomie. Do I understand correctly that you could do 3E without touching all those other extensions? In other words, would it be possible to add the support you mention to Core, have TemplateStyles enroll in its use, and let other extension maintainers convert as they see fit when they have time and if it makes sense to them?

@ssastry is 3E any more or less preferable than 3A for Parsoid?

@Gilles, my assessment of your statements on this thread is you're comfortable with either of 3A or 3E for now, but generally opposed, at least at this time, to 3B/3C/3D. Would you please clarify if I misunderstand?

Thanks, @Anomie. Do I understand correctly that you could do 3E without touching all those other extensions? In other words, would it be possible to add the support you mention to Core, have TemplateStyles enroll in its use, and let other extension maintainers convert as they see fit when they have time and if it makes sense to them?

Yes, that is a possibility. Although I'd likely convert at least one myself as a proof of concept, and if the needed changes are as minimal as I think they will be I might do more.

@ssastry is 3E any more or less preferable than 3A for Parsoid?

3A is obviously the simplest from the Parsoid perspective. But, we can implement 3E.

Separately, as a design decision, it would be useful to note that (in the context of 3E), implementations of TemplateStyles might leave duplicate CSS behind in some scenarios. VE may prefer this loosening for reasons of simplicity, as I indicated in T155813#3082022 (/cc @Jdforrester-WMF). From what @Gilles said in T155813#3075569, I think Performance would be comfortable with this.

An additional stylesheet URL in the head tag would slow down first paint on mobile for 2G connections so 3B sounds like a non-starter for me.
Without de-duplication wouldn't the css bloat be compensated for by gzipping?

Without de-duplication wouldn't the css bloat be compensated for by gzipping?

In T155813#3075569, @Gilles said: ... This is an optimization that will obviously save CPU, an important factor on low-powered device, to ensure better scroll smoothness, etc. ...

Thanks for that! It's getting quite noisy here and hard to follow the conversation. Is someone able to roll this back and summarize the discussion/decisions so far?

@Gilles, my assessment of your statements on this thread is you're comfortable with either of 3A or 3E for now, but generally opposed, at least at this time, to 3B/3C/3D.

Exactly.

Without de-duplication wouldn't the css bloat be compensated for by gzipping?

That's correct, the effect of text repeated verbatim is negligible when gzipped. The only concern is the browser re-parsing and re-applying the duplicated CSS needlessly every time it occurs. Something which we know in general terms causes wasteful CPU cycles, but which we've not benchmarked to quantify its gravity. We definitely know, however, that's it's a much lesser problem than putting styles in the head or having extra requests blocking content rendering. Seeing content sooner is more important than smooth scrolling.

We know that duplicated inline CSS will probably affect scroll smoothness on low-powered devices, but we don't know to what extent. We have plans to start measuring scroll smoothness performance, which might inform that kind of discussion down the line, but right now it's just at the idea/task stage: T115606: Add "consistently interactive" metric to WebPageTest data

We don't want anything blocked on a performance impact/phenomenon we haven't studied yet. But we don't want it to be completely ignored either, because we know it's wasteful. Which is why we recommend starting with 3A and aiming for 3E where practical.

Separately, as a design decision, it would be useful to note that (in the context of 3E), implementations of TemplateStyles might leave duplicate CSS behind in some scenarios. VE may prefer this loosening for reasons of simplicity, as I indicated in T155813#3082022 (/cc @Jdforrester-WMF). From what @Gilles said in T155813#3075569, I think Performance would be comfortable with this.

Indeed. Optimize what you can, and it doesn't have to be a blocker for anything's launch. I don't expect it to be applied to every situation, everywhere, either. Use best judgment about the effort required to apply that optimization to different contexts.

@Jdlrobson we're down to 3A and 3E. 3A is simpler to implement and probably less error prone, but would miss out on the opportunity of 3E exposing a new <body> inlining and de-duplication mechanism.

Anomie claimed this task.

Ok, this seems to be resolved now. The final decision is 1B+2C+3A/3E, meaning:

  • Styles will be stored as separate pages with a "sanitized CSS" content model.
  • Styles will be used via a <templatestyles src="PAGENAME"/> tag, no magic inclusion of subpages.
  • Styles will be delivered to the browser by embedding <style> tags in the body HTML, with deduplication when that becomes feasible.

Making the deduplication feasible has been filed as T160563: Provide infrastructure for embedding styles in the content HTML with deduplication. Thanks everyone.