Page MenuHomePhabricator

[Spike] Explore automatic night mode fixes with content transform team
Closed, ResolvedPublic5 Estimated Story Points

Description

A common problem with the new night mode that is being built out is certain template styles / templates specify a background-color without a color

Example: https://m.mediawiki.org/w/index.php?title=Manual:Database_layout/diagram/1.41.0&minervanightmode=1
Recommendation: https://www.mediawiki.org/wiki/Recommendations_for_night_mode_compatibility_on_Wikimedia_wikis#Always_define_color_when_defining_background

We'd like to setup a linting rule to encourage editors to fix this and also add some code to correct this.

The equivalent JS for style attribute correction might look like this:

$('[style]').each((i,node)=> {
    const style = node.getAttribute('style');
    if ( style.indexOf('background') > -1 && style.indexOf('color') === -1 ) {
         node.setAttribute('style', style + ';color:#333;')
    }
} )

Examples that we would want to automatically fix:

  1. https://en.m.wikipedia.org/w/index.php?title=Bulgaria&oldid=1208926388&minervanightmode=1#Largest_cities

Screenshot 2024-02-20 at 4.41.06 PM.png (718×406 px, 42 KB)
for background rule a corresponding color: #333 should be added.

  1. INLINE STYLES: https://en.m.wikipedia.org/w/index.php?title=User:Jdlrobson/T356899&minervanightmode=1
  2. Outcome
  • Work out whether correction is possible (on template styles and inline styles)

Focus on detection and reporting aspect of this work.
Work out if a lint rule that works on template styles as well as inline styles is feasible.
Update: The work is more suitable as a tool in css-sanitizer we can utilise the sanitiser in other packages and extensions

  • Get a rough estimate of the timeline for getting this to work, and whether this is something that the web team can do with code review from content transform team.

Sign off

  • Create tickets for any work discussed here.

Related Objects

Event Timeline

Happy to review, so just let us know.

Jdlrobson set the point value for this task to 5.

Change 1003846 had a related patch set uploaded (by Mabualruz; author: Mabualruz):

[css-sanitizer@master] [POC] Explore automatic night mode fixes

https://gerrit.wikimedia.org/r/1003846

@Jdlrobson @cscott Please check the POC patch, we need to discuss the same solution for inline css, also the placement of the current one have to be moved into a proper sanitiser rule

I moved it to blocked until we get more feedback from CT, I got to play around and have some more context myself I will use it when we discuss that with CT when we have the meeting. Remaining is the meeting and creating tasks to finalise the work and put it in a proper place and use.

I updated the description with the latest outcomes

Ready for review templates inline styles and css util functions are ready

  • We need feedback on where else we might need this, for template’s styles everything looks good.
  • If a lint check is required if it can be auto fixed
  • Does the color: inherit; resolve the issue instead of using a static color

Notes from meeting with @cscott. @Mabualruz will create a task in preparation for estimation meeting tomorrow to determine what our next steps are and move this to sign off.

Notes:

  • CSS sanitization does not currently change the CSS, so it marks a change in the contract for how it currently works.
  • Will all users expect this - should it be a configurable option? How will users override it?
  • The way the sanitizer works in core: any bad rules are replaced with a comment in the CSS explaining why it was removed.
  • The core sanitizer is different from TemplateStyles - this should be fixed on long run.
  • Leave space for user feedback - if the linters on wiki are fast enough won't need to do autocorrect.

Agreements:

  • When inserting color we would insert a comment that provides pointers to how to override this and why it was added`/* [[Phab:]] */`
  • We have two different sources of CSS - inline style attributes and CSS defined in TemplateStyles. We should focus on the inline styles initially. We can use the Parsoid hook (AfterParsoidParse) as a post-cache hook. You wouldn't have access to the parsed cache at this point.

Options:

  • Automatically fixing things
  • Flagging things that could be fixed.

Development options:

  • Update the CSS sanitizer.
  • Hacky way: Use ParserAfterTidy hook and run regular expression on HTML looking for style=" etc... will execute quickly. Add tracking category to page when detected.
  • If we get false positives, try to do a better parse or TemplateStyle solution
  • Could add global rule (not sure about performance of this) - might be a sufficient solution for logged in users.
[style*=background] {
    color: #333;
}
  • If implementing it in core parser or Parsoid there's pros/cons. The core parser uses tracking categories (TrackingCategories.php) and the Parsoid splits this into two parts - generate them with an abstract name in Parser and then Linter extension implements it.

Some quick comments/clarifications:

  • CSS sanitization does not currently change the CSS, so it marks a change in the contract for how it currently works.

Just to clarify: CSS sanitization currently /drops/ parts of the CSS, but (in core) it always replaces those with a /* ... */ comment so the reason for the modification is discoverable. It never /alters/ a rule other than to drop it. /Adding/ new rules that the user didn't explicitly author would be a change in the contract and could be surprising.

  • Will all users expect this - should it be a configurable option? How will users override it?
  • The way the sanitizer works in core: any bad rules are replaced with a comment in the CSS explaining why it was removed.
  • The core sanitizer is different from TemplateStyles - this should be fixed on long run.

MediaWiki\Parser\Sanitizer::checkCss() in core (and a copy in Parsoid) vs MediaWiki\Extension\TemplateStyles\TemplateStyleContentHandler::sanitize().

Of these, TemplateStyleContentHandler already "modifies" the CSS via it's "flip" option, but this doesn't seem to be hooked up to user code. Maybe a missing feature?

In any case, doing additional transformations in TemplateStyles is pretty easy, and you can do something like <templatestyles nodark="true"> if you wanted to suppress the autofix. But it's thought that most of the problematic CSS isn't in TemplateStyles but is in inline wikitext styles. Alas.

  • We have two different sources of CSS - inline style attributes and CSS defined in TemplateStyles. We should focus on the inline styles initially. We can use the Parsoid hook (AfterParsoidParse) as a post-cache hook. You wouldn't have access to the parsed cache at this point.

For detecting errors, there are two possible mechanisms (and two hooks):

  • One is the ParserAfterTidy hook, which is invoked by the legacy parser. It is not invoked by Parsoid, but the legacy parser is used post-edit and on almost all page views. This hook occur before caching and before the links update jobs, so it's an excellent place to add a tracking category (git grep nonnumeric-formatnum includes/ in core for an idea how this works). Editors use the category lists on-wiki to chase down and fix errors. Because this hook is (a) not invoked by Parsoid, and (b) affects pre-cache content, it's not a great place to make changes to the HTML output, but it's a fine place to tweak metadata like categories. The full HTML for the article, including styles added by TemplateStyles, should be available here, and attributes are guaranteed to be double-quoted and so a quick-and-dirty regexp like \bstyle="([^"]*(background-?)color)[^"]*" might be enough to pull up the lion's share of the problematic uses.
  • The other path is to use Parsoid and the Linter extension. Parsoid runs on all mobile views, but is not guaranteed to be triggered by edits or by desktop views. You'd probably be looking at a DOM Postprocessing pass either inside Parsoid directly or registered as an extension, and it would then register any lints found via Parsoid's SiteConfig.

For automatically fixing errors, there again a number of different paths:

  • If you use the ParserOutputPostCacheTransform hook you will get output from both Parsoid and the legacy parser, and because this is post-cache it won't affect the HTML served to VisualEditor, etc and so you're not likely to corrupt other downstream clients. But because this is post-cache, whatever transformation you write should be as efficient as possible. (There are still edge caches in front of this, so it's not after *all* caching, but its still fairly performance-sensitive.)
    • A variation of this just adds a class marker to any tag which defines a style attribute, in order to make a client-side [style*="color"] { ... } stylesheet more efficient, eg: .mw-defines-style[style*="color"] { ... }
  • Modifying TemplateStyles is technically the cleanest, but of course only affects TemplateStyles
  • In an ideal world, Sanitizer::checkCSS in core would use the same css-sanitizer library that TemplateStyles does, which would make any CSS mutation more technically clean, but this is probably a heavier lift than we'd like to tackle right now, especially since we'd have to consider the comparative performance of css-sanitizer versus the regexp that core uses.

Thanks all. I've created T358387 to track all the possible work we have identified here.

Change 1003846 abandoned by Mabualruz:

[css-sanitizer@master] [POC] Explore automatic night mode fixes

Reason:

Not needed, css only fixes is applied

https://gerrit.wikimedia.org/r/1003846