Scraped description text of images shows remnants of CSS transforms in MediaViewer
Closed, Resolved · Public

Description

https://en.wikipedia.org/wiki/Scotland#/media/File:Scotland_in_the_UK_and_Europe.svg renders the description as:

Location of  .mw-parser-output .nobold{font-weight:normal}Scotland  (dark green) – in Europe  (green & dark grey) – in the United Kingdom  (green)

… instead of "Location of Scotland (dark green) – in Europe (green & dark grey) – in the United Kingdom (green)". The extra spaces and class are generated from https://en.wikipedia.org/wiki/Template:Infobox_country_UK as:

map_caption = {{{map_caption|{{map caption |location_color=dark green|country={{nobold|{{{common_name|{{PAGENAME}}}}}}} |subregion=the [[United Kingdom]] |subregion_color=green |region=Europe|region_color=dark grey}}}}}

… and these are leaking out.

Details

Related Gerrit Patches:
mediawiki/extensions/MultimediaViewer : master · Ignore TemplateStyle-generated content when textifying HTML

Event Timeline

Restricted Application added a project: Multimedia. · Dec 7 2018, 7:09 PM
Restricted Application added a subscriber: Aklapper.
Tgr added a subscriber: Tgr.

Seems unrelated to PageImages. It's caused by TemplateStyles, but it isn't a problem with that component either; MMV just needs to be more intelligent about HTML parsing.
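To illustrate what "more intelligent" could mean here, the following is a minimal Python sketch of textifying HTML while skipping the contents of `<style>` (and `<script>`) elements. It is not the MultimediaViewer code, which is JavaScript, and not the merged patch. The names `TextExtractor` and `textify`, the sample markup, and the whitespace collapsing are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, ignoring anything inside <style> or <script>.

    Hypothetical helper for illustration only; not part of MultimediaViewer.
    """

    SKIPPED_TAGS = {"style", "script"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Assumption: runs of whitespace are collapsed to a single space.
        return " ".join("".join(self._chunks).split())


def textify(html):
    """Return the plain text of an HTML fragment without CSS or script bodies."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Assumed sample markup, loosely modelled on the caption in the description.
sample = (
    "Location of <style>.mw-parser-output .nobold{font-weight:normal}</style>"
    '<span class="nobold">Scotland</span> (dark green)'
)
caption = textify(sample)
```

With the assumed sample above, `caption` holds the caption text with the CSS rule dropped. A DOM-based approach (removing the `style` nodes before reading the text content) would follow the same idea.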

Change 478247 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/MultimediaViewer@master] Ignore TemplateStyle-generated content when textifying HTML

https://gerrit.wikimedia.org/r/478247

Change 478247 merged by jenkins-bot:
[mediawiki/extensions/MultimediaViewer@master] Ignore TemplateStyle-generated content when textifying HTML

https://gerrit.wikimedia.org/r/478247