[Regression] systemLanguage="en" does not work anymore, it takes default instead
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

What happens?:
It gets rendered in the default language of the SVG

What should have happened instead?:
Same rendering as on https://svgcheck.toolforge.org/index.php (uses English as default)

Other examples:


Event Timeline

JoKalliauer renamed this task from systemLanguage="en" does not work, it takes default instead to [Regression] systemLanguage="en" does not work anymore, it takes default instead. (Apr 25 2023, 3:13 PM)
JoKalliauer triaged this task as High priority.
JoKalliauer updated the task description.

When MW builds the page, it considers lang=en to be the default, so MW uses the src attribute

That URL does not specify a langtag for the SVG file.

Consequently, Thumbor (at least in the past) would rely on librsvg defaulting the system langtag to en. Thumbor would not explicitly set the $LANG environment variable:

IIRC, the C-version of librsvg uses the $LANG environment variable; that environment variable is usually an opaque Unix locale string. That variable should be part of the system configuration (and could conceivably be set to the locale of the server; for example, a server based in Germany might set the locale to German). If that environment variable is no longer set or is set to something such as "en_US", then librsvg may not be using en as the default system language. That could cause the switch element to use the default clause (which is German) rather than the en clause.
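The hypothesis above can be illustrated with a minimal sketch of how a `<switch>` element picks a clause. It assumes the simplified SVG 1.1 matching rule (the user language must equal a listed tag, or be a prefix of it followed by `-`); the actual matching in any given librsvg release may differ. The function names and the list-of-attributes representation are hypothetical and are not librsvg's API.

```python
from typing import Optional, Sequence


def locale_to_langtag(locale: str) -> Optional[str]:
    """Turn a Unix locale string such as 'de_DE.UTF-8' into a tag such as 'de-DE'."""
    # Drop the codeset ('.UTF-8') and the modifier ('@euro') if present.
    base = locale.split(".", 1)[0].split("@", 1)[0]
    if base in ("", "C", "POSIX"):
        return None  # no usable language preference
    return base.replace("_", "-")


def clause_matches(system_language: str, user_lang: str) -> bool:
    """Simplified SVG 1.1 test for one systemLanguage attribute value."""
    user = user_lang.lower()
    for tag in system_language.split(","):
        tag = tag.strip().lower()
        if tag == user or tag.startswith(user + "-"):
            return True
    return False


def pick_clause(clauses: Sequence[Optional[str]], user_lang: Optional[str]) -> Optional[int]:
    """Return the index of the first <switch> child that would render.

    Each entry is the child's systemLanguage value, or None for a child
    without the attribute (the default clause).
    """
    for index, system_language in enumerate(clauses):
        if system_language is None:
            return index  # a child without the attribute always passes
        if user_lang is not None and clause_matches(system_language, user_lang):
            return index
    return None


# An SVG with an English clause, a French clause, and a default clause last.
clauses = ["en", "fr", None]
print(pick_clause(clauses, "en"))                              # English clause
print(pick_clause(clauses, locale_to_langtag("en_US.UTF-8")))  # falls to default
print(pick_clause(clauses, None))                              # falls to default
```

Under this simplified rule, a user language of `en-US` does not match a clause tagged only `en`, which is one way the default clause could end up being rendered.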

https://commons.wikimedia.org/wiki/File%3AAbdomal_organs_body.svg may have the same problem. Its default language is English, but not all of its translations display. The set of working translations also varies. Right now, German (de) works but Spanish (es) displays English. The corresponding image URLs are

Changing the pixel widths to unusual widths (e.g. 666) should defeat both local and remote image caching, but then the results flip!

Try again at 950 and both fail:

Something is screwy.
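To repeat the width-changing test systematically, a small helper can generate thumbnail URLs for several languages and widths. This is only a sketch: the URL layout is inferred from the thumbnail URL quoted later in this thread, the hash path (`a/aa`) is passed in rather than computed, and `thumb_url` is a hypothetical name.

```python
from typing import Optional

UPLOAD_BASE = "https://upload.wikimedia.org/wikipedia/commons/thumb"


def thumb_url(hash_path: str, file_name: str, width: int, lang: Optional[str] = None) -> str:
    """Build a PNG thumbnail URL; an explicit language adds a 'lang<code>-' prefix."""
    lang_prefix = f"lang{lang}-" if lang else ""
    return f"{UPLOAD_BASE}/{hash_path}/{file_name}/{lang_prefix}{width}px-{file_name}.png"


# Unusual widths are less likely to be cached already.
for lang in ("de", "es"):
    for width in (666, 950):
        print(thumb_url("a/aa", "Abdomal_organs_body.svg", width, lang))
```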

Are some image scalers running old software (and generating good images) while other scalers are running new software (that only generates the default language)?

Generating new requests and looking at the HTTP response headers suggests a correlation with the Thumbor version:

server: Thumbor/6.3.2 produces correct result.

server: Thumbor/7.3.2 produces incorrect result:

REQUEST:
Request URL: https://upload.wikimedia.org/wikipedia/commons/thumb/a/aa/Abdomal_organs_body.svg/langgsw-950px-Abdomal_organs_body.svg.png?20230507205636
Request Method: GET
Status Code: 200 
Remote Address: 208.80.154.240:443
Referrer Policy: strict-origin-when-cross-origin

RESPONSE
accept-ranges: bytes
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
age: 1
content-disposition: inline;filename*=UTF-8''Abdomal_organs_body.svg.png
content-length: 155820
content-type: image/png
date: Mon, 08 May 2023 20:10:37 GMT
nel: { "report_to": "wm_nel", "max_age": 604800, "failure_fraction": 0.05, "success_fraction": 0.0}
report-to: { "group": "wm_nel", "max_age": 604800, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
server: Thumbor/7.3.2
server-timing: cache;desc="miss", host;desc="cp1086"
strict-transport-security: max-age=106384710; includeSubDomains; preload
timing-allow-origin: *
x-cache: cp1088 miss, cp1086 miss
x-cache-status: miss
x-client-ip: 173.228.4.52
x-content-type-options: nosniff
xkey: File:Abdomal_organs_body.svg
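The version correlation can be tallied from captured response headers. The sketch below assumes headers are held in a dict with lowercase keys, as in the dump above, and that each observation is paired with a manual judgement of whether the image rendered correctly; both function names are hypothetical. The only observations used are the two reported in this comment.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Optional, Tuple


def thumbor_version(headers: Mapping[str, str]) -> Optional[str]:
    """Extract '7.3.2' from a 'server: Thumbor/7.3.2' header, else None."""
    name, _, version = headers.get("server", "").partition("/")
    return version if name == "Thumbor" and version else None


def group_by_version(
    observations: Iterable[Tuple[Mapping[str, str], bool]]
) -> Dict[Optional[str], List[bool]]:
    """Group 'rendered correctly?' flags by the Thumbor version that served them."""
    groups: Dict[Optional[str], List[bool]] = defaultdict(list)
    for headers, rendered_correctly in observations:
        groups[thumbor_version(headers)].append(rendered_correctly)
    return dict(groups)


observations = [
    ({"server": "Thumbor/6.3.2"}, True),   # correct result
    ({"server": "Thumbor/7.3.2"}, False),  # incorrect result
]
print(group_by_version(observations))
```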

Looking to get this fixed asap. The issue is indeed that the newer version of rsvg-convert doesn't obey the LANG var, and that we are currently in a mixed environment between old and newer instances due to capacity issues which we're also working on addressing.

Unfortunately the version we're currently running doesn't support language headers at all and we will need a newer version (even the bullseye version of 2.50.3 doesn't support --accept-language, the new solution for this problem). I'm working on verifying the new flag's behaviour and getting this package built at the moment. I was wrong about this, see below.

Change 917861 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/software/thumbor-plugins@master] svg: set LC_ALL instead of LANG

https://gerrit.wikimedia.org/r/917861

For the short term we can hack around this rather than worrying about building a new version, which will take time because of differences in Rust build environments in Debian.

Doing the following I see valid images being generated with Window (windowing system).svg:

for i in ar en de ru fr tr; do LC_ALL="$i" /usr/bin/rsvg-convert Window.svg -u -f png -w 800 > $i.png; done

This change should hopefully address this. Thanks for the report and the handy repro cases!

Change 917861 merged by jenkins-bot:

[operations/software/thumbor-plugins@master] svg: set LC_ALL instead of LANG

https://gerrit.wikimedia.org/r/917861

Change 917917 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] thumbor: bump version

https://gerrit.wikimedia.org/r/917917

Change 917917 merged by jenkins-bot:

[operations/deployment-charts@master] thumbor: bump version

https://gerrit.wikimedia.org/r/917917

The fix has been deployed - I am seeing some improvements but not fixes for all images. Continuing to investigate.

I think this has been resolved; lingering issues appear to be from edge caching. Please reopen if there are existing instances of this failing.

Change 920760 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/software/thumbor-plugins@master] wip: upgrade container and dependencies for bullseye

https://gerrit.wikimedia.org/r/920760

The fix does not work for

The file has systemLanguage="en" clauses, but they do not display. The en clause is skipped and the default clause is displayed instead.

The for loop above does not reflect what Thumbor actually does. If the Thumbor URL does not specify a language, then Thumbor does not set LC_ALL to en and English is not rendered. Compare

If hasattr(self.context.request, 'lang') is false, then Thumbor should default LC_ALL to en to enforce MW's convention that English is the default (and to match what happens in the for loop above).
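A minimal sketch of the suggested fallback follows. It is not the actual thumbor-plugins code: `rsvg_environment` and `DEFAULT_LANG` are hypothetical names, and the request object is stood in for by a `SimpleNamespace`. The only behaviour taken from this thread is that the language is passed to rsvg-convert through LC_ALL.

```python
import os
from types import SimpleNamespace
from typing import Dict, Mapping, Optional

DEFAULT_LANG = "en"  # MW's convention: English when no language is requested


def rsvg_environment(request: object, base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with LC_ALL always set.

    Uses request.lang when it exists and is non-empty, otherwise DEFAULT_LANG.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LC_ALL"] = getattr(request, "lang", None) or DEFAULT_LANG
    return env


# A request without a language falls back to English.
print(rsvg_environment(SimpleNamespace(), {})["LC_ALL"])
# A request with an explicit language keeps it.
print(rsvg_environment(SimpleNamespace(lang="de"), {})["LC_ALL"])
```

The returned mapping could then be passed as the `env` argument of `subprocess.run` when invoking rsvg-convert, so that a request without a language behaves like the `en` iteration of the for loop.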

See also

Glrx reopened this task as Open. (Edited Jul 12 2023, 3:03 PM)

Reopen. Fix not complete.

The first test case in the description

displays "other".