Several librsvg issues are not fixed in 2.44.10 (Debian Buster), but are in later versions. Debian Bullseye (testing) is currently shipping 2.50.1.
Description
Event Timeline
The Commons file SystemLanguage.svg offers a quick and easy switch diagnostic because it displays the IETF language code.
The current version of Commons has the librsvg 2.40 bug that matches the first zh tag. Consequently, all of these display zh-Hans:
- https://commons.wikimedia.org/w/index.php?lang=zh-hans&title=File%3ASystemLanguage.svg should display zh-Hans (pass)
- https://commons.wikimedia.org/w/index.php?lang=zh-hant&title=File%3ASystemLanguage.svg should display zh-Hant (fail)
- https://commons.wikimedia.org/w/index.php?lang=zh-tw&title=File%3ASystemLanguage.svg should display zh-tw (fail)
- https://commons.wikimedia.org/w/index.php?lang=zh-cn&title=File%3ASystemLanguage.svg should display zh-cn (fail)
- https://commons.wikimedia.org/w/index.php?lang=zh&title=File%3ASystemLanguage.svg should display zh-Hans (pass)
Same issue happens with ku-Arab, ku-Latn, and ku; sr-EC, sr-EL, sr-Cyrl, sr-Latn, and sr.
Could somebody try the SVG file with the newer librsvg 2.44 or librsvg 2.50?
- $ LANG=zh-hans rsvg-convert -w 512 -h 224 -o result-zh-hans.png SystemLanguage.svg
- $ LANG=zh-hant rsvg-convert -w 512 -h 224 -o result-zh-hant.png SystemLanguage.svg
- $ LANG=zh-cn rsvg-convert -w 512 -h 224 -o result-zh-cn.png SystemLanguage.svg
- $ LANG=zh-tw rsvg-convert -w 512 -h 224 -o result-zh-tw.png SystemLanguage.svg
- $ LANG=sr-ec rsvg-convert -w 512 -h 224 -o result-sr-ec.png SystemLanguage.svg
- $ LANG=sr-el rsvg-convert -w 512 -h 224 -o result-sr-el.png SystemLanguage.svg
note: Having the librsvg package installed does not mean you get the rsvg-convert command as well. You'll have to apt install librsvg2-bin for that.
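The commands above could also be run as a loop. This is only a rough sketch: `render_all` and the `RSVG_CONVERT` override are hypothetical names made up for illustration, not part of librsvg, and the check simply mirrors the note about librsvg2-bin.

```shell
#!/bin/sh
# RSVG_CONVERT is an assumed override for the binary name (not an upstream variable).
RSVG_CONVERT="${RSVG_CONVERT:-rsvg-convert}"

# Hypothetical helper: render one PNG per language tag, passing the tag via LANG.
render_all() {
    svg="$1"
    shift
    # The librsvg package alone does not provide the command line tool.
    if ! command -v "$RSVG_CONVERT" >/dev/null 2>&1; then
        echo "$RSVG_CONVERT not found; on Debian try: apt install librsvg2-bin" >&2
        return 1
    fi
    for tag in "$@"; do
        LANG="$tag" "$RSVG_CONVERT" -w 512 -h 224 -o "result-$tag.png" "$svg" || return 1
    done
}

# Usage: render_all SystemLanguage.svg zh-hans zh-hant zh-cn zh-tw sr-ec sr-el
```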
edit: I had pasted results, but they were from 2.44.10. Then I realized this is about anything newer than 2.44.10, so I removed them again. Never mind me.
@Dzahn Thank you very much. The test results (now removed) suggest that hyphens are death. I do not know why the switch element's default clause "other" was not displayed. That suggests something else is awry. The SVG file validates.
Those tests simulate what MW would do with an updated librsvg. With the old librsvg the first subtag would have effect, but the new version does not even do that.
In other words, we cannot update to a new version of librsvg without breaking multilingual SVG files that use hyphenated language tags.
This issue is more serious than T261192.
So the update should be blocked until https://gitlab.gnome.org/GNOME/librsvg/-/issues/356 is resolved (or WMF does its own localization).
Federico is willing to fix #356. See https://gitlab.gnome.org/GNOME/librsvg/-/issues/729 . (After such a fix, there may still be issues with WMF bogus langtags such as sr-EC.)
I would expect that 2.44.7 is good enough. @Aklapper, @AntiCompositeNumber or @JoKalliauer could try it in 2.50.
@Glrx sorry I missed it earlier.
bash-input
#!/bin/bash
rsvg-convert --version
LANG=de rsvg-convert -w 512 -h 224 -o result-de.png SystemLanguage.svg
LANG=de-at rsvg-convert -w 512 -h 224 -o result-de-at.png SystemLanguage.svg
LANG=de_at rsvg-convert -w 512 -h 224 -o result-de_at.png SystemLanguage.svg
LANG=de_at.UTF-8 rsvg-convert -w 512 -h 224 -o result-de_atUTF8.png SystemLanguage.svg
LANG=sr-ec rsvg-convert -w 512 -h 224 -o result-sr-ec.png SystemLanguage.svg
LANG=sr_ec rsvg-convert -w 512 -h 224 -o result-sr_ec.png SystemLanguage.svg
LANG=sr-el rsvg-convert -w 512 -h 224 -o result-sr-el.png SystemLanguage.svg
LANG=sr_el rsvg-convert -w 512 -h 224 -o result-sr_el.png SystemLanguage.svg
LANG=zh-cn rsvg-convert -w 512 -h 224 -o result-zh-cn.png SystemLanguage.svg
LANG=zh_cn rsvg-convert -w 512 -h 224 -o result-zh_cn.png SystemLanguage.svg
LANG=zh-hans rsvg-convert -w 512 -h 224 -o result-zh-hans.png SystemLanguage.svg
LANG=zh_hans rsvg-convert -w 512 -h 224 -o result-zh_hans.png SystemLanguage.svg
LANG=zh-hant rsvg-convert -w 512 -h 224 -o result-zh-hant.png SystemLanguage.svg
LANG=zh_hant rsvg-convert -w 512 -h 224 -o result-zh_hant.png SystemLanguage.svg
LANG=zh-tw rsvg-convert -w 512 -h 224 -o result-zh-tw.png SystemLanguage.svg
LANG=zh_tw rsvg-convert -w 512 -h 224 -o result-zh_tw.png SystemLanguage.svg
terminal-output:
rsvg-convert version 2.50.5
(librsvg 2.51.1 gave the same png-results as with librsvg 2.50.5)
png-Result
requested by Glrx
Everything containing "-" gets rendered as "other":
zh-hans | zh-hant | zh-cn | zh-tw | sr-ec | sr-el |
---|---|---|---|---|---|
bash-input as in my post
With "_", the first zh or sr tag is matched:
de | de-at | de_at | de_at.UTF-8 | sr-ec | sr_ec | sr-el | sr_el | zh-cn | zh_cn | zh-hans | zh_hans | zh-hant | zh_hant | zh-tw | zh_tw |
@JoKalliauer Thanks for showing that hyphens do not work, that 2.50 shows the default when there is no match, and that simply substituting an underscore for a hyphen does not solve the problem.
librsvg needs a way to specify IETF langtags (Gnome #356).
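To illustrate why swapping the hyphen for an underscore is not enough, here is a small sketch that splits a string along the generic POSIX locale layout, `language[_territory][.codeset][@modifier]`. It is not librsvg's parser, and `parse_posix_locale` is a made-up name. It only shows that in this layout `zh_hans` puts the script subtag into the territory slot, while `zh-hans` is not split at all.

```shell
#!/bin/sh
# Hypothetical helper: split a string along the generic POSIX locale layout
# language[_territory][.codeset][@modifier]. Illustration only.
parse_posix_locale() {
    loc="$1"
    modifier=""
    codeset=""
    territory=""
    case "$loc" in *@*) modifier="${loc#*@}"; loc="${loc%%@*}" ;; esac
    case "$loc" in *.*) codeset="${loc#*.}"; loc="${loc%%.*}" ;; esac
    case "$loc" in *_*) territory="${loc#*_}"; loc="${loc%%_*}" ;; esac
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$loc" "$territory" "$codeset" "$modifier"
}

# Usage:
#   parse_posix_locale de_at.UTF-8
#   parse_posix_locale zh_hans
#   parse_posix_locale zh-hans
```

An IETF langtag such as zh-Hans carries a script subtag that has no slot of its own in this layout, which is presumably why a dedicated way to pass langtags is needed.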
FYI: related librsvg-issues:
- librsvg#735 RFC: meta-issue for localized SVGs: make it easy for Wikimedia to have localized SVG
- librsvg#356 Provide a way to specify the user's preferred languages: specify langtags
- librsvg#357 Take weights into account when matching systemLanguage if allowReorder=yes
I think if you wrote a build script for that, and tested it on a stretch container, I would support deploying it.
I feel the frustration, but it is no longer just a matter of installing the latest librsvg. The latest version will choke if hyphenated langtags (e.g., sr-latn, zh-hant, ku-arab) are passed to librsvg via the $LANG environment variable.
MediaWiki and Thumbor code needs to be updated.
The PHP $lang langtag needs to be passed in via the --accept-language command line argument. See
- Gnome #356 https://gitlab.gnome.org/GNOME/librsvg/-/issues/356
- Gnome commit https://gitlab.gnome.org/GNOME/librsvg/-/commit/d1658a7ab3c5427986cbe8f5d0be7a40e351f0a1
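As a rough sketch of what such a call might look like from the command line, assuming an rsvg-convert build that already includes the commit above: `render_with_langtag` and the `RSVG_CONVERT` override are hypothetical names, and the size options are just the ones used in the tests earlier in this task.

```shell
#!/bin/sh
# RSVG_CONVERT is an assumed override for the binary name (not an upstream variable).
RSVG_CONVERT="${RSVG_CONVERT:-rsvg-convert}"

# Hypothetical helper: pass the langtag explicitly instead of via $LANG.
# Assumes an rsvg-convert that supports --accept-language.
render_with_langtag() {
    langtag="$1"
    svg="$2"
    png="$3"
    "$RSVG_CONVERT" --accept-language="$langtag" -w 512 -h 224 -o "$png" "$svg"
}

# Usage: render_with_langtag zh-hant SystemLanguage.svg result-zh-hant.png
```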
The code changes are less than half a page. One of the phabricator issues points to the MW and Thumbor sources; I do not see it above.
I do not know PHP or Python, but here are the changes needed to wiki configuration, SVGHandler.php, and Thumbor's svg.py.
They are breaking changes. They need the Rust version of librsvg/rsvg-convert. If used with the old version of librsvg, I expect the new command line arg would cause an exception.
MediaWiki could be made compatible by having rsvg and rsvglang entries and testing for a "$lang" substring in the conversion string.
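The substring test could work roughly like this. The real change would be in PHP; this is only a shell sketch of the logic, and `build_command` and the template strings are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: decide how the language reaches rsvg-convert based on
# whether the converter template contains a literal "$lang" placeholder.
build_command() {
    template="$1"
    lang="$2"
    case "$template" in
        *'$lang'*)
            # New-style entry: substitute the first "$lang" placeholder.
            prefix=${template%%\$lang*}
            suffix=${template#*\$lang}
            printf '%s%s%s\n' "$prefix" "$lang" "$suffix"
            ;;
        *)
            # Old-style entry: fall back to the LANG environment variable.
            printf 'LANG=%s %s\n' "$lang" "$template"
            ;;
    esac
}

# Usage:
#   build_command 'rsvg-convert --accept-language=$lang -o out.png in.svg' zh-hant
#   build_command 'rsvg-convert -o out.png in.svg' zh-hant
```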
Thumbor is hardwired, so making it compatible with both versions would be more complicated. However, it would be good to allow Thumbor to use either rsvg-convert or resvg.
I grepped for rsvg in exec.log and found nothing, going back to May, so it looks like T260504 is sufficiently complete that we don't have to upgrade librsvg on the appservers or update SvgHandler::rasterize(). An update to SvgHandler::rasterize() could be done as a courtesy to non-WMF users but it does not block this task.
If we install the new version of librsvg into a custom prefix, say /opt/librsvg-2.54, so that we can have both versions of librsvg installed, then we can switch the version in the Thumbor configuration. That would allow us to decouple the librsvg deployment from Thumbor 7 and the OS upgrade, reducing risks.
Changing one line of code in svg.py on the existing Thumbor servers is an annoyingly complex task. We could instead have a shell script wrapper, installed by Puppet, along the lines of
#!/bin/sh
LANG=en_US.UTF-8 /opt/librsvg-2.54/rsvg-convert --accept-language="$LANG" "$@"
We can change the binary that Thumbor uses in puppet. /etc/thumbor.d/40-wikimedia.conf comes from the package but it can be safely replaced by Puppet.
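The wrapper relies on the shell expanding "$LANG" in the argument list before the `LANG=en_US.UTF-8` prefix assignment takes effect for the child process. A small sketch to show that ordering, with `show_order` as a made-up name and a plain `sh -c` standing in for rsvg-convert:

```shell
#!/bin/sh
# Hypothetical demo: the argument gets the caller's LANG, while the child
# process environment gets the value from the prefix assignment.
show_order() {
    LANG=en_US.UTF-8 sh -c 'printf "env=%s arg=%s\n" "$LANG" "$1"' sh "$LANG"
}

# Usage (the caller's LANG plays the role of the langtag set by Thumbor):
#   LANG=zh-hant
#   show_order
```

So the langtag that arrives in $LANG is handed over as --accept-language, and rsvg-convert itself runs with an ordinary locale.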
Beyond the Rust version, the dependency issue is mainly pango. Buster has 1.42.3 while librsvg wants 1.46. This bar was raised by two upstream commits:
- https://gitlab.gnome.org/GNOME/librsvg/-/commit/077f21b03b1cc9eebacbbbf6860045f2fcd96768 raises the requirement from 1.38 to 1.44. This is required for the T36947#7062325 bugfix. Okay, it's not that bad. The bug's reappearance was caused by pango's behavior change in 1.44.3, so with 1.42 we will be fine.
- https://gitlab.gnome.org/GNOME/librsvg/-/commit/19f07cd73556f138d2b53932cb28d0be800626dc raises the requirement from 1.44 to 1.46. This is fine-ish because it's for overline and other fancy SVG2 stuff, which we don't use. Yet.
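For checking a build host against these minimums, a version comparison along these lines might help. It is a sketch that assumes GNU `sort -V` is available, and `version_ge` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage, with the installed pango version taken from pkg-config:
#   installed=$(pkg-config --modversion pango)
#   version_ge "$installed" 1.46 || echo "pango $installed is older than 1.46"
```

With Buster's 1.42.3 the check against 1.46 fails, while a check against 1.38 passes.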
Built it on buster with patches to deal with pango version; it works well enough. See https://gist.github.com/Artoria2e5/458f3dfcf5aa68272648c1dc21c039ae for how.
Test images with the haphazard build:
It would look less deformed with hinting disabled, but for the sake of small text I want to keep it on. Not a wise choice given that upstream has it off, I know.
Wow I am actually impressed this thing uploads. Run this binary at your own risk.
{F37026565}