
Update librsvg to > 2.44.10
Open, High, Public

Assigned To
None
Authored By
AntiCompositeNumber
Oct 14 2020, 9:41 PM
Referenced Files
Restricted File
May 21 2023, 8:34 AM
F37026547: 1.png
May 21 2023, 8:30 AM
F37026542: 3.png
May 21 2023, 8:30 AM
F34459692: result-zh_hant.png
May 19 2021, 8:00 AM
F34459679: result-sr_ec.png
May 19 2021, 8:00 AM
F34459691: result-zh-hant.png
May 19 2021, 8:00 AM
F34459687: result-zh-cn.png
May 19 2021, 8:00 AM
F34459681: result-sr_el.png
May 19 2021, 8:00 AM
Tokens
"Cup of Joe" token, awarded by JoKalliauer.

Description

Several librsvg issues are not fixed in 2.44.10 (Debian Buster), but are in later versions.

Debian Bullseye (testing) ships a more recent version: https://packages.debian.org/bullseye/librsvg2-bin (as of October 2023, that is 2.50.3)


Event Timeline

Reedy changed the task status from Open to Stalled.Oct 15 2020, 12:51 AM

The Commons file

offers a quick and easy switch diagnostic because it displays the IETF language code.

The current version of Commons has the librsvg 2.40 bug that matches the first zh tag. Consequently, all of these display zh-Hans.

Same issue happens with ku-Arab, ku-Latn, and ku; sr-EC, sr-EL, sr-Cyrl, sr-Latn, and sr.

Could somebody try the SVG file

with the newer librsvg 2.44 or librsvg 2.50?

  • $ LANG=zh-hans rsvg-convert -w 512 -h 224 -o result-zh-hans.png SystemLanguage.svg
  • $ LANG=zh-hant rsvg-convert -w 512 -h 224 -o result-zh-hant.png SystemLanguage.svg
  • $ LANG=zh-cn rsvg-convert -w 512 -h 224 -o result-zh-cn.png SystemLanguage.svg
  • $ LANG=zh-tw rsvg-convert -w 512 -h 224 -o result-zh-tw.png SystemLanguage.svg
  • $ LANG=sr-ec rsvg-convert -w 512 -h 224 -o result-sr-ec.png SystemLanguage.svg
  • $ LANG=sr-el rsvg-convert -w 512 -h 224 -o result-sr-el.png SystemLanguage.svg

note: Having the librsvg package installed does not mean you get the rsvg-convert command as well. You'll have to apt install librsvg2-bin for that.

edit: I had pasted results from 2.44.10 until I realized this task is about anything > 2.44.10, so I removed them again. Never mind me.

@Dzahn Thank you very much. The test results (now removed) suggest that hyphens are death. I do not know why the switch element's default clause "other" was not displayed. That suggests something else is awry. The SVG file validates.

Those tests simulate what MW would do with an updated librsvg. With the old librsvg the first subtag would have effect, but the new version does not even do that.

In other words, we cannot update to a new version of librsvg without breaking multilingual SVG files that use hyphenated language tags.

This issue is more serious than T261192.

So the update should be blocked until https://gitlab.gnome.org/GNOME/librsvg/-/issues/356 is resolved (or WMF does its own localization).

Federico is willing to fix #356. See https://gitlab.gnome.org/GNOME/librsvg/-/issues/729 . (After such a fix, there may still be issues with WMF bogus langtags such as sr-EC.)

I would expect that 2.44.7 is good enough. @Aklapper, @AntiCompositeNumber or @JoKalliauer could try it in 2.50.

@Glrx sorry I missed it earlier.

bash-input
#!/bin/bash
rsvg-convert --version
LANG=de rsvg-convert -w 512 -h 224 -o result-de.png  SystemLanguage.svg
LANG=de-at rsvg-convert -w 512 -h 224 -o result-de-at.png  SystemLanguage.svg
LANG=de_at rsvg-convert -w 512 -h 224 -o result-de_at.png  SystemLanguage.svg
LANG=de_at.UTF-8 rsvg-convert -w 512 -h 224 -o result-de_atUTF8.png  SystemLanguage.svg
LANG=sr-ec rsvg-convert -w 512 -h 224 -o result-sr-ec.png  SystemLanguage.svg
LANG=sr_ec rsvg-convert -w 512 -h 224 -o result-sr_ec.png  SystemLanguage.svg
LANG=sr-el rsvg-convert -w 512 -h 224 -o result-sr-el.png  SystemLanguage.svg
LANG=sr_el rsvg-convert -w 512 -h 224 -o result-sr_el.png  SystemLanguage.svg
LANG=zh-cn rsvg-convert -w 512 -h 224 -o result-zh-cn.png  SystemLanguage.svg
LANG=zh_cn rsvg-convert -w 512 -h 224 -o result-zh_cn.png  SystemLanguage.svg
LANG=zh-hans rsvg-convert -w 512 -h 224 -o result-zh-hans.png  SystemLanguage.svg
LANG=zh_hans rsvg-convert -w 512 -h 224 -o result-zh_hans.png  SystemLanguage.svg
LANG=zh-hant rsvg-convert -w 512 -h 224 -o result-zh-hant.png  SystemLanguage.svg
LANG=zh_hant rsvg-convert -w 512 -h 224 -o result-zh_hant.png  SystemLanguage.svg
LANG=zh-tw rsvg-convert -w 512 -h 224 -o result-zh-tw.png  SystemLanguage.svg
LANG=zh_tw rsvg-convert -w 512 -h 224 -o result-zh_tw.png  SystemLanguage.svg
terminal-output:

rsvg-convert version 2.50.5
(librsvg 2.51.1 gave the same png-results as with librsvg 2.50.5)

png-Result
requested by Glrx

everything containing - gets rendered as "other"

zh-hans | zh-hant | zh-cn | zh-tw | sr-ec | sr-el
result-zh-hans.png (224×512 px, 7 KB)
result-zh-hant.png (224×512 px, 7 KB)
result-zh-cn.png (224×512 px, 7 KB)
result-zh-tw.png (224×512 px, 7 KB)
result-sr-ec.png (224×512 px, 7 KB)
result-sr-el.png (224×512 px, 7 KB)
bash-input as in my post

_ matches with the first zh or sr tag

de | de-at | de_at | de_at.UTF-8 | sr-ec | sr_ec | sr-el | sr_el | zh-cn | zh_cn | zh-hans | zh_hans | zh-hant | zh_hant | zh-tw | zh_tw
result-de.png (224×512 px, 6 KB)
result-de-at.png (224×512 px, 7 KB)
result-de_at.png (224×512 px, 6 KB)
result-de_atUTF8.png (224×512 px, 6 KB)
result-sr-ec.png (224×512 px, 7 KB)
result-sr_ec.png (224×512 px, 7 KB)
result-sr-el.png (224×512 px, 7 KB)
result-sr_el.png (224×512 px, 7 KB)
result-zh-cn.png (224×512 px, 7 KB)
result-zh_cn.png (224×512 px, 8 KB)
result-zh-hans.png (224×512 px, 7 KB)
result-zh_hans.png (224×512 px, 8 KB)
result-zh-hant.png (224×512 px, 7 KB)
result-zh_hant.png (224×512 px, 8 KB)
result-zh-tw.png (224×512 px, 7 KB)
result-zh_tw.png (224×512 px, 8 KB)
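
The same test matrix could also be produced with a loop. This is only a sketch: the `RSVG` override variable and the function name `render_matrix` are made up here so the loop can be dry-run without rsvg-convert installed; the sizes and file-name pattern follow the script earlier in this comment.

```sh
#!/bin/sh
# Sketch: render one SVG once per LANG value.
# RSVG is a hypothetical override so the loop can be dry-run.
RSVG=${RSVG:-rsvg-convert}

render_matrix() {
    svg=$1
    shift
    for tag in "$@"; do
        # Drop dots so the output name stays tidy (de_at.UTF-8 -> de_atUTF-8).
        name=$(printf '%s' "$tag" | tr -d '.')
        LANG=$tag "$RSVG" -w 512 -h 224 -o "result-$name.png" "$svg"
    done
}

# Example: render_matrix SystemLanguage.svg de de-at de_at zh-hant zh_hant
```

Listing hyphenated and underscored variants side by side in one call makes it easier to compare the two behaviours reported above.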

@JoKalliauer Thanks for showing that hyphens do not work, that 2.50 shows the default when there is no match, and that simply substituting an underscore for a hyphen does not solve the problem.

librsvg needs a way to specify IETF langtags (Gnome #356).

I am… getting impatient enough to ask: how hard is it to, really, just make our own statically-compiled rsvg-convert binary into a deb package and then deploy it? I mean:

  • Rust already builds binaries with rust stuff statically linked in.
  • rustup is available for getting us an installation of rust without going through debian, and without interfering with anything stored in a prefix. Only thing that could stop rustup is the glibc version, but even then we could just build it on a newer distro and do static-crt.
  • System C deps for librsvg feel… reasonably conservative? I am not ruling out the possibility that it’s too new though.
  • deb packages are easily assembled from a DESTDIR structure with dpkg-buildpackage.

We can get this as a stop-gap measure *while* we talk about what else to switch to. The surface for any security review would be minimal compared to anything that requires adding a layer of adaptation to the PHP side (hopefully we just do the language code change).

I think if you wrote a build script for that, and tested it on a stretch container, I would support deploying it.

I feel the frustration, but it is no longer just a matter of installing the latest librsvg. The latest version will choke if hyphenated langtags (e.g., sr-latn, zh-hant, ku-arab) are passed to librsvg via the $LANG environment variable.

MediaWiki and Thumbor code needs to be updated.

The PHP $lang langtag needs to be passed in via the --accept-language command line argument. See

The code changes are less than half a page. One of the phabricator issues points to the MW and Thumbor sources; I do not see it above.
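
As a rough illustration of the change described above, the sketch below builds an rsvg-convert command line that passes the language tag with --accept-language instead of through $LANG. The function name `rsvg_cmdline` and its argument order are invented for this example and are not MediaWiki or Thumbor code. It only prints the command (a dry run) and assumes a librsvg recent enough to have the option.

```sh
#!/bin/sh
# Sketch: print the rsvg-convert command that would be run for one thumbnail.
# rsvg_cmdline is a hypothetical helper used only for illustration.
rsvg_cmdline() {
    lang=$1 width=$2 input=$3 output=$4
    # Keep the hyphenated IETF tag as-is; no underscore rewriting.
    printf 'rsvg-convert --accept-language=%s -w %s -o %s %s\n' \
        "$lang" "$width" "$output" "$input"
}

rsvg_cmdline zh-hant 512 SystemLanguage.svg result-zh-hant.png
```

The point of the sketch is that the hyphenated tag travels as an ordinary argument, so the process locale no longer has to carry it.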

I would like to note that this can all easily be implemented for non-WMF wikis, if someone just spent some time adapting SVGHandler (or created an extension to override SVGHandler).

It just CANNOT easily go to WMF production any time soon because of security reviews, the Thumbor plugins that would have to be written, and the fact that the Thumbor install itself is stuck on old systems whose upgrade requires updating everything, for which there are currently no WMF budgets.

It's very straightforward to switch to something else, here's the entire logic for SVG processing at the moment: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/thumbor-plugins/+/refs/heads/master/wikimedia_thumbor/engine/svg/svg.py

I can't find the man page for the resvg command-line tool. What it needs to support is rendering to a specific width and the ability to set the language you want rendered (for multilingual SVGs).

Points to Thumbor source

I do not know PHP or Python, but here are the changes needed to wiki configuration, SVGHandler.php, and Thumbor's svg.py.

They are breaking changes. They need the Rust version of librsvg/rsvg-convert. If used with the old version of librsvg, I expect the new command line arg would cause an exception.

MediaWiki could be made compatible by having rsvg and rsvglang entries and testing for a "$lang" substring in the conversion string.
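
The compatibility check suggested here could look roughly like the following. It is a sketch in shell rather than PHP; the two template strings are illustrative, "rsvglang" is the hypothetical entry name from this comment, and the helper names are invented.

```sh
#!/bin/sh
# Sketch: decide how the language is handed over, based on the template.
# Both templates are illustrative, written in the style of converter strings.
RSVG_TEMPLATE='rsvg-convert -w $width -h $height -o $output $input'
RSVGLANG_TEMPLATE='rsvg-convert --accept-language=$lang -w $width -h $height -o $output $input'

# Succeeds when the template contains a literal "$lang" placeholder.
template_wants_lang() {
    case $1 in
        *'$lang'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report which mechanism would be used for a given template.
lang_mode() {
    if template_wants_lang "$1"; then
        echo argument      # substitute $lang into the command line
    else
        echo environment   # fall back to the LANG environment variable
    fi
}

lang_mode "$RSVG_TEMPLATE"
lang_mode "$RSVGLANG_TEMPLATE"
```

With a check like this, an old configuration without the placeholder would keep its current behaviour, and only templates that opt in would receive the new argument.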

Thumbor is hardwired, so making it compatible with both versions would be more complicated. However, it would be good to allow Thumbor to use either rsvg-convert or resvg.

I grepped for rsvg in exec.log and found nothing, going back to May, so it looks like T260504 is sufficiently complete that we don't have to upgrade librsvg on the appservers or update SvgHandler::rasterize(). An update to SvgHandler::rasterize() could be done as a courtesy to non-WMF users but it does not block this task.

If we install the new version of librsvg into a custom prefix, say /opt/librsvg-2.54, so that we can have both versions of librsvg installed, then we can switch the version in the Thumbor configuration. That would allow us to decouple the librsvg deployment from Thumbor 7 and the OS upgrade, reducing risks.
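
A minimal sketch of that selection, assuming the binary sits directly under the custom prefix; `pick_rsvg` is a made-up helper name, and the real switch would live in the Thumbor configuration rather than in a script like this.

```sh
#!/bin/sh
# Sketch: prefer an rsvg-convert from a custom prefix, else use the one on PATH.
# The prefix layout (binary directly under the prefix) is an assumption.
pick_rsvg() {
    prefix=$1
    if [ -x "$prefix/rsvg-convert" ]; then
        printf '%s\n' "$prefix/rsvg-convert"
    else
        # Prints nothing and returns non-zero if no rsvg-convert is on PATH.
        command -v rsvg-convert
    fi
}

# Example: RSVG_BIN=$(pick_rsvg /opt/librsvg-2.54)
```

Keeping the system package as the fallback means the old version remains available if the custom build has to be backed out.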

Changing one line of code in svg.py on the existing Thumbor servers is an annoyingly complex task. We could instead have a shell script wrapper, installed by Puppet, along the lines of

#!/bin/sh
LANG=en_US.UTF-8 /opt/librsvg-2.54/rsvg-convert --accept-language="$LANG" "$@"

We can change the binary that Thumbor uses in puppet. /etc/thumbor.d/40-wikimedia.conf comes from the package but it can be safely replaced by Puppet.
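
The wrapper script above relies on a shell detail that is easy to misread: "$LANG" in the argument list is expanded before the LANG=en_US.UTF-8 prefix assignment takes effect, so --accept-language receives the caller's language while the child process sees en_US.UTF-8. The dry run below shows this; `show_wrapper_effect` is a made-up helper, and `sh -c` stands in for rsvg-convert.

```sh
#!/bin/sh
# Dry run of the wrapper's command line: a child shell stands in for
# rsvg-convert and reports the LANG it sees plus the arguments it gets.
show_wrapper_effect() {
    LANG=en_US.UTF-8 sh -c 'printf "child LANG=%s\nargs: %s\n" "$LANG" "$*"' \
        sh --accept-language="$LANG" "$@"
}

# Example: (LANG=sr-el; show_wrapper_effect -w 512 in.svg)
```

Run with LANG=sr-el, the example prints `child LANG=en_US.UTF-8` followed by `args: --accept-language=sr-el -w 512 in.svg`.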

Beyond rust version, the dependency issue is mainly pango. Buster has 1.42.3 while librsvg wants 1.46. This bar was raised by two commits upstream:

Built it on buster with patches to deal with pango version; it works well enough. See https://gist.github.com/Artoria2e5/458f3dfcf5aa68272648c1dc21c039ae for how.

Test images with the haphazard build:

3.png (410×600 px, 71 KB)

1.png (2×2 px, 221 KB)

It would look less deformed with hinting disabled, but for the sake of small text I want to keep it on. Not a wise choice given that upstream has it off, I know.

Wow I am actually impressed this thing uploads. Run this binary at your own risk.
{F37026565}

So uh @tstarling, what about the promise of trying to deploy a build script? I've got all the steps laid out...

Following remark of Aklapper in T97233, raising priority to high. Many users are stumbling across that regression bug.

The switch to librsvg 2.44.10 broke many files. Some files have been fixed by hacking LC_ALL, but that is not a long-term fix. And that fix will be less than optimal when more recent librsvg versions are used.

Even using Bullseye only loads librsvg 2.50.3; Bookworm has 2.54.5.

WMF should be loading more recent versions of librsvg. When WMF was running the C version on an ancient Debian, it was loading a recent C version of librsvg.

Glrx raised the priority of this task from Low to High.Oct 12 2023, 9:13 PM

Why is this task stalled? Rust is available on the Debian upgrade.

Glrx changed the task status from Stalled to Open.Oct 22 2023, 6:11 PM

The upgrade to Bullseye is ready for review (as part of T336881), which will bring us to 2.50.3 as a start.