
Lack of some fonts leads to Unicode characters embedded in SVG files on Commons to be shown as boxes in PNG thumbnails
Open, Low, Public

Description

While even SMP (Supplementary Multilingual Plane) characters can be displayed as plain text, there are BMP (Basic Multilingual Plane) characters that librsvg can't show.
Examples:
https://commons.wikimedia.org/wiki/File:֍.svg U+058D
https://commons.wikimedia.org/wiki/File:֎.svg U+058E
https://commons.wikimedia.org/wiki/File:1F30D.svg U+1F30D

Event Timeline

Hi @Sarang, thanks for taking the time to report this! In the future, please always follow https://www.mediawiki.org/wiki/How_to_report_a_bug and provide full links showing where the problem can be seen.

This has to do with fonts being available on servers. There is nothing to fix in librsvg here, so the file descriptions on Commons are wrong.

Aklapper updated the task description.
Aklapper renamed this task from "Librsvg cannot resolve some embedded Unicode text" to "Lack of some fonts leads to Unicode characters embedded in SVG files on Commons to be shown as boxes". Sep 16 2020, 7:58 AM
Aklapper renamed this task from "Lack of some fonts leads to Unicode characters embedded in SVG files on Commons to be shown as boxes" to "Lack of some fonts leads to Unicode characters embedded in SVG files on Commons to be shown as boxes in PNG thumbnails".
Aklapper edited projects, added Thumbor; removed Wikimedia-SVG-rendering.

Cannot reproduce locally using librsvg2-2.48.8: rsvg-convert -w 512 -f png -u -o 1F30D.png 1F30D.svg creates a proper PNG version.

These files all use font-family=monospace, which according to https://meta.wikimedia.org/wiki/SVG_fonts#Fallback is configured to fall back to DejaVu Sans Mono. https://raw.githubusercontent.com/dejavu-fonts/dejavu-fonts/master/status.txt indicates that those characters are not supported by DejaVu yet.
The only free font that I found that supports U+058D and U+058E is Noto, specifically Noto Sans Armenian. Debian Stretch currently packages Noto from 2016-11-16, which appears to be too old to include those glyphs. Debian Buster and later have an up-to-date version of Noto.
U+1F30D is in the emoji character set, and I don't think we include any emoji fonts at the moment. fonts-noto-color-emoji is available in Debian Buster.
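
For anyone who wants to repeat the coverage check on another file, here is a minimal sketch in Python. It lists the non-ASCII code points used in an SVG's text nodes and builds an fc-list query for each one. The helper names (non_ascii_codepoints, fc_list_command) are made up for illustration, and the ":charset=" query assumes fontconfig's fc-list is installed on the machine being checked; nothing here reflects how Thumbor itself is set up.

```python
import xml.etree.ElementTree as ET
from typing import List


def non_ascii_codepoints(svg_source: str) -> List[str]:
    """Return the distinct non-ASCII code points in the SVG's text nodes, in order of first use."""
    root = ET.fromstring(svg_source)
    seen: List[str] = []
    for chunk in root.itertext():
        for ch in chunk:
            if ord(ch) > 0x7F:
                label = f"U+{ord(ch):04X}"
                if label not in seen:
                    seen.append(label)
    return seen


def fc_list_command(label: str) -> List[str]:
    """Build (but do not run) an fc-list call listing font families that cover one code point.

    Assumption: fontconfig is installed and accepts a ':charset=<hex>' pattern.
    """
    hex_value = format(int(label[2:], 16), "x")
    return ["fc-list", f":charset={hex_value}", "family"]


if __name__ == "__main__":
    # Hypothetical sample resembling the reported files (font-family=monospace).
    sample = (
        '<svg xmlns="http://www.w3.org/2000/svg">'
        '<text font-family="monospace">\u058d</text></svg>'
    )
    for label in non_ascii_codepoints(sample):
        print(label, " ".join(fc_list_command(label)))
```

Running the printed command on a rendering host should list the installed families that contain the glyph. Empty output would suggest that no installed font covers the character, which matches the boxes seen in the thumbnails.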

The options to fix this are:

  • Wait for T216815: Upgrade Thumbor to Buster and then use Noto
  • Backport the newer fonts-noto and fonts-noto-color-emoji packages and use those
  • Ask DejaVu to support those characters, then backport that fix if/when it arrives