Install more fonts (especially for Unicode) (tracking)
Closed, Resolved, Public

Description

At the moment the SVG rasterizer uses Arial or a font that looks like it. It
lacks many characters needed by many languages.

The rasterizer should use fonts with better Unicode coverage, such as the DejaVu
fonts for Latin-, Greek- and Cyrillic-based scripts, etc.

Many of these fonts are free and/or Open Source.

For example, http://commons.wikimedia.org/wiki/Image:Digestive_system_diagram_ln.svg
should show words like Monɔkɔ, Nsɔ́ngɛ, etc., but the characters missing from Arial
are simply not displayed.
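
The failure described above can be checked mechanically: given the set of code points a font covers, list the characters of a label that fall outside it. The following is only a minimal sketch. `ASSUMED_COVERAGE` is a hypothetical stand-in (Basic Latin plus Latin-1 Supplement), not the real character map of Arial or of the font installed on the servers, and `missing_chars` is an illustrative helper name.

```python
def missing_chars(text, coverage):
    """Return the distinct characters of `text` whose code points are not in `coverage`."""
    seen = []
    for ch in text:
        if ord(ch) not in coverage and ch not in seen:
            seen.append(ch)
    return seen


# Hypothetical coverage: Basic Latin + Latin-1 Supplement only (an assumption,
# not the actual coverage of any installed font).
ASSUMED_COVERAGE = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))

# "Monɔkɔ" written with an explicit escape for U+0254 (LATIN SMALL LETTER OPEN O).
label = "Mon\u0254k\u0254"
for ch in missing_chars(label, ASSUMED_COVERAGE):
    print(f"not covered: U+{ord(ch):04X}")
```

With a real font, the coverage set would have to come from the font's own character map rather than from a hand-written range.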


Version: unspecified
Severity: enhancement

Details

Reference
bz8898

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 9:34 PM)
bzimport set Reference to bz8898.
bzimport added a subscriber: Unknown Object (MLST).

See also
Bug 3769 Fonts are off in rasterized SVG images on wikimedia sites
Bug 8666 SVG with CJK fonts doesn't render CJK text
Bug 5694 Greek character in .svg wrong rendered for replacement- .png

and Bug 8797 Some greek characters don't render in SVG

Bug 8895 may also be related (stretched fonts don't work right)

Note that most of the problems do not exist with my local install of rsvg 2.16.0
(on Ubuntu Edgy)

Perhaps a tracking bug for rsvg font issues would be handy

(In reply to comment #3)

Note that most of the problems do not exist with my local install of rsvg 2.16.0
(on Ubuntu Edgy)

That's probably because you have the required fonts installed on your system.

robchur wrote:

(In reply to comment #3)

Perhaps a tracking bug for rsvg font issues would be handy

Filed as bug 8901.

arunganesh03 wrote:

There should at least be a serif font like Times New Roman, if not all fonts.
It's quite a nuisance when a map you've made has everything in Arial Roman,
even if it's set in Arial Bold or Italics.

I have installed the following font packages from FC4:

bitstream-vera-fonts-1.10-5
fonts-bengali-1.10-2
fonts-chinese-2.15-2
fonts-gujarati-1.10-2
fonts-hindi-1.10-2
fonts-japanese-0.20050222-3
fonts-korean-1.0.11-4
fonts-punjabi-1.10-2
fonts-tamil-1.10-2

and a custom build of DejaVu fonts version 2.16.

vyzasatya wrote:

Please install Telugu fonts too.
Fonts are off in the SVG images on Telugu Wikipedia; for example, see http://te.wikipedia.org/wiki/Image:Distancedisplacement-te.svg

makineni.pradeep wrote:

You can get two Telugu fonts from http://www.kavya-nandanam.com/Pothana2k.zip
There is also the Gautami font, which comes with the default installation of Windows XP.

Liberation ( https://www.redhat.com/promo/fonts/ ) has been suggested on [[m:Talk:SVG fonts]] as a free package that contains fonts metrically equivalent to many common non-free fonts. Right now PNGs converted from SVGs that use standard Windows fonts look awful. This one for example uses Times New Roman: http://csomalin.csoma.elte.hu/~tgergo/bugs/commons_svg_conversion_problems.svg and the result is: http://csomalin.csoma.elte.hu/~tgergo/bugs/commons_svg_conversion_problems.png .

Liberation Sans and Serif are great substitutions for Arial and Times New Roman as they match their metrics. It would be great to have the system configured to use them instead of Arial and Times New Roman if those aren't present or up to date with the latest version.
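
The substitution requested here can be pictured as a lookup in an ordered preference list. The sketch below is illustrative only: the `SUBSTITUTES` table, the `resolve_family` helper and the `DejaVu Sans` default are assumptions made for the example, not the actual configuration of the rendering servers.

```python
# Hypothetical preference lists; NOT the real substitution rules on any server.
SUBSTITUTES = {
    "Arial": ["Liberation Sans", "Nimbus Sans L"],
    "Times New Roman": ["Liberation Serif", "Nimbus Roman No9 L"],
}


def resolve_family(requested, installed, default="DejaVu Sans"):
    """Pick the family to render with: the requested one if installed,
    otherwise the first installed substitute, otherwise the default."""
    if requested in installed:
        return requested
    for candidate in SUBSTITUTES.get(requested, []):
        if candidate in installed:
            return candidate
    return default


installed = {"DejaVu Sans", "Liberation Sans", "Liberation Serif"}
print(resolve_family("Arial", installed))
print(resolve_family("Times New Roman", installed))
```

The order of each list decides which clone wins when several are installed, which is why the choice between Liberation and other metric or shape clones matters.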

Gergő: there's a serious bug with the renderer used for commons_svg_conversion_problems.svg

kjoonlee wrote:

Liberation fonts do not match Arial or Times New Roman very well IMHO.

Nimbus Sans L and Nimbus Roman No9 L, however, match Helvetica (and thus Arial) and Times New Roman almost perfectly. The URW Nimbus fonts are already installed.

(In reply to comment #12)

Liberation fonts do not match Arial or Times New Roman very well IMHO.

Nimbus Sans L and Nimbus Roman No9 L, however, match Helvetica (and thus Arial)
and Times New Roman almost perfectly. The URW Nimbus fonts are already
installed.

Liberation fonts might look different from Arial and Times New Roman, but they have almost exactly the same metrics.
So a string set in Arial will probably have the same length when set in Liberation Sans, whereas Nimbus Sans has the same shapes as Helvetica but its metrics, especially kerning or the space between characters, will differ more than those of the Liberation fonts.

Either way, Liberation fonts should be installed.
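
To make the "same metrics, different shapes" distinction concrete: ignoring kerning, the width of a string is the sum of the advance widths of its glyphs, so two fonts with identical advances lay out text identically whatever their outlines look like. The advance tables below are made-up illustration values, not data read from Arial, Liberation Sans or Nimbus Sans, and `text_width` is a hypothetical helper.

```python
# Hypothetical advance widths in font units (1000 units per em); NOT real font data.
REFERENCE = {"M": 833, "a": 556, "p": 556}
METRIC_CLONE = {"M": 833, "a": 556, "p": 556}  # same advances, different outlines
SHAPE_CLONE = {"M": 833, "a": 556, "p": 560}   # similar outlines, slightly different advances


def text_width(text, advances, font_size, units_per_em=1000):
    """Width of `text` at `font_size`, summing per-glyph advances (kerning ignored)."""
    units = sum(advances[ch] for ch in text)
    return units * font_size / units_per_em


for name, table in [("reference", REFERENCE),
                    ("metric clone", METRIC_CLONE),
                    ("shape clone", SHAPE_CLONE)]:
    print(name, text_width("Map", table, 12))
```

A label sized to fit a box in the reference font still fits with the metric clone, while the shape clone can overflow or leave a gap even though it looks closer to the original.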

Fonts have been updated, all the language support virtual packages in Ubuntu 8.04 have been installed, Telugu included. Liberation fonts have been installed.

filippe.vasconcellos wrote:

This has seriously screwed up font rendering, probably because of a substitution issue—see [[m:Talk:SVG fonts]]. All SVGs uploaded after the installation and specifying Arial/Helvetica/Nimbus Sans L are now defaulting to Liberation, with pretty sketchy results (hinting in particular seems terrible, at least on my system). Liberation Sans may be metrically equivalent to Arial, but is in no way an adequate substitute; Nimbus Sans was much, much better from an aesthetic point of view, and there were never any appreciable kerning/metrics issues.

The new packages were a great addition, but if there is any way for the Nimbus fonts to be the renderer's default Sans and Serif families, _please_ go back to that. Pretty please. Cherry on top.

There is another bug in the SVG to PNG conversion done by the Wikimedia Commons software. Check File:Prokaryote cell diagram-bn.svg . I just translated it into Bengali using the font SolaimanLipi.ttf (a well-known, good-looking Bengali font) in the SVG, but the PNG output shows the text in vrinda.ttf. Please check.
Is it possible to add SolaimanLipi.ttf to the conversion list here: http://meta.wikimedia.org/wiki/SVG_fonts

camjsb7j9g wrote:

I find Wikimedia renders thumbnails badly for SVGs that use the DejaVu Sans and Liberation Sans fonts. A set of files with problems is linked from

http://commons.wikimedia.org/wiki/File:Ikaros_solar_sail_key_liberation_sans.svg

and I posted more details here:

http://commons.wikimedia.org/wiki/Commons:Graphics_village_pump#Wikimedia_renders_Deja_and_Liberation_fonts_badly

Is it possible that the Wikimedia server software has out-of-date versions of those fonts? I recall one or both of those fonts had serious kerning or hinting problems on my PC, and I recently installed newer versions to clear the problems.

ondra.hosek wrote:

I'd like to request installation of the TeX Gyre fonts [1], which are rather popular amongst TeX users and contain clones of the prescribed PDF fonts (Helvetica, Times, Futura, Bookman, Chancery, Palatino, et al.) based on the URW fonts, but with some slightly less common accented characters. They are licensed under the comparably permissive GUST font license. [2]

[1] http://www.gust.org.pl/projects/e-foundry/tex-gyre/whole
[2] http://www.gust.org.pl/projects/e-foundry/licenses

Thanks a lot in advance.

Turning this into a tracking bug so it is easier to keep track of individual font requests.

Bump. In October 2011, after a major upgrade, the situation was as follows.
Please confirm whether you are still missing something specific nowadays.

ii  console-setup                 1.34ubuntu15               console font and keymap setup program
ii  console-terminus              4.30-2                     Fixed-width fonts for fast reading on the Li
ii  defoma                        0.11.10-4ubuntu1           Debian Font Manager -- automatic font config
ii  fontconfig                    2.8.0-2ubuntu1             generic font configuration library - support
ii  fontconfig-config             2.8.0-2ubuntu1             generic font configuration library - configu
ii  gsfonts                       1:8.11+urwcyr1.0.7~pre44-4 Fonts for the Ghostscript interpreter(s)
ii  gsfonts-x11                   0.21                       Make Ghostscript fonts available to X11
ii  kbd                           1.15-1ubuntu3              Linux console font and keytable utilities
ii  libfont-afm-perl              1.20-1                     Font::AFM - Interface to Adobe Font Metrics
ii  libfontconfig1                2.8.0-2ubuntu1             generic font configuration library - runtime
ii  libfontenc1                   1:1.0.5-1                  X11 font encoding library
ii  libfreetype6                  2.3.11-1ubuntu2.4          FreeType 2 font engine, shared library files
ii  libt1-5                       5.1.2-3build1              Type 1 font rasterizer library - runtime
ii  libxfont1                     1:1.4.1-1ubuntu0.1         X11 font rasterisation library
ii  libxft2                       2.1.14-1ubuntu1            FreeType-based font drawing library for X
ii  lmodern                       2.004.1-3                  scalable PostScript and OpenType fonts based
ii  psfontmgr                     0.11.10-4ubuntu1           PostScript font manager -- part of Defoma, D
ii  texlive-font-utils            2009-7ubuntu3              TeX Live: TeX and Outline font utilities
ii  texlive-fonts-extra           2009-7ubuntu3              TeX Live: Extra fonts
ii  texlive-fonts-extra-doc       2009-7ubuntu3              TeX Live: Documentation files for texlive-fo
ii  texlive-fonts-recommended     2009-7                     TeX Live: Recommended fonts
ii  texlive-fonts-recommended-doc 2009-7                     TeX Live: Documentation files for texlive-fo
ii  ttf-dejavu-core               2.30-2                     Vera font family derivate with additional ch
ii  x-ttcidfont-conf              32                         TrueType and CID fonts configuration for X
ii  xfonts-encodings              1:1.0.3-1                  Encodings for X.Org fonts
ii  xfonts-utils                  1:7.5+2                    X Window System font utility programs

Looks fixed to me.

If you find a problem, please open a *new* bug and point to an SVG with text that does not render correctly in the PNG thumb.
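
When preparing such a report, it helps to state which font families the SVG actually asks for. The sketch below collects them from `font-family` attributes and inline `style` attributes using Python's standard `xml.etree.ElementTree`. It is only a rough aid: `font_families` is an illustrative helper, the sample SVG is made up, and rules inside `<style>` elements or external stylesheets are deliberately not handled.

```python
import re
import xml.etree.ElementTree as ET


def font_families(svg_text):
    """Return the sorted, de-duplicated font family names used in an SVG string."""
    families = set()
    root = ET.fromstring(svg_text)
    for element in root.iter():
        declared = []
        # Presentation attribute, e.g. font-family="Arial, sans-serif"
        if "font-family" in element.attrib:
            declared.append(element.attrib["font-family"])
        # Inline CSS, e.g. style="font-size:12px;font-family:'DejaVu Sans'"
        style = element.attrib.get("style", "")
        declared += re.findall(r"font-family\s*:\s*([^;]+)", style)
        for value in declared:
            for name in value.split(","):
                name = name.strip().strip("'\"").strip()
                if name:
                    families.add(name)
    return sorted(families)


# Made-up sample document, for illustration only.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <text font-family="Arial, sans-serif">A</text>
  <text style="font-size:12px;font-family:'DejaVu Sans'">B</text>
</svg>"""

print(font_families(SAMPLE))
```

Comparing the result with the list of fonts installed on the renderer shows at a glance which families will be substituted.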

Adding the excellent Noto collection? (Sponsored by Google and Adobe, but completely free fonts, under Apache Licence 2.0).

http://www.google.com/get/noto/

No more tofu: the whole collection can serve as a default base collection, instead of fallback fonts that show very little information.

The collection is high quality, fully hinted, with metrics adjusted for correct display in pages using multiple scripts. Some scripts are available in two styles.

Almost all world scripts are supported (most modern scripts, including Burmese, for which good fonts are very scarce or buggy). All Indian scripts are covered, and most scripts for Eastern Asia.

Now it also supports the full CJK repertoire (with 4 linguistic variants and 7 weights!), with full support of the OpenType features required for each script.

Full support of Arabic script (both major styles).

There are only 2 missing modern scripts: Tibetan, and Thaana for Divehi/Maldivian (but work is in progress, with some issues in OpenType tables to discuss and fix). Many historic scripts are covered. Most new scripts created in the last 50 years (for languages that were still not written, notably in Africa) are covered, including experimental ones (like Deseret).

The goal of the collection is the full repertoire of Unicode (almost all Unicode 7.0 is covered), including musical symbols, technical symbols, weather and games symbols, emojis (soon color emojis too)...

This bug report was closed more than two years ago.
If you have specific requests for specific fonts, please file a separate report for each font/language and explain the impact/usecases.

You're wrong, this bug is a tracking bug referenced directly by OPEN bugs (they are even listed at top of this page!!)

And it was specific to a "collection" of fonts made to be used together, and for all the documented scripts and languages.

I don't need to detail all of them; refer to the site, which documents everything. I was just describing its content and licence instead of posting a bare URL.
It is important for Wikimedia, notably for the many scripts that don't have decent free fonts (e.g. Burmese in Myanmar) in the current collection of free webfonts (which should remain open and cannot be "closed" while many scripts still lack the most basic support with fonts that are really readable and have correct metrics).

Again: This ticket is closed as fixed. If you want some font, file a new ticket.

Thanks for closing a blocking task, but this tracking bug is still not "closed, resolved" (there's still the missing resolution of blocking task T33950 to update the Malayalam fonts, and possibly newer tasks for other scripts still missing an installation of their fonts, e.g. Burmese).

I have NOT requested the addition of new fonts, I just read the dependencies correctly. If you close this tracking bug, you won't see task T33950 or other newer tasks for other fonts. A tracking bug (open collection) most often is never closed/fixed.

Note the description of this bug: effectively it lists "Latin, Greek, Cyrillic" but note the "etc." which concerns in fact all scripts covered by Unicode (including newer ones such as "Emojis", or existing scripts that are extended by a new Unicode version, or the listed bugs in Bengali).

What has been really closed/fixed is only the support for some Latin characters (such as the open o in the original bug).

If you want to really close this bug, please create another tracking bug, remove the "tracking" tag from this one, and reassign the blocking tasks to the new tracking bug.

Could we add the Noto fonts collection (made by Google for Android but available to everyone under a SIL licence)?

Only one font in this collection is special: the colored Emoji font, which requires an implementation of the color extension of OpenType or specific handling in the text renderers of browsers (that font does not install on Windows, which considers it invalid, including on Windows 10). However, the monochrome Emoji font works everywhere.

This collection is in constant evolution but supports a lot of scripts, including 6 weights for the CJK fonts in 4 styles (Simplified Chinese, Traditional Chinese, Japanese, Korean). Most scripts include a bold variant (few of them have italic, but the italic style may be synthesized in browsers like Chrome/Chromium). The list of supported scripts is impressive, including old scripts.

The Noto collection also offers coherent metrics that allow pages with multiple scripts to have coherent layouts. Font hinting is partially implemented.

https://www.google.com/get/noto/

@Verdy_p could you link to a SVG on Commons needing them?

The Noto fonts are becoming popular, notably because they are the default on Android, but also because many websites use them for correct support of multilingual content. Many scripts don't have reliable fonts that work across browsers and OSes. Noto provides this support.
And anyway, this bug tracker is NOT only for SVG; it is also for use in plain text on wikis. Adding this would allow more reliable rendering on all multilingual wikis: the fonts would be offered in the ULS (Universal Language Selector), whose support for many scripts is still very poor (or only works with some browsers/OSes, or does not support many "extended" letters).
The Noto collection targets full support of the whole Unicode repertoire (there are some recent additions still not implemented, or a few rare clusters with incorrect ligatures, but these bugs are now rare). Noto has been made to be usable as the default font for all modern languages on Android, Google sites, instant messaging, or blogs. It is made to allow easy reading on screen. It is therefore far better than the proprietary Arial font (the free version of Arial is very defective, with only very basic support for Latin and no support for the IPA).
This collection would be very beneficial for all wikis (notably Wiktionary) and on Commons for multilingual descriptions. It does not matter whether those fonts are used in SVG or not; they would be usable at any time (and for many scripts, only Noto offers sufficient support).
There are about 80 fonts in this set to cover all scripts, styles and weights. These fonts line up correctly (unlike most of the non-Latin fonts currently offered in ULS, which have incorrect metrics).

This tracker is too wide: I suggest we discard it and track new font requests using the following columns of the two relevant boards:

  1. Adding fonts on the UniversalLanguageSelector board
  2. Fonts to install on the Wikimedia-SVG-rendering board

That would avoid such confusion on request scope.

Why split? Can't the same fonts be used in the ULS for text display and for rendering SVG (or other graphic formats that we could support, including PDFs that don't contain their own embedded fonts)? After all, the SVG format is now part of HTML, and the same webfont support is usable in browsers to render wiki text or an SVG image directly (without even needing the Wikimedia thumbnail renderer).
I don't see why Wikimedia would support two distinct sets of fonts. If those fonts are free and embeddable, they can be used interchangeably in SVG or in wiki text. Some graphic extensions will also benefit from them, with the possibility to render as text or as vector graphics.
Those fonts could be used as well for rendering synchronized subtitles (in more languages) in videos.

And in fact, if a text can be displayed by the localization of MediaWiki or in the text of an article, the same text should render just as well in graphics, without further limitations. All free webfonts in ULS should then also be available for SVG rendering.

(Note: these Noto fonts exist both as webfonts, served by Google or possibly by Wikimedia servers, and as normal installable OpenType fonts that a renderer would use directly without needing a webfont extension; their redistribution is also permitted by their SIL license. "Noto" is short for "No tofu": their intent is clearly to allow displaying all languages of the world on the web, in applications, in user interfaces, when creating documents, and when printing them, without the frequent "tofu" boxes we see everywhere on our wikis and that limit the number of contributors who can view or edit the wikis.)

This is a tracking bug. If you want font XYZ, file a separate ticket for font XYZ. Thanks!

@Verdy_p: those are currently two separate technologies, hopefully this clarifies why it needs to be in two separate tasks:

  • SVG is not served directly, but rasterized in PNG (see also T5593). So some software on the server needs to parse SVG and draw a PNG image of it, and texts need to be rendered using some specific font.
  • Fonts in ULS are rendered on the client side, so the font needs to exist on the client machine, or be downloaded from somewhere, for it to work.

@Ciencia__Al_Poder, fonts in ULS are served as webfonts; they DON'T need to be preinstalled on the client. Clients will cache the webfonts if needed. In fact, Google already includes all the Noto fonts in its own webfont server. Fonts just need to be installed on the server.

Fonts can also be installed on the SVG-to-PNG renderers, but MediaWiki is not required to serve only PNG thumbnails. SVG images can now be displayed directly in many browsers. There are user preferences for this. Prerendered PNG images will only be useful for small icons (less than 64x64 pixels), because their size will frequently be smaller than a full SVG. This would save much work on the SVG-to-PNG renderers: when viewing a full-size SVG, we don't actually need the PNG; the SVG will be roughly the same size as the PNG or smaller, and SVG files can be rendered with webfonts as well (webfonts will be downloaded on demand and kept in browser caches).

@Verdy_p I'm very well informed about how webfonts work. Thanks. My point was to explain to you why this needs to be done in two separate tasks: one for ULS and another for SVG-to-PNG rendering. Note that fonts in client-side SVG rendering may not be affected by ULS.

So, no objection if I close this tracking bug? It is no longer relevant now that we have several setup points for fonts, and columns to track these requests at each of them.

Client-side SVG may still reference webfonts... which may be served by Wikimedia! SVGs may then be compressed to a "tiny SVG" format, where recognized fonts in the SVG are tuned to reference the webfonts without embedding them. Clients that have these fonts preinstalled will not query the webfont server to get a copy into their cache; they'll use their own copy directly.
SVG files may also embed translations, and ULS will allow selecting the language used to render the SVG according to language preferences. Multilingual SVG specifications really exist, and there are already examples of such SVGs on Commons.

Dereckson claimed this task.

Yes, but that's not relevant: people don't link to publicly served fonts in their SVG CSS. And we don't have canonical font hosting accessible to all.

So I'm closing this tracking bug as no longer relevant.

@Verdy_p's request can be tracked at T138139.

Any new request should be tagged with one or two projects, according to what you want:

What I want is in fact both. You've just split the task into two; that was not the case a few hours ago.
There's absolutely no reason to support internationalization fonts for ULS that would not be supported in SVG rendering, and the reverse is also true. Splitting was not necessary, even if these are two tasks to perform separately. People will want to use the same texts in SVG or in articles, and SVG can now be generated as part of the HTML (without separation into a plain SVG file, as it can be rendered directly by the client without using a server-side SVG-to-PNG renderer). Various extensions are already generating SVG content on the fly, which can be templated as well. The Maps extension, for example, generates SVG in a canvas (in addition to displaying prerendered tiles for map background layers). In this generated SVG the text is rendered on the client side, and will use ULS preferences. International texts can come from wiki templates or from Wikidata queries.
Extensions such as timelines will render better texts; MathML extensions will generate formulas directly without needing server-side rendering to PNG images. And so on. People will not accept "tofus" in their text, which should be the same in the plain text of their article and in embedded graphics.

You can create one task per font if you wish. But as we will need to create two commits, one in operations/puppet for SVG rendering and one in ULS, please be sure to tag them with both projects, or state explicitly in the task that you want both.

Please also note that SVG rendering is WMF-specific, whereas ULS is available to every user of the extension, WMF or not.