
Document required fonts for Chromium-Render service
Closed, Resolved, Public

Description

When Chromium-PDF runs with the system default font set (English locale), it uses incorrect fonts while generating PDFs, which makes the PDFs unreadable. To prevent this, please find and document which fonts the `chromium-pdf` service requires to properly render all supported languages and dialects.

Example of broken render

Article: https://en.wikipedia.org/wiki/Mahatma_Gandhi
Chromium render url: http://chromium-pdf.wmflabs.org/en.wikipedia.org/v1/pdf/Mahatma_Gandhi/letter


What needs answering?

We'd like to install the missing fonts on the computer generating PDFs, but first we need to identify what those fonts should be. Language engineering/design may be able to help provide us with a list of fonts that are used across our different language wikis. Can you help?

Event Timeline

Strange. The newly generated PDFs don't have this issue.

@pmiazga is trying to repro and will report back after the weekend.

ovasileva lowered the priority of this task from High to Medium. Dec 12 2017, 5:14 PM
ovasileva moved this task from Incoming to Needs Prioritization on the Web-Team-Backlog board.

I think I fixed that by installing lots of extra fonts. Let's use this ticket to document all required fonts.

pmiazga renamed this task from Chromium renders article with broken fonts to Document required fonts for Chromium-Render service. Feb 15 2018, 2:53 PM
pmiazga updated the task description.

please find and document which fonts are required by `chromium-pdf` service to render properly all supported languages and dialects.

Can you add instructions to the task on how this will be done? Does this mean Nirzar or a designer needs to list all the font family names we need to install?

Could we do this on a case by case basis as we get bug reports?

I'm not sure; this requires some analysis first, as we have to install the correct Debian packages. Nirzar can give us all the font family names, but those might not match Debian package names. We can do it case by case, but first we have to check the most popular wikis and verify that the PDFs generated for them use the correct fonts.
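A rough sketch of that first analysis step, assuming we only want a quick list of which writing systems show up in an article's text before deciding which font packages to look for. The function name `scripts_in_text` is made up for illustration, and taking the first word of each character's Unicode name is only a heuristic (Python's standard library does not expose the Unicode Script property), so treat the result as a hint rather than an authoritative answer.

```python
import unicodedata
from collections import Counter


def scripts_in_text(text):
    """Count letters per (approximate) script in a piece of text.

    Heuristic: the first word of a character's Unicode name, e.g.
    "DEVANAGARI LETTER GA" -> "DEVANAGARI". This is an approximation,
    not the real Unicode Script property.
    """
    counts = Counter()
    for ch in text:
        # Only letters are counted; digits, punctuation, spaces and
        # combining marks are skipped.
        if not unicodedata.category(ch).startswith("L"):
            continue
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts
```

Running this over the plain text of an article such as Mahatma_Gandhi would show whether it contains scripts other than Latin, which is where a default English font set is likely to fall short.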

I'd recommend hitting all 298 birds with one stone...

For language support, the Google Noto font family is known for having the widest range of characters available; it's specifically designed for that.
There's a Debian package available.
To give you an idea of how many languages Noto covers, the entire font family weighs 1 GB!

@pmiazga @Jdrewniak can you describe what needs to be done here and how to do it? Trying to figure out if this makes sense as a design task.

Also note that there are still several 'missing font' tickets open, specifically:

Those are the two I ran into today at least; there might be more.

Related: T169828 T181200 - It seems these font issues could easily be resolved by quickly syncing with the required people: Reading Infrastructure and/or Marko, Olga, Alex and a reading web engineer

Previously we were using a clean distro with everything installed by hand. Most probably that is why we didn't have all the fonts installed; furthermore, we didn't know which fonts to install.

As far as I know, this task was fixed by applying the existing Puppet class mediawiki::packages::fonts on the proton servers (see https://github.com/wikimedia/puppet/blob/production/modules/mediawiki/manifests/packages/fonts.pp).
With this set of fonts, most of the languages started to render correctly. If there are issues with it (and it looks like there are; see DJ's comment at https://phabricator.wikimedia.org/T182608#5233883), we need to update the mediawiki::packages::fonts Puppet class to support new fonts.

LGoto raised the priority of this task from Medium to High. Jul 15 2020, 4:01 PM

I haven't yet checked whether the tasks about language coverage mentioned above are resolved, but I recently updated the fonts used following the production traffic switchover to k8s: https://gerrit.wikimedia.org/r/c/mediawiki/services/chromium-render/+/612353

I verified that https://en.wikipedia.org/api/rest_v1/page/pdf/Mahatma_Gandhi returns a PDF rendered properly. I don't see any fonts missing.

Here is the list from: https://gerrit.wikimedia.org/r/c/mediawiki/services/chromium-render/+/612353

fonts-liberation,
fonts-noto,
fonts-noto-cjk,
fonts-noto-cjk-extra,
fonts-noto-color-emoji,
fonts-noto-extra,
fonts-noto-mono,
fonts-noto-ui-core,
fonts-noto-ui-extra,
fonts-noto-unhinted
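
For anyone who wants to check a host or container image against this list, here is a minimal sketch, assuming a Debian-based system where `dpkg-query` is available. The package names are copied from the list above; the helper names (`parse_installed`, `missing_font_packages`, `query_dpkg`) are hypothetical and not part of the chromium-render service.

```python
import subprocess

# Package names copied from the list in this comment.
REQUIRED_FONT_PACKAGES = [
    "fonts-liberation",
    "fonts-noto",
    "fonts-noto-cjk",
    "fonts-noto-cjk-extra",
    "fonts-noto-color-emoji",
    "fonts-noto-extra",
    "fonts-noto-mono",
    "fonts-noto-ui-core",
    "fonts-noto-ui-extra",
    "fonts-noto-unhinted",
]


def parse_installed(dpkg_output):
    """Return the set of fully installed package names.

    Expects one "<package>\t<status>" pair per line, as produced by
    query_dpkg() below.
    """
    installed = set()
    for line in dpkg_output.splitlines():
        name, _, status = line.partition("\t")
        if status.strip() == "install ok installed":
            installed.add(name)
    return installed


def missing_font_packages(dpkg_output, required=REQUIRED_FONT_PACKAGES):
    """List the required packages that are not fully installed."""
    installed = parse_installed(dpkg_output)
    return [pkg for pkg in required if pkg not in installed]


def query_dpkg():
    """Ask dpkg-query for every known package and its status."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package}\t${Status}\n"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On a Debian host, `missing_font_packages(query_dpkg())` would return the packages still to be installed. Whether every name above exists in a given Debian release is not something this sketch verifies.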

@Jgiannelos Do these fonts cover all Unicode and multilingual characters, or is the scope of this just English alphanumeric characters?

fonts-noto is supposed to be a font family that supports a wide variety of languages; it's not only for English.