
Wikisource: Make it possible to use any installed fonts in ebooks
Closed, Resolved · Public · 5 Estimated Story Points · Nov 4 2020

Description

As a Wikisource user, I want access to all system fonts, so I can have broader font support for more languages.

Background: At the moment, WSExport comes bundled with a bunch of fonts, which can be selected for inclusion in ebooks. There are a few reasons for moving to a different way of managing fonts: the included ones need to be periodically updated; they don't cover all languages; it is bad practice to embed external resources in a project repository; and adding new fonts is annoying.

We should change this so that any font available on the host system can be used. There are many of these, covering nearly all Wikisource languages. The current UI has a dropdown list of fonts that we recommend, also showing what languages they're good for. This list should be moved into config.php, e.g.:

	'fonts' => [
		'FreeSerif' => 'FreeSerif',
		'Linux Libertine' => 'Linux Libertine',
		'Libertinus' => 'Libertinus',
		'Mukta' => 'Mukta (Devanagari)',
		'Mukta Mahee' => 'Mukta Mahee (Gurmukhi)',
		'Mukta Malar' => 'Mukta Malar (Tamil)',
		'Lohit Kannada' => 'Lohit Kannada',
	],

This will mean that any time we want to add a new font or change an existing one, it's a simple config change and doesn't require any code changes. Updates will also happen via the normal operating system mechanisms.

This task is a follow up to T254918: Wikisource Ebooks: Investigate Font rendering Issues [8Hours].

Acceptance Criteria:

  • Make it possible to use any installed fonts in ebooks
  • Share list of fonts supported by this change, so Ilana & Satdeep can inform the community & they can test out the changes when ready
NOTE: If you have questions about fonts & font support, @SGill is a recommended resource.

Details

Due Date
Nov 4 2020, 5:00 AM

Event Timeline

Restricted Application added a subscriber: Aklapper.
ifried renamed this task from Make it possible to use any installed fonts in ebooks to Wikisource: Make it possible to use any installed fonts in ebooks. · Sep 15 2020, 5:55 PM

Share list of fonts supported by this change, so Ilana can inform the community & they can test out the changes when ready

We'll basically be able to add any fonts that are installable in Debian: https://packages.debian.org/stable/fonts/

At the moment this is a limited list (the following is from the staging server) but can be added to at any time:

$ fc-list 
/usr/share/fonts/truetype/dejavu/DejaVuSerif-Bold.ttf: DejaVu Serif:style=Bold
/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf: DejaVu Sans Mono:style=Book
/usr/share/fonts/truetype/liberation/LiberationSansNarrow-Italic.ttf: Liberation Sans Narrow:style=Italic
/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf: DejaVu Sans:style=Book
/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf: Liberation Sans:style=Regular
/usr/share/fonts/truetype/liberation/LiberationMono-BoldItalic.ttf: Liberation Mono:style=Bold Italic
/usr/share/fonts/truetype/liberation/LiberationSerif-Italic.ttf: Liberation Serif:style=Italic
/usr/share/fonts/truetype/liberation/LiberationMono-Bold.ttf: Liberation Mono:style=Bold
/usr/share/fonts/truetype/liberation/LiberationSansNarrow-Regular.ttf: Liberation Sans Narrow:style=Regular
/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf: DejaVu Sans:style=Bold
/usr/share/fonts/truetype/liberation/LiberationSerif-Bold.ttf: Liberation Serif:style=Bold
/usr/share/fonts/truetype/liberation/LiberationMono-Regular.ttf: Liberation Mono:style=Regular
/usr/share/fonts/truetype/liberation/LiberationSans-Italic.ttf: Liberation Sans:style=Italic
/usr/share/fonts/truetype/liberation/LiberationSerif-BoldItalic.ttf: Liberation Serif:style=Bold Italic
/usr/share/fonts/truetype/liberation/LiberationSansNarrow-BoldItalic.ttf: Liberation Sans Narrow:style=Bold Italic
/usr/share/fonts/truetype/dejavu/DejaVuSansMono-Bold.ttf: DejaVu Sans Mono:style=Bold
/usr/share/fonts/truetype/liberation/LiberationMono-Italic.ttf: Liberation Mono:style=Italic
/usr/share/fonts/truetype/liberation/LiberationSans-BoldItalic.ttf: Liberation Sans:style=Bold Italic
/usr/share/fonts/truetype/liberation/LiberationSerif-Regular.ttf: Liberation Serif:style=Regular
/usr/share/fonts/truetype/lohit-kannada/Lohit-Kannada.ttf: Lohit Kannada:style=Regular
/usr/share/fonts/truetype/liberation/LiberationSansNarrow-Bold.ttf: Liberation Sans Narrow:style=Bold
/usr/share/fonts/truetype/liberation/LiberationSans-Bold.ttf: Liberation Sans:style=Bold
/usr/share/fonts/truetype/dejavu/DejaVuSerif.ttf: DejaVu Serif:style=Book

The process of adding a new font will be to a) install the requisite package; and b) add the required font family name to the wsexport config (e.g. 'LiberationSans' to support the four variants of that font).
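
A quick way to sanity-check step b) is to compare the configured family names against what fontconfig reports. The following is only a minimal Python sketch, not part of the tool (which is PHP): it assumes the `path: Family:style=Style` line format shown in the `fc-list` output above, and the function names `parse_fc_list` and `missing_families` are hypothetical.

```python
from collections import defaultdict

# A few lines copied from the staging server's fc-list output above.
SAMPLE_FC_LIST = """\
/usr/share/fonts/truetype/dejavu/DejaVuSerif-Bold.ttf: DejaVu Serif:style=Bold
/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf: Liberation Sans:style=Regular
/usr/share/fonts/truetype/liberation/LiberationSans-Bold.ttf: Liberation Sans:style=Bold
/usr/share/fonts/truetype/lohit-kannada/Lohit-Kannada.ttf: Lohit Kannada:style=Regular
"""


def parse_fc_list(output):
    """Group fc-list lines by family name (hypothetical helper).

    Returns a dict mapping each family name to a list of (path, style) pairs.
    A comma-separated family field is treated as several aliases of one font.
    """
    families = defaultdict(list)
    for line in output.splitlines():
        path, sep, rest = line.partition(": ")
        if not sep:
            continue  # skip lines that do not look like "path: family"
        names, _, style = rest.partition(":style=")
        for name in names.split(","):
            families[name.strip()].append((path, style))
    return dict(families)


def missing_families(configured, installed):
    """Return configured family names that fontconfig does not know about."""
    return [name for name in configured if name not in installed]


installed = parse_fc_list(SAMPLE_FC_LIST)
print(missing_families(["Lohit Kannada", "Mukta"], installed))
```

With the sample above, any configured family that is not installed (here the Mukta entry) is reported, which would flag a config entry that has no matching package yet.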

ARamirez_WMF set the point value for this task to 5. · Oct 2 2020, 12:02 AM
ARamirez_WMF moved this task from Needs Discussion to Ready on the Community-Tech board.
ifried added a subscriber: SGill.
ARamirez_WMF changed the subtype of this task from "Task" to "Deadline".

The above patch is merged and deployed to the test site. I'm afraid I got it wrong about the Mukta fonts though, and they don't actually seem to be available in Debian. Oops. But other fonts are available for those languages, for instance Lohit-Tamil. I'll make a new patch to update the list, but first @SGill do you have an opinion on the fonts that should be in the list?

Any font can actually be used by adding it directly in the URL, but for the web UI we want a limited list (although, maybe that's still undecided? I'm not sure what the state is at the moment of the design of the web UI).

e.g. https://wsexport-test.wmflabs.org/?fonts=Lohit-Tamil

The currently available fonts on the staging site are below, and we can install any that are listed at https://packages.debian.org/buster/fonts/

aakar
Ani,অনি  Dvf
AnjaliOldLipi
Chandas
Chilanka
DejaVu Sans
DejaVu Sans Mono
DejaVu Serif
Dyuthi
FreeMono
FreeSans
FreeSerif
Gubbi
Jamrul
Kalapi
Kalimati,नालिमाटी
Karumbi
Keraleeyam
Liberation Mono
Liberation Sans
Liberation Sans Narrow
Liberation Serif
Likhan
Linux Biolinum Keyboard O
Linux Biolinum O
Linux Libertine Display O
Linux Libertine Initials O
Linux Libertine Mono O
Linux Libertine O
Lohit Assamese
Lohit Bengali
Lohit Devanagari
Lohit Gujarati
Lohit Gurmukhi
Lohit Kannada
Lohit Malayalam
Lohit Odia
Lohit Tamil
Lohit Tamil Classical
Lohit Telugu
Manjari
Manjari,Manjari Thin
Meera
Mitra Mono,\u09ae\u09bfতি\u09cd\u09b0
Mukti Narrow,মুক্তি  পাতনা,Mukti Narrow Bold
Nakula
Navilu
ori1Uni,utkal
padmaa-Bold.1.1,padmaa,padmmaa
padmaa,padmmaa
Pagul
Pothana2000
Rachana
RaghuMalayalamSans
Rasa
Rasa,Rasa Light
Rasa,Rasa Medium
Rasa,Rasa SemiBold
Rekha
Saab
Sahadeva
Samanata
Samyak Devanagari
Samyak Gujarati
Samyak Malayalam
Samyak Tamil
Suruma
Uroob
Vemana2000
Yrsa
Yrsa,Yrsa Light
Yrsa,Yrsa Medium
Yrsa,Yrsa SemiBold
ARamirez_WMF changed Due Date from Oct 21 2020, 4:00 AM to Nov 4 2020, 5:00 AM. · Oct 22 2020, 7:38 PM

We talked about this in standup this morning, and have decided that the font questions can be handled in separate tickets, so this is ready for QA.

The main things to test are that fonts are included (or not) in the ebooks, that they can be provided as the fonts param in the URL (note plural), and that there's no backwards compatibility breaking (although, I'm not quite sure what we should do about the font names where we can't install the exact same thing… probably add more workarounds as I've done for linux-libertine — but I'll wait to hear that everything's groovy as-is first).

There's going to be more discussion I'm sure about default fonts for particular languages, but for now FreeSerif continues to be the default for non-latin scripts (although, our list of "latin" is a bit odd: [ 'fr', 'en', 'de', 'it', 'es', 'pt', 'vec', 'pl', 'nl', 'fa', 'he', 'ar' ];).
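
The default-font rule described above can be sketched roughly as follows. This is an illustrative Python sketch rather than the tool's actual code: the language list is copied from the comment above, while the function name `default_font` and the choice to return `None` (meaning "no default font") for the listed languages are assumptions, since the thread does not say what those languages get.

```python
from typing import Optional

# The list of "latin" languages, copied verbatim from the comment above.
LATIN_LANGS = ['fr', 'en', 'de', 'it', 'es', 'pt', 'vec', 'pl', 'nl', 'fa', 'he', 'ar']

# Default named in the comment above for everything outside that list.
DEFAULT_NON_LATIN_FONT = 'FreeSerif'


def default_font(lang: str) -> Optional[str]:
    """Return the default font family for a language code (hypothetical helper).

    Assumption: languages in LATIN_LANGS get no default font (None).
    """
    if lang in LATIN_LANGS:
        return None
    return DEFAULT_NON_LATIN_FONT


print(default_font('ta'))  # FreeSerif
print(default_font('en'))  # None
```

Note that 'fa', 'he', and 'ar' are treated as "latin" here only because they appear in the list as quoted.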

@Samwilson I am seeing some discrepancies in the fonts that are included in the epub files.

  1. Passing the Mitra Mono,\u09ae\u09bfতি\u09cd\u09b0 to the fonts= parameter, no font file is included in the epub (when you extract it). I have tried URI encoding it as well (which I think is, for example, https://wsexport-test.wmflabs.org/book.php?lang=en&format=epub-3&page=Appellate_Division_Quorum_Act%2C_1955&fonts=Mitra%20Mono,%E0%A6%AE%E0%A6%BF%C3%A0%C2%A6%C2%A4%C3%A0%C2%A6%C2%BF%E0%A7%8D%E0%A6%B0).

SSHing into the VM, I can run:

tools.wsexport-test@tools-sgebastion-07:~$ fc-list :family="Mitra Mono" file
/usr/share/fonts/truetype/fonts-beng-extra/mitra.ttf:

  2. With fonts=padmaa,padmmaa, the epub has only two files:

OPS/fonts/padmaa.ttf
OPS/fonts/padmaa-Medium-0.5.ttf

But running on the VM, I get three:

fc-list :family="padmaa,padmmaa" file
/usr/share/fonts/truetype/fonts-gujr-extra/padmaa-Medium-0.5.ttf:
/usr/share/fonts/truetype/fonts-gujr-extra/padmaa.ttf:
/usr/share/fonts/truetype/fonts-gujr-extra/padmaa-Bold.1.1.ttf:

  3. DejaVu fonts miss some style variants, like Oblique, Condensed, ExtraLight.

For example, fonts=DejaVu%20Sans includes:

OPS/fonts/DejaVuSans-Bold.ttf
OPS/fonts/DejaVuSans.ttf

But fc-list :family="DejaVu Sans" file lists:

/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSans-Oblique.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSans-BoldOblique.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSansCondensed-Oblique.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSansCondensed-BoldOblique.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSansCondensed-Bold.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSans-ExtraLight.ttf: 
/usr/share/fonts/truetype/dejavu/DejaVuSansCondensed.ttf:
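
The comparison done by hand above (font files inside the epub versus files reported by `fc-list`) could be automated along these lines. This is a minimal Python sketch for QA purposes only: it assumes, as in the listings above, that embedded fonts live under `OPS/fonts/` in the epub and keep the installed file's base name. The function names are hypothetical.

```python
import posixpath
import zipfile


def fonts_in_epub(epub_file):
    """Return the base names of all files under OPS/fonts/ in an epub (a zip archive)."""
    with zipfile.ZipFile(epub_file) as archive:
        return {
            posixpath.basename(name)
            for name in archive.namelist()
            if name.startswith("OPS/fonts/") and not name.endswith("/")
        }


def missing_from_epub(epub_file, installed_paths):
    """Return installed font file names that were not embedded in the epub.

    installed_paths is the list of paths printed by
    `fc-list :family="..." file` (without the trailing colon).
    """
    expected = {posixpath.basename(path) for path in installed_paths}
    return sorted(expected - fonts_in_epub(epub_file))
```

For the DejaVu Sans case above, passing the nine installed paths would be expected to report the seven Oblique, Condensed, and ExtraLight files as missing.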

Thanks for finding those errors @dom_walden.

I've made a new patch that takes into account all font weights, and also lists all available fonts in the dropdown. After talking with @Prtksxna yesterday, it sounds like this is worthwhile at this stage, because it will be able to evolve into a dynamic dropdown that shows only those fonts that are applicable to whatever language is currently selected in the form.

PR: https://github.com/wsexport/tool/pull/270

One trouble with this is that it has to convert fontconfig's weight numbers into CSS's ideas of the same. (The same is done for slant/style, but that's one-to-one with normal, italic, and oblique.) This is the map:

                         fontconfig   CSS
thin            weight          0     100
extralight      weight          40    100
ultralight      weight          40    100
light           weight          50    200
demilight       weight          55    300
semilight       weight          55    300
book            weight          75    400
regular         weight          80    500
normal          weight          80    500
medium          weight          100   600
demibold        weight          180   700
semibold        weight          180   700
bold            weight          200   800
extrabold       weight          205   900
black           weight          210   900
heavy           weight          210   900
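
As a rough illustration of the map above, here is a small Python sketch (the actual patch is in the PR, in PHP). The numeric pairs are copied from the table. Falling back to the nearest listed fontconfig value for weights that are not in the table is an assumption of this sketch, and the name `css_weight` is hypothetical.

```python
# fontconfig weight -> CSS font-weight, copied from the table above.
FONTCONFIG_TO_CSS_WEIGHT = {
    0: 100,    # thin
    40: 100,   # extralight, ultralight
    50: 200,   # light
    55: 300,   # demilight, semilight
    75: 400,   # book
    80: 500,   # regular, normal
    100: 600,  # medium
    180: 700,  # demibold, semibold
    200: 800,  # bold
    205: 900,  # extrabold
    210: 900,  # black, heavy
}


def css_weight(fontconfig_weight):
    """Map a fontconfig weight number to a CSS font-weight.

    Exact table entries map directly. For any other value, the nearest
    listed fontconfig weight is used (an assumption, not from the patch).
    """
    nearest = min(
        FONTCONFIG_TO_CSS_WEIGHT,
        key=lambda known: abs(known - fontconfig_weight),
    )
    return FONTCONFIG_TO_CSS_WEIGHT[nearest]


print(css_weight(80))   # 500
print(css_weight(200))  # 800
```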

Now that we can use Beta on the test site, we can set up test pages such as https://en.wikisource.beta.wmflabs.org/wiki/Font_tests

Samwilson added a subscriber: dmaza.

Most of this is merged and on the test site, but there's a bug that @dmaza found with the padmaa font being listed twice (as padmmaa and padmaa-Bold.1.1). I'll make a fix for this now.

PR: https://github.com/wsexport/tool/pull/280

The last tweak to the font list is merged and live on the test site.

On the test site (and, since T267361, on the production site as well), you will now see a much larger selection of fonts in the dropdown. I guess this includes all available fonts.

  1. Passing the Mitra Mono,\u09ae\u09bfতি\u09cd\u09b0 to the fonts= parameter, no font file is included in the epub (when you extract it)...

This is fixed. For example, https://wsexport-test.wmflabs.org/?lang=en&page=Appellate_Division_Quorum_Act%2C_1955&format=epub-3&fonts=Mitra+Mono
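
For anyone scripting these checks, an export URL like the one above can be built as follows. This is a small Python sketch: the query parameter names (`lang`, `page`, `format`, `fonts`) and the test-site base URL are taken from the URLs in this task, and the function name `export_url` is hypothetical.

```python
from urllib.parse import urlencode

# Test site base URL, as used in the examples in this task.
BASE_URL = "https://wsexport-test.wmflabs.org/"


def export_url(page, fonts, lang="en", fmt="epub-3"):
    """Build an export URL with the font family in the `fonts` parameter (note plural)."""
    query = urlencode({"lang": lang, "page": page, "format": fmt, "fonts": fonts})
    return f"{BASE_URL}?{query}"


print(export_url("Appellate_Division_Quorum_Act,_1955", "Mitra Mono"))
```

`urlencode` takes care of escaping the comma in the page title and the space in the family name.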

  2. With fonts=padmaa,padmmaa, the epub has only two files:...

...

  3. DejaVu fonts miss some style variants, like Oblique, Condensed, ExtraLight...

There are still some missing font variants in the ebooks. Have raised T268136.

ifried added a subscriber: ifried.

This has been released, and the font variant issue will be followed in another ticket. For this reason, I'm marking this work as Done.