Wikisource Ebooks: Investigate Font rendering Issues [8Hours]
Closed, ResolvedPublic

Description

As a Wikisource user, I want the font rendering issue to be investigated, so that the root cause(s) can be identified and a plan of action can then be developed in response to the findings.

Acceptance Criteria:

  • Investigate font rendering issues on Wikisource, in which text is displayed as boxes instead of the correct characters
  • Be sure to look into Indic languages in particular, as well as a range of other language types (such as Latin-script languages, RTL languages, etc.)
  • Determine key reasons why this issue is occurring in ebook exports
  • Share recommended or potential technical solutions/next steps, if we wish to fix this issue

Notes:

  • Note that, sometimes, the issue seems to occur when the user does not select the font in WSExport. This could potentially be fixed by automatically populating the font field with the selection that is relevant to the language code. However, be mindful of the fact that some books are multi-lingual.
  • We don't have supported fonts for all languages in Wikisource. For example, there is no Kannada font, so we have no way to render text in Kannada. We do, however, support other Indic languages, including those that use the Devanagari script.
  • Note that, sometimes, when the user does select the font in WSExport, there are conjunct consonant issues (are these related/can they be tackled together?).
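
The auto-population idea in the first note could be sketched roughly as follows. This is an illustration only, written in Python rather than in WS Export's own code. The tables `SCRIPT_FONTS` and `LANG_SCRIPTS` and the functions `scripts_in` and `suggest_fonts` are hypothetical names, and the font pairings are assumptions rather than the tool's actual configuration. To account for multilingual books, the sketch also checks which scripts appear in the text, using the first word of each letter's Unicode character name as a rough heuristic.

```python
import unicodedata

# Hypothetical table: Unicode script keyword -> font family.
# These pairings are assumptions for illustration, not WS Export's config.
SCRIPT_FONTS = {
    "KANNADA": "Noto Sans Kannada",
    "TAMIL": "Noto Sans Tamil",
    "DEVANAGARI": "Noto Sans Devanagari",
}

# Hypothetical table: wiki language code -> script keyword.
LANG_SCRIPTS = {
    "kn": "KANNADA",
    "ta": "TAMIL",
    "hi": "DEVANAGARI",
    "mr": "DEVANAGARI",
}


def scripts_in(text):
    """Return the script keywords found in text.

    Rough heuristic: the first word of each letter's Unicode name,
    e.g. 'KANNADA LETTER KA' -> 'KANNADA'.
    """
    found = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                found.add(name.split()[0])
    return found


def suggest_fonts(lang_code, text=""):
    """Suggest the default font for the language first, then any extra
    fonts needed for other known scripts that appear in the text."""
    fonts = []
    primary = LANG_SCRIPTS.get(lang_code)
    if primary in SCRIPT_FONTS:
        fonts.append(SCRIPT_FONTS[primary])
    for script in sorted(scripts_in(text)):
        font = SCRIPT_FONTS.get(script)
        if font and font not in fonts:
            fonts.append(font)
    return fonts
```

A language code with no entry in the table yields an empty suggestion, which would leave the font field for the user to fill in, as it is today.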

Visual Examples:

In the images below, the files are not exported properly from Tamil Wikisource (image #1) or Kannada Wikisource (image #2); instead, the text displays as rectangles. For Kannada, this is because Kannada is not included in the “Include fonts” section. Tamil is included in “Include fonts,” but the export still has issues.

Image #1:

Tamili Wikisource First page Screenshot - even after include Tamil font.png (1×751 px, 60 KB)

Image #2:

Kannada Wikisource - Font not rendered because Kannada font is not available.png (1×751 px, 27 KB)

Event Timeline

ifried added a subscriber: SGill.

@SGill The team may be discussing this investigation ticket during estimation tomorrow. I would love it if you could check this ticket out and let me know if there is anything else that should be added. Thanks!

Has ULS been implemented suitably? Should this be a UniversalLanguageSelector issue?

@SGill and I have discussed this ticket together, and it is ready for the team to estimate.

ifried renamed this task from Wikisource: investigate font rendering issues [placeholder] to Wikisource: investigate font rendering issues. Jun 11 2020, 4:16 PM
ifried updated the task description.
ifried renamed this task from Wikisource: investigate font rendering issues to Wikisource Ebooks: Investigate Font rendering Issues. Jun 11 2020, 10:35 PM
ifried added a project: WS Export.
ARamirez_WMF renamed this task from Wikisource Ebooks: Investigate Font rendering Issues to Wikisource Ebooks: Investigate Font rendering Issues [8Hours]. Jun 11 2020, 11:56 PM
ARamirez_WMF moved this task from Needs Discussion to Up Next (June 3-21) on the Community-Tech board.

It looks like Kannada Wikisource's WSExport gadget has not been translated and is using the wrong language code: https://kn.wikisource.org/wiki/%E0%B2%AE%E0%B3%80%E0%B2%A1%E0%B2%BF%E0%B2%AF%E0%B2%B5%E0%B2%BF%E0%B2%95%E0%B2%BF:Gadget-WSexport.js
(not that that fixes the font problem, but it should be fixed anyway).

Does anyone know what font should be included for Kannada? It looks like the Ubuntu standard is Lohit Kannada (which maybe is https://www.fontsc.com/font/lohit-kannada although I don't know if that's its canonical home), so we could probably add that.

No, it looks like https://www.google.com/get/noto/#sans-knda is better (it's got bold). It's licensed with SIL Open Font License 1.1, which I'm assuming is okay for us.

Either way, we need to adapt the font-inclusion code to permit .ttf as well as .otf files (I think only the filenames and MIME type differ).
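
As a rough illustration of that change (in Python, not WS Export's own code), the extension-to-media-type lookup might look like the sketch below. `FONT_MIME_TYPES` and `font_media_type` are hypothetical names. The values are the media types registered in RFC 8081; which media types the tool actually writes into the ebook is not stated here, so treat them as an assumption.

```python
from pathlib import Path

# Media types registered for these font formats in RFC 8081.
# Assumption: the exporter may use different (older) types in practice.
FONT_MIME_TYPES = {
    ".otf": "font/otf",
    ".ttf": "font/ttf",
}


def font_media_type(filename):
    """Return the media type for a font file name.

    Raises ValueError for extensions that are not in the table.
    """
    suffix = Path(filename).suffix.lower()
    try:
        return FONT_MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported font file type: {filename}") from None
```

Keeping the mapping in one table means that supporting another format later is a one-line change.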

I started on an idea for this at https://github.com/wsexport/tool/pull/230. It switches to making system fonts available; for example, a bunch of Kannada-appropriate fonts are already installed:

tools.wsexport-test@tools-sgebastion-07:~$ fc-list :lang=kn
/usr/share/fonts/truetype/noto/NotoSansKannadaUI-Regular.ttf: Noto Sans Kannada UI:style=Regular
/usr/share/fonts/truetype/noto/NotoSansKannada-Bold.ttf: Noto Sans Kannada:style=Bold
/usr/share/fonts/truetype/noto/NotoSansKannadaUI-Bold.ttf: Noto Sans Kannada UI:style=Bold
/usr/share/fonts/truetype/noto/NotoSerifKannada-Regular.ttf: Noto Serif Kannada:style=Regular
/usr/share/fonts/truetype/Navilu/Navilu.ttf: Navilu:style=Normal
/usr/share/fonts/truetype/noto/NotoSansKannada-Regular.ttf: Noto Sans Kannada:style=Regular
/usr/share/fonts/truetype/lohit-kannada/Lohit-Kannada.ttf: Lohit Kannada:style=Regular
/usr/share/fonts/truetype/Gubbi/Gubbi.ttf: Gubbi:style=Normal
/usr/share/fonts/truetype/noto/NotoSerifKannada-Bold.ttf: Noto Serif Kannada:style=Bold

I haven't yet confirmed that all the fonts we want are available that way though. If they're not, it feels like it'd be better to install them system-wide rather than adding them just to the wsexport repo.
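
One way to check availability per language would be to parse the `fc-list` output shown above. The following is a minimal Python sketch, not the code in the pull request. `parse_fc_list` and `fonts_for_language` are hypothetical helpers, and the parser assumes every line has the `path: family:style=Style` shape seen in the listing.

```python
import subprocess


def parse_fc_list(output):
    """Group fc-list output lines into {family: {style: path}}.

    Assumes each line looks like '<path>: <family>:style=<style>'.
    Lines that do not match that shape are skipped.
    """
    families = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        path, sep, rest = line.partition(": ")
        if not sep or not rest:
            continue
        family, _, style = rest.partition(":style=")
        families.setdefault(family, {})[style] = path
    return families


def fonts_for_language(lang_code):
    """Run 'fc-list :lang=<code>' and parse its output.

    Requires fontconfig's fc-list to be installed on the host.
    """
    result = subprocess.run(
        ["fc-list", f":lang={lang_code}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_fc_list(result.stdout)
```

For the Kannada listing above, `parse_fc_list` would group the Regular and Bold files of Noto Sans Kannada under one family, which makes it easy to see which families have a bold face.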

It looks like Kannada Wikisource's WSExport gadget has not been translated and is using the wrong language code: https://kn.wikisource.org/wiki/%E0%B2%AE%E0%B3%80%E0%B2%A1%E0%B2%BF%E0%B2%AF%E0%B2%B5%E0%B2%BF%E0%B2%95%E0%B2%BF:Gadget-WSexport.js

I've fixed the links, but it still needs translation.

Samwilson changed the task status from Open to Stalled. Aug 2 2020, 6:10 AM

The above patch for extra fonts is ready to go, but we're going to hold off until T256018 is resolved, so we have a baseline from which to measure things.

I'd appreciate it if the outcome also fixed T228591.

The process for installing a new font for wsexport would be a) making sure it's installed on the VPS (i.e. via the OS package manager); and b) adding it to the config file.
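
A small consistency check between the two steps might look like this Python sketch. It is an assumption-laden illustration: `missing_fonts` is a hypothetical helper, and the format of the wsexport config file is not described in this task, so the configured fonts are represented here as a plain list of family names.

```python
def missing_fonts(configured, installed):
    """Return the configured font families that are not installed.

    Family names are compared case-insensitively. 'configured' stands in
    for whatever the config file lists; its real format is an assumption.
    """
    installed_lower = {name.lower() for name in installed}
    return [name for name in configured if name.lower() not in installed_lower]
```

For example, `missing_fonts(["Noto Sans Kannada", "Gubbi"], ["gubbi"])` reports that Noto Sans Kannada still needs to be installed before the config entry will have any effect.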

Should this be a UniversalLanguageSelector issue?

As Aklapper says above, probably not. But if we get to the point of wanting to select from a huge list of languages (and fonts) then we might want to look at ULS as a way to do that. So far, it looks like we're talking about maybe a dozen fonts, so it's not necessary.

The more specific task for the system font usage is now: T261479: Wikisource: Make it possible to use any installed fonts in ebooks (I'll move the above patch to point to that one).

This investigation is now complete, and we now have a plan for how to move forward. We will pursue the proposed solution in T261479. For this reason, I'm marking this investigation as Done.