Half R rendering problem with fonts in PDF books in Devnagari scripts
Closed, Resolved · Public · Bug Report

Description

PDF exports of books do not render the half R correctly in some cases. For example, note how the word मिश्र is written on this page...

https://hi.wikisource.org/wiki/%E0%A4%86%E0%A4%9C_%E0%A4%AD%E0%A5%80_%E0%A4%96%E0%A4%B0%E0%A5%87_%E0%A4%B9%E0%A5%88%E0%A4%82_%E0%A4%A4%E0%A4%BE%E0%A4%B2%E0%A4%BE%E0%A4%AC

Download the PDF version and note the wrong rendering of the same word.

Or visit this page...
https://ws-export-test.wmcloud.org/
And try to export any Marathi book, e.g. श्रीग्रामायन

That PDF version is very difficult to read. The plain text and RTF exports look OK. Words containing त्र क्र प्र श्र are not displayed correctly. This issue can be seen in both Hindi and Marathi, which use the same Devanagari script.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Hi @shantanuo, thanks for taking the time to report this and welcome to Wikimedia Phabricator!
Please see https://www.mediawiki.org/wiki/How_to_report_a_bug and provide all the required information: how things currently look, and how they should look.

I appreciate your guidance. I am trying to report an issue that I am facing on Marathi Wikisource. There is an option to download pages in PDF format. But…

The characters in PDF file look like this:
त्‌र क्‌र प्‌र श्‌र

They should look like this:
त्र क्र प्र श्र
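To make the difference between the two forms concrete, here is a minimal Python sketch that lists the code points involved. It assumes the exported text itself is unchanged (TA + VIRAMA + RA) and that only the font shaping differs; the ZERO WIDTH NON-JOINER in the second string is used purely to imitate on screen what the PDF looks like, and is not a claim about what the PDF actually contains.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for every character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# TA + VIRAMA + RA: a shaping-capable font draws this as a single conjunct.
conjunct = "\u0924\u094D\u0930"
# Same letters with a ZWNJ after the virama, which forces the virama to stay
# visible. Assumption: this only imitates the broken PDF rendering.
broken = "\u0924\u094D\u200C\u0930"

for label, text in (("expected", conjunct), ("as seen in PDF", broken)):
    print(label, text)
    for code, name in describe(text):
        print("   ", code, name)
```

If the stored text is the same in both exports, the fix has to come from the font or the shaping step of the PDF renderer rather than from the wiki text.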

Due to this bug, the export to PDF option is useless for users of Devanagari (the script used by Marathi, Hindi and other South Asian languages).

Aklapper renamed this task from Half R rendering problem in PDF books noted in Devnagari Scripts to Half R rendering problem with fonts in PDF books in Devnagari scripts.Jun 30 2021, 11:08 PM
shantanuo changed the subtype of this task from "Task" to "Bug Report".Jul 12 2021, 1:32 PM

The "Download" button points to the incorrect font, but the "export to PDF" option available on the same page uses the correct font, as shown in this image...

https://commons.wikimedia.org/wiki/File:Wrong_correct_font_rendering_on_same_page.png

Can you use the same font in both places? In other words, characters like त्र क्र प्र श्र should look the same in both PDF files, irrespective of the option used to download them. This issue affects Hindi, Marathi, Gujarati and other Asian languages.

Is this happening on hiwikisource or mrwikisource?

In the latter, it looks like the old gadget is still enabled. It should be removed, as its functionality has been completely replaced by the Wikisource extension.

Also, note that you can change the default font that is embedded, by editing https://hi.wikisource.org/wiki/MediaWiki:WS_Export.json and https://mr.wikisource.org/wiki/MediaWiki:WS_Export.json

> Is this happening on hiwikisource or mrwikisource?

Both. I have not checked, but I guess it also happens on Gujarati and other Asian languages like Bengali (T258124?). I have asked the administrator on Marathi Wikisource to change the default font.

https://mr.wikisource.org/wiki/%E0%A4%B8%E0%A4%A6%E0%A4%B8%E0%A5%8D%E0%A4%AF_%E0%A4%9A%E0%A4%B0%E0%A5%8D%E0%A4%9A%E0%A4%BE:QueerEcofeminist#Half_R_rendering_probelm_in_PDF_books

Should I request the same person to remove the old gadget?

> Should I request the same person to remove the old gadget?

Yep, that would be best. They don't have to delete the actual gadget pages, just remove it from the gadgets-definition page.

NRodriguez subscribed.

@shantanuo following up to check in and see if @Samwilson's suggestion to remove it from the gadgets-definition page solved it.

It has not solved my problem. I cannot remove the "Download" button (shown in the image mentioned above) because it is the only way to download all the transcluded pages; the "Download this page" link in the left navigation bar downloads only the current page. The link renders the text correctly but does not include the full book, while the big blue "Download" button includes all the pages in the PDF, but the result is not legible due to this bug.

If you have access to server-side statistics, you can check the total number of downloads for English or French and compare them with the Marathi, Hindi or Gujarati counts. There are many reasons for the low download count for Indian languages, but the most important one is that the PDF download does not render half R correctly.

I had a bit more of a look into this, but ran into a new bug: T290053. But anyway, it looks like the gadget confusion is now fixed, which is great.

However, it looks like the default fonts for mrwikisource and hiwikisource have not been changed yet, and that this is the source of the error (probably similar for other Indic languages).

You can manually select a different font in the web form; can you confirm that a different font solves the half R rendering problem? If so, which font?

When was bug T290053 introduced? When I posted this report, the PDF was being generated correctly. The "Half R" rendering was the only problem. Now there is no PDF :(

Please send me an email shantanu dot oak at gmail.com so that I can send you more information. (I am not sure if that is allowed) I do not want to share it here.

I have tried several fonts, but none of them are correct. I guess it is using only one default font and ignoring the user selection.

Here is how you can reproduce the issue.

  1. Visit this page:

https://tinyurl.com/frumez76

  2. Click on the big blue "Download" button on the right. You get a PDF with incorrect rendering of "Half R".
  3. Click on the link "PDF म्हणून उतरवा" on the left. You get the correct PDF.

Can you tell which font is being used in step 3? That PDF is correct but does not include the entire book (i.e. transcluded pages)
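One rough way to answer the "which font" question without a PDF viewer is to scan the raw file for font names. The sketch below is only a heuristic: it assumes the font dictionaries are stored uncompressed in the file, which is not guaranteed (PDFs that use compressed object streams will yield nothing), and the function name `embedded_font_names` is made up for this example.

```python
import re

def embedded_font_names(pdf_bytes):
    """Return the sorted set of /BaseFont or /FontName values found in raw PDF bytes."""
    pattern = re.compile(rb"/(?:BaseFont|FontName)\s*/([^\s/<>\[\]()]+)")
    names = set()
    for match in pattern.finditer(pdf_bytes):
        name = match.group(1).decode("latin-1")
        # Subset fonts carry a six-letter tag such as "ABCDEF+"; drop it.
        names.add(re.sub(r"^[A-Z]{6}\+", "", name))
    return sorted(names)
```

Running it on both downloads, for example `embedded_font_names(open("book.pdf", "rb").read())` with `book.pdf` standing in for whichever file was saved, would show whether the two export paths embed different fonts.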

The administrator of Marathi Wikisource may be too busy. He has not replied to my request to change the font.

Sorry, I didn't mean to confuse things; that other bug is not related to the font issue here.

Can you check a PDF with the Noto Sans Devanagari font? That looks to be what the other PDF export system is using.

The PDF generated with the Noto Sans Devanagari font still renders half R incorrectly. No matter which font I select, it generates exactly the same PDF. Shouldn't the content look different when I change the font?

Is it possible to connect on IRC?

The same is the case with English Wikisource. Even if I use different fonts, it generates exactly the same PDF. It says "Choose from 204 available fonts." but uses only one.
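The "exactly the same PDF" observation can be checked mechanically by hashing two exports made with different font selections. This is a minimal sketch with hypothetical helper names; note that exports may embed timestamps, so differing hashes do not prove the font changed, whereas identical hashes strongly suggest the font selection was ignored.

```python
import hashlib

def sha256_of(data):
    """Return the SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()

def same_output(pdf_a, pdf_b):
    """True if two exported files are byte-for-byte identical."""
    return sha256_of(pdf_a) == sha256_of(pdf_b)
```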

shantanuo claimed this task.

When the bug T288782 was fixed, it resolved this issue automatically. :)

Can someone change the font from "FreeSerif" to "Noto Serif Devanagari" on all sites using the Devanagari script? For example, for the Hindi language...

https://hi.wikisource.org/wiki/%E0%A4%AE%E0%A5%80%E0%A4%A1%E0%A4%BF%E0%A4%AF%E0%A4%BE%E0%A4%B5%E0%A4%BF%E0%A4%95%E0%A4%BF:WS_Export.json
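For whoever ends up making that edit, the page content would presumably be a small JSON object. The sketch below generates one; the key name `defaultFont` and the exact font name string are assumptions that should be checked against the WS Export documentation and its list of installed fonts before saving anything.

```python
import json

# Assumption: WS Export reads the default font from a "defaultFont" key.
# Assumption: the font is installed under exactly this name.
config = {"defaultFont": "Noto Serif Devanagari"}

# Text that would be pasted into MediaWiki:WS_Export.json.
page_text = json.dumps(config, ensure_ascii=False, indent=4)
print(page_text)
```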

I tried to contact the admins of the respective languages, but was not able to convince them of the importance of this change.

I think this is something that individual communities should probably decide on. hewikisource would also benefit from a changed font, I think.

I guess if a local interface-admin isn't available to make the change, you could contact a steward: https://meta.wikimedia.org/wiki/Steward_requests/Miscellaneous

One of the stewards answered, saying "If their admins not convinced, we cannot do anything either". This is a very surprising reply. It means local admins (who are not capable of understanding basic issues like this) are the last and final authority! The number of active contributors is so low that the elections are a joke. I am happy that I learned a lesson not to waste time by raising issues or writing on Wikipedia.

@shantanuo: See https://meta.wikimedia.org/wiki/Requesting_wiki_configuration_changes which covers not having many active contributors. (I'm afraid that problem isn't solved by reducing the number of contributors even more. :)

@shantanuo: What makes you think that "admins are not interested"? I don't see that from the links you provided. As written before, this needs local discussion first.

Also, as this ticket was only about the half R rendering problem which is resolved, please file a separate ticket for changing fonts once there is consensus, by following https://meta.wikimedia.org/wiki/Requesting_wiki_configuration_changes - thanks.