
Tamil language characters not displaying properly while downloading in pdf format
Closed, Resolved · Public

Description

This issue was copied from https://github.com/wsexport/tool/issues/235


balajijagadesh opened this issue 13 days ago:

When downloading PDF files from ta.wikisource.org, the Tamil characters appear as boxes. To prevent that, a font was added to the tool. When downloading with the mukta-malar font for Tamil, the content appears properly, but the generated file name is filled with arbitrary characters, and the headings generated on the first page also appear as boxes. Kindly resolve the issue; because of these problems I have noticed a reduction in PDF downloads for Tamil from ta.wikisource.org.
"
􀀀􀀀􀀀􀀀 􀀀􀀀􀀀􀀀􀀀 􀀀􀀀􀀀􀀀􀀀
􀀀􀀀􀀀 􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀
18􀀀􀀀􀀀 􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀
07/02/20 􀀀􀀀􀀀􀀀􀀀 􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀 􀀀􀀀􀀀􀀀􀀀􀀀􀀀 􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀􀀀"

This is how the content appears on the first page of the downloaded PDF.

The issue can be replicated by downloading with the following link, for example: https://tools.wmflabs.org/wsexport/tool/book.php?lang=ta&fonts=mukta-malar&page=கந்த+சஷ்டி+கவசம்&format=pdf-a5
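For anyone scripting the reproduction, here is a minimal sketch of building that same link with the Tamil page title percent-encoded. The query parameters (`lang`, `fonts`, `page`, `format`) are taken from the link above; the helper name `build_export_url` and its defaults are hypothetical, and nothing here reflects how the tool itself handles the request.

```python
from urllib.parse import urlencode

# Endpoint copied from the reproduction link in this report.
BASE_URL = "https://tools.wmflabs.org/wsexport/tool/book.php"


def build_export_url(page, lang="ta", fonts="mukta-malar", fmt="pdf-a5"):
    """Hypothetical helper: assemble the export link for a given page title."""
    params = {"lang": lang, "fonts": fonts, "page": page, "format": fmt}
    # urlencode percent-encodes non-ASCII text as UTF-8 and turns spaces into "+".
    return f"{BASE_URL}?{urlencode(params)}"


url = build_export_url("கந்த சஷ்டி கவசம்")
print(url)
```

Opening the printed URL in a browser should request the same download as the link above, since browsers percent-encode the title the same way before sending it.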

Event Timeline

Restricted Application added a project: Community-Tech. · Jul 16 2020, 4:10 AM
Restricted Application added a subscriber: Aklapper.
ifried added a subscriber: ifried. · Nov 13 2020, 10:30 PM

This issue now appears to be resolved with the changes we made, according to my tests. Would this be correct, @Samwilson? Also pinging @Balajijagadesh for input. Thanks!

The problem seems resolved now.

ifried closed this task as Resolved. · Nov 14 2020, 2:55 PM
ifried claimed this task.

Great, thanks, everyone! I'll mark this ticket as resolved.