
Old accented Russian characters in filenames not displayed in Chrome
Open, LowestPublic

Description

Many of my batch uploads use non-English filenames. Even after investing a lot of time in understanding Python encoding and Unicode, I still see failures, and this one, in old-fashioned accented Russian, is particularly odd. When I manipulate the name, it displays correctly in pop-ups, in the browser URL input, and in reports I create using Unicode that reuse the name as it exists, yet the image page shows apparent bad characters:

https://commons.wikimedia.org/wiki/File:Kosmos,_biblii%CD%A1a_prirody_-_sochinenie_A.N._Benera;_Perevod_s_ni%CD%A1emet%CD%A1skago_(1870)_(14801829943).jpg

Is this a known error, or something that can be fixed in how Wikimedia Commons displays filenames?

Event Timeline

What problem is this task about specifically? About issues with the uploading process or about displaying on Commons? How do you "manipulate it" exactly? What is the correct name and what is the incorrect name?

Or to summarize my questions: Could you please consider splitting your task description into sections like "Steps to reproduce", "Expected outcome" and "Actual outcome"?
That would make this task easier to understand... Thanks in advance.

The issue is that the accented name appears fine in links, popups, etc. (see the first image below), but when the exact same Unicode text is used for a filename, it is displayed as bad character marks on the Wikimedia Commons image page (see the second image).

Screen Shot Link.png (38×1 px, 22 KB)

Screen Shot Filename.png (94×1 px, 35 KB)

This may actually be a browser / OS font rendering issue -- on Mac OS X I see the boxes when loading https://commons.wikimedia.org/wiki/File:Kosmos,_biblii%CD%A1a_prirody_-_sochinenie_A.N._Benera;_Perevod_s_ni%CD%A1emet%CD%A1skago_%281870%29_%2814801829943%29.jpg in Chrome, but I see the combining character in Firefox.
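To check which character is involved, the percent-encoded title from the URL above can be decoded and its non-ASCII code points listed. The following is only a minimal sketch using the Python standard library; the names `ENCODED_TITLE`, `non_ascii_report` and `is_nfc_stable` are made up for illustration and are not part of any Commons or MediaWiki tooling.

```python
import unicodedata
from urllib.parse import unquote

# File title copied from the URL in this task (percent-encoded UTF-8).
ENCODED_TITLE = (
    "Kosmos,_biblii%CD%A1a_prirody_-_sochinenie_A.N._Benera;"
    "_Perevod_s_ni%CD%A1emet%CD%A1skago_(1870)_(14801829943).jpg"
)


def non_ascii_report(encoded):
    """Decode a percent-encoded title and describe each non-ASCII character.

    Returns a list of (index, code point, Unicode name, general category).
    """
    title = unquote(encoded)  # unquote() decodes the bytes as UTF-8 by default
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"),
         unicodedata.category(ch))
        for i, ch in enumerate(title)
        if ord(ch) > 127
    ]


def is_nfc_stable(encoded):
    """Return True if NFC normalization leaves the decoded title unchanged."""
    title = unquote(encoded)
    return unicodedata.normalize("NFC", title) == title


if __name__ == "__main__":
    for entry in non_ascii_report(ENCODED_TITLE):
        print(entry)
    print("NFC-stable:", is_nfc_stable(ENCODED_TITLE))
```

Every `%CD%A1` sequence decodes to U+0361 COMBINING DOUBLE INVERTED BREVE, a non-spacing mark (category `Mn`) that has no precomposed form, so normalization leaves the title unchanged. That is consistent with the text itself being valid and the boxes coming from the font used to render it.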

If I go into the web inspector in Chrome and remove "Linux Libertine" from the font-family list for ".mw-body h1, .mw-body h2" selector, it starts rendering correctly. Note that I do not have a font called "Linux Libertine" on my system!

On more careful inspection, it's not 'Linux Libertine' that's the problem but 'Georgia' (which was getting quasi-removed from the list while I was backspacing over 'Linux Libertine').

So.... evil Microsoft fonts. ;)

Yes, that makes sense. I've been using Chrome, and when I shift over to Firefox the Commons image page renders its Filename perfectly well.

Okay, so 'Georgia' is being naughty in Chrome, yet the same browser correctly displays the name in links (presumably not using Georgia). Is there a wider bug to be reported here, with implications for which font is chosen to display file names on Commons image pages so that they work across platforms and languages?

I am using Safari on my iPad, and my browser renders the filename correctly.

This is strange in Chrome. It renders correctly in the address bar and tabs, but not in the page content.

I am using Google Chrome version 48.0.2564.116 (Official Build) m (32-bit), revision 700a0e589ecfa7e0f65cace17e2f75470c4adf9d-refs/branch-heads/2564@{#706}, and OS is Microsoft Windows XP Version 5.1.2600 SP3.

Old accented Russian characters in filenames (Chrome).PNG (800×1 px, 430 KB)

Aklapper renamed this task from "Old accented Russian characters in filenames" to "Old accented Russian characters in filenames not displayed in Chrome". Mar 14 2016, 11:14 AM
Aklapper triaged this task as Lowest priority.