
Some languages in UploadWizard's dropdown list are displayed twice
Closed, Resolved · Public

Description

Some languages in UploadWizard's dropdown list are displayed twice. This seems to be caused by the backwards-compatibility language codes that we have.

  • Aromanian (rup and roa-rup)
  • Samogitian (sgs and bat-smg)

image.png (968×1 px, 122 KB)

Event Timeline

Restricted Application added a subscriber: Aklapper.

The same applies to "Belarusian (Tarashkievica)" (one entry is displayed in Belarusian and the other in English).

Also take a look at the Crimean Tatar language. I think "Crimean Tatar" and "Crimean Turkish" are the same.

crh.png (961×1 px, 71 KB)

dr0ptp4kt triaged this task as Medium priority. Aug 7 2017, 4:28 PM
dr0ptp4kt moved this task from Untriaged to Needs Design on the Multimedia board.
dr0ptp4kt added subscribers: Nirzar, dr0ptp4kt.

@Nirzar what treatment do you suggest?

Short-term fix:

  1. If two entries have the same language name, prefix each name with its language code in brackets.

e.g. (RUP) Aromanian | (ROA-RUP) Aromanian

Use a prefix rather than a suffix, because a suffix could be lost to truncation.
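
The short-term fix could be sketched roughly as follows, in Python. This assumes the dropdown is built from a mapping of language code to display name; the function name `prefix_duplicate_names` and the sample data are hypothetical and not taken from UploadWizard.

```python
from collections import Counter


def prefix_duplicate_names(languages):
    """Return display labels, prefixing the code only where names collide.

    `languages` maps language code -> display name (assumed input shape).
    """
    name_counts = Counter(languages.values())
    labels = {}
    for code, name in languages.items():
        if name_counts[name] > 1:
            # Prefix rather than suffix, so truncation cannot hide the code.
            labels[code] = f"({code.upper()}) {name}"
        else:
            labels[code] = name
    return labels


# Hypothetical sample data based on the pair named in this task.
sample = {
    "rup": "Aromanian",
    "roa-rup": "Aromanian",
    "de": "German",
}
labels = prefix_duplicate_names(sample)
```

Only the colliding names get a prefix; unique names such as "German" are left unchanged.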

Long-term solution:

  1. Show language codes for every entry, formatted consistently as tags, similar to the language selection in the reading web experience.

> if the language string matches then prefix the string with language code in brackets
>
> e.g. (RUP) Aromanian | (ROA-RUP) Aromanian
>
> prefix because of possibility of truncation.

I think in this case it would be more appropriate to just remove one of the entries – in all of the cases we've seen so far, one of the two codes is deprecated and it's wrong to use it.

If we end up doing this later, be careful with the prefix – it would mess with the sorting order of the list, and it would make it more difficult to type-to-search in the long list.
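
The sorting and type-to-search concern can be illustrated with a small Python sketch. It assumes the list is sorted by plain string comparison and that type-to-search matches from the start of the label; both are assumptions about the dropdown, and the sample labels are hypothetical.

```python
# Hypothetical dropdown labels, two of them carrying a bracketed code prefix.
labels = ["Arabic", "(RUP) Aromanian", "(ROA-RUP) Aromanian", "Basque"]

# A plain sort moves the prefixed entries to the top, because "(" sorts
# before any letter.
sorted_labels = sorted(labels)

# Typing "aro" to jump to an entry no longer matches the prefixed labels
# (assuming matching starts at the beginning of the label).
matches = [label for label in labels if label.lower().startswith("aro")]
```

Under these assumptions, both Aromanian entries sort ahead of "Arabic", and a search for "aro" finds neither of them.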

Change 370874 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/UploadWizard@master] Skip duplicate deprecated language codes in the language dropdown

https://gerrit.wikimedia.org/r/370874
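
The idea in the patch title can be sketched as below. This is not the actual patch: it is a minimal Python illustration, and the `DEPRECATED_CODES` mapping, the function name, and the assumption that `roa-rup` and `bat-smg` are the deprecated halves of each pair are all hypothetical.

```python
# Assumed mapping of deprecated code -> preferred code. Only the two pairs
# named in this task are listed, and which side is deprecated is an assumption.
DEPRECATED_CODES = {
    "roa-rup": "rup",
    "bat-smg": "sgs",
}


def skip_deprecated_duplicates(languages):
    """Drop a deprecated code when its preferred replacement is also present.

    `languages` maps language code -> display name (assumed input shape).
    """
    return {
        code: name
        for code, name in languages.items()
        if not (code in DEPRECATED_CODES and DEPRECATED_CODES[code] in languages)
    }


# Hypothetical sample data.
sample = {
    "rup": "Aromanian",
    "roa-rup": "Aromanian",
    "sgs": "Samogitian",
    "bat-smg": "Samogitian",
    "de": "German",
}
filtered = skip_deprecated_duplicates(sample)
```

A deprecated code is kept when its replacement is missing from the list, so this sketch never removes a language entirely.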

> Also take a look at the Crimean Tatar language. I think "Crimean Tatar" and "Crimean Turkish" are the same.
>
> crh.png (961×1 px, 71 KB)

"Crimean Tatar" and "Crimean Turkish" are two names for a single language, but in this case, there is a difference – the two entries are:

'crh-cyrl' => 'Crimean Turkish (Cyrillic script)',
'crh-latn' => 'Crimean Turkish (Latin script)',

It seems that only one of them has a name in Ukrainian, so it's more confusing (see T172219).

> in all of the cases we've seen so far, one of the two codes is deprecated and it's wrong to use it.

If this is true, then we should remove it as you suggested.

Change 370874 merged by jenkins-bot:
[mediawiki/extensions/UploadWizard@master] Skip duplicate deprecated language codes in the language dropdown

https://gerrit.wikimedia.org/r/370874