Completely remove Kazakh language converter
Closed, ResolvedPublic

Description

Please completely remove the Converter of the Kazakh language, because of so many bugs and very low usage statistics.
Delete locales:

kk-arab, kk-cyrl, kk-latn
kk-cn, kk-kz, kk-tr

langvars.png (627×1 px, 36 KB)

Only need kk locale

Migrate all translations ...../kk-cyrl to ...../kk

Event Timeline

Peachey88 renamed this task from Completely revome Kazakh language converter to Completely remove Kazakh language converter. Nov 18 2020, 8:10 PM
Peachey88 updated the task description.
Amire80 raised the priority of this task from High to Needs Triage. Nov 19 2020, 7:56 AM

Are you sure it should be completely removed?

I visited Almaty for the Turkic Wikimedia Conference in 2012, and there was a Kazakh man from China who spoke very passionately about the need for a converter to Latin and showed me many Chinese websites in Kazakh in the Arabic script. (I know Russian, but he doesn't, so somebody had to translate what he said from Kazakh to Russian for me.) I'd like to see a wider community discussion about this, with participation from people who use different alphabets.

Does the mere existence of a converter get in the way of doing something?

The reason for low usage is not necessarily that people don't need it. The reason may be that the feature is difficult to use and people aren't aware of it. This can be fixed by better design. And generally, how do you know that the usage is low? Do you have statistics?

And what are the bugs in the converter? Perhaps they can be fixed?

Finally, I've heard on the news that Kazakhstan plans a transition to a new Latin alphabet, so a transliterator may be needed in a major way soon.

I do agree, however, that the extra locales (kz, tr, cn) can probably be removed, and that alphabets are enough, as suggested in T250604, although I'm not sure that it's easy to do. It's possible that they have to be preserved in some way for

Hello. Wikipedia is banned in China. I have started a community consensus discussion: https://kk.wikipedia.org/wiki/Уикипедия:Форум/Техникалық#Түрлендіргіш/Converter. We have also discussed this many times in our community's Telegram group.

Page Views by country: https://stats.wikimedia.org/#/kk.wikipedia.org/reading/page-views-by-country/normal|map|last-month|~total|monthly

Too many bugs:

  1. https://commons.wikimedia.org/wiki/File:Errors_in_Kazakh_Wikipedia_(Mobile_view)_-_2.jpg
  2. https://commons.wikimedia.org/wiki/File:Langvars.png
  3. https://commons.wikimedia.org/wiki/File:Converter_errors.jpg
  4. https://commons.wikimedia.org/wiki/File:Converter_errors_2.jpg
  5. https://commons.wikimedia.org/wiki/File:Converter_errors_3.jpg
  6. https://commons.wikimedia.org/wiki/File:Converter_errors_4.jpg
  7. https://commons.wikimedia.org/wiki/File:Converter_errors_5.png
  8. https://commons.wikimedia.org/wiki/File:Converter_errors_6.png
  9. https://commons.wikimedia.org/wiki/File:MediaWiki_fallback_chains_in_kkwiki.jpg

All these bugs can be addressed. I'll report them separately.

The fact that Wikipedia is banned in China is also not a reason to remove the converter completely. The ban may be lifted in the future.

@Amire80, as you wrote earlier, the converter has a number of problems.

The converter contains both Cyrl2Latn and Latn2Cyrl code. Because the two conversions run at the same time, foreign words are transliterated into Cyrillic, which causes problems.

For example, this is how registered Wikipedia users see the page:

image.png (402×1 px, 242 KB)

And this is how unregistered users see it:

image.png (389×1 px, 236 KB)

As you can see, New York City was transliterated as Неу Ёрк Цітy. This should not happen, and because of problems like this the converter is not needed.

It would be nice to remove Latn2Cyrl, as it transliterates everything written in Latin letters.
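To illustrate the Latn2Cyrl problem described above, here is a minimal Python sketch. The mapping table and the function name `naive_latn2cyrl` are hypothetical and heavily simplified; they are not the actual KkConverter rules. The sketch only shows why a per-character table applied to everything written in Latin letters also rewrites foreign names.

```python
# Hypothetical, heavily simplified Latin-to-Cyrillic table.
# This is an assumption for illustration, NOT the real KkConverter mapping.
LATN_TO_CYRL = {
    "n": "н",
    "e": "е",
    "w": "у",
    "y": "й",
    "o": "о",
    "r": "р",
    "k": "к",
}


def naive_latn2cyrl(text):
    """Transliterate every known Latin letter, with no notion of foreign words."""
    out = []
    for ch in text:
        mapped = LATN_TO_CYRL.get(ch.lower())
        if mapped is None:
            # Characters outside the table (Cyrillic, spaces, ...) pass through.
            out.append(ch)
        else:
            # Preserve the case of the original letter.
            out.append(mapped.upper() if ch.isupper() else mapped)
    return "".join(out)


print(naive_latn2cyrl("New York"))
```

With this toy table, the English name comes out in Cyrillic letters, while text that is already Cyrillic passes through unchanged. A converter of this shape cannot tell a foreign name from Kazakh text in Latin script unless the name is explicitly marked.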

Yes, we would like to remove the locales kk-KZ, kk-CN and kk-TR: if there are script variants, why also have country variants?

I would also like to consider adding an offline converter that converts only the content of the article and does not affect the interface.

@Amire80 The Arabic-script conversion for Kazakh also has errors: digits are converted into old Arabic numerals, whereas the Kazakhs of China use modern Arabic numerals.

If possible, I would like to make corrections, since I know all 3 alphabets of Kazakh.
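The digit issue mentioned above can be sketched in Python. This assumes that "old Arabic numerals" means the Arabic-Indic digits (U+0660 to U+0669) and that "modern Arabic numerals" means the digits 0-9; the function name `convert_digits` and the `use_arabic_indic` flag are hypothetical and do not come from KkConverter.

```python
# Assumption: "old Arabic numerals" = Arabic-Indic digits U+0660..U+0669,
# "modern Arabic numerals" = ASCII digits 0-9.
ASCII_DIGITS = "0123456789"
ARABIC_INDIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))
TO_ARABIC_INDIC = str.maketrans(ASCII_DIGITS, ARABIC_INDIC_DIGITS)


def convert_digits(text, use_arabic_indic=False):
    """Optionally map 0-9 to Arabic-Indic digits; by default leave digits alone."""
    if use_arabic_indic:
        return text.translate(TO_ARABIC_INDIC)
    return text


print(convert_digits("2020"))                         # digits kept as 0-9
print(convert_digits("2020", use_arabic_indic=True))  # digits mapped to U+0660..U+0669
```

Under these assumptions, the complaint corresponds to a converter that always behaves as if `use_arabic_indic=True`, while the readers in question expect the default branch.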

@Amire80 who can solve these problems? @MuratKaribay also found some critical problems. We receive complaints from users every day but we don't have permission to remove this useless converter. I'm so grateful for your help!

Hi, could you file separate, dedicated tasks about problems, please (if not existing yet) by following https://www.mediawiki.org/wiki/How_to_report_a_bug ? Thanks a lot! (See also T199895: Arabic transliteration in Kazakh and Kurdish which might be related.)

Leave it open. It's probably not going to be removed entirely, but at the moment it does have several important bugs, and this task here has good documentation of these. I'll open new tasks separately soon (like, within a month), and close this one.

@Amire80: Would you open these separate tasks? Thanks!

@Amire80 this screenshot shows that translatewiki.net only allows translating to kk-cyrl, not kk. We need to translate to kk, for many reasons.
https://commons.wikimedia.org/wiki/File:Converter_errors_6.png

Winston_Sung subscribed.

Based on the language status, it would be inappropriate to completely remove KkConverter.

However, we could reduce the number of language variants in KkConverter, as they really caused confusion and inconvenience.

We should do T250604: Remove language variants kk-cn, kk-kz, kk-tr from the Kazakh language converter instead.

@Amire80 @Winston_Sung @Aklapper the problem is not solved. I guess you need to remove the converter completely.

It is of course not solved yet, as none of the related changes and tasks have been completed.

I'm afraid this won't solve the problem for unregistered Wikipedia users.
I propose to solve the problem of converter conflicts by deleting or disabling the converter.

Could you provide some examples? There seems to be something here that the converter is not responsible for.

@Winston_Sung you can open the article as incognito and see how unregistered users see the conversion. For example, in this article, the original name of the city New-York is converted to Cyrillic.

That's actually T39617: Do not convert text marked as being in another language with a lang attribute.

A current workaround is provided here:

https://kk.wikipedia.org/wiki/Template:Lang?action=edit

Template:Lang
{{#ifeq:{{{1|}}}|kk|<span lang="{{{1}}}" xml:lang="{{{1}}}">{{{2}}}</span>|<span lang="{{{1}}}" xml:lang="{{{1}}}">-{{{{2}}}}-</span>}}<noinclude>{{Doc}}</noinclude>
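The `-{ ... }-` markup in the template above marks text that the language converter should leave unconverted. A rough Python sketch of that idea follows. `convert_outside_markers` is a hypothetical helper, not MediaWiki code; it does not handle nested markers or the flag syntax that the real markup supports.

```python
import re

# Matches a "-{ ... }-" span and captures its inner text (non-greedy).
NO_CONVERT = re.compile(r"-\{(.*?)\}-", re.DOTALL)


def convert_outside_markers(text, convert):
    """Apply `convert` only to text outside -{...}- spans; emit marked text as-is."""
    parts = []
    pos = 0
    for match in NO_CONVERT.finditer(text):
        # Convert the unmarked text before this span.
        parts.append(convert(text[pos:match.start()]))
        # Keep the protected text untouched and drop the markers.
        parts.append(match.group(1))
        pos = match.end()
    # Convert whatever follows the last span.
    parts.append(convert(text[pos:]))
    return "".join(parts)


# str.upper stands in for a real script converter in this example.
print(convert_outside_markers("abc -{New York}- def", str.upper))
```

Here `str.upper` is only a stand-in for a script conversion: the surrounding text is changed, while the marked name is passed through as written.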
MNeisler subscribed.
MNeisler unsubscribed.

I remember fixing some things several months ago. What problem do you still see? Particularly, what problem do you think unregistered users still see?

What problem do you still see? Particularly, what problem do you think unregistered users still see?

Probably referring to https://kk.wikipedia.org/wiki/Нью-Йорк?variant=kk-cyrl / https://kk.wikipedia.org/wiki/Нью-Йорк?variant=kk-kz if the browser language preference is set to kk-KZ.

Does it actually happen? I managed to set it to "kk", and I don't see a problem. How can I set it to "kk-kz"? In the preferences, I can only choose "Kazakh", which sets it to "kk".

How can I set it to "kk-kz"? In the preferences, I can only choose "Kazakh", which sets it to "kk".

Then that might be something about user content language variant preference.

That's what I wonder about: How can anyone set that preference? Most people don't bother changing their browser's language preference at all, and even if I try to do it, I can't.

Hello! It depends on the language chosen on the phone. Some users use the Kazakh interface and some use the Russian one. That is why the problem appears.

Change 972474 had a related patch set uploaded (by Winston Sung; author: Amir Sarabadani):

[mediawiki/extensions/Wikibase@master] Drop Kazakh language variants

https://gerrit.wikimedia.org/r/972474

Change 972472 had a related patch set uploaded (by Winston Sung; author: Amir Sarabadani):

[mediawiki/core@master] Drop language coverter for Kazakh

https://gerrit.wikimedia.org/r/972472

Change 972472 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] Remove language coverter for Kazakh

https://gerrit.wikimedia.org/r/972472

Change 972474 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Remove Kazakh (kk) language variants

https://gerrit.wikimedia.org/r/972474

Change 972472 merged by jenkins-bot:

[mediawiki/core@master] Remove language coverter for Kazakh

https://gerrit.wikimedia.org/r/972472

Ladsgroup claimed this task.
Ladsgroup subscribed.

This will be deployed in a week or two.

Change 976237 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Remove kk lang conversion tests: kk converter no longer exists

https://gerrit.wikimedia.org/r/976237

Change 976237 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Remove kk lang conversion tests: kk converter no longer exists

https://gerrit.wikimedia.org/r/976237

To-do:

  • Post-deployment checks / QA
    • Check if there are any issues caused by the removal of KkConverter
  • Consider removing user language options based on F33918336
    • langvars.png (627×1 px, 36 KB)

This is resolved, create a separate task for follow ups. Also the second one shouldn't be done, we decided that lang variants are going to stay, just lang conversion is being removed.

We also don't keep tasks open "just in case things go wrong". If things go wrong, then it can be re-opened.

This is resolved, create a separate task for follow ups.

I thought we close tasks when deployed instead of merged?

Also the second one shouldn't be done, we decided that lang variants are going to stay, just lang conversion is being removed.

Emmm... I thought the report is kind of also about the overwhelming language options for kk-* ?

Change 977788 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.19.0-a7

https://gerrit.wikimedia.org/r/977788

Change 977788 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.19.0-a7

https://gerrit.wikimedia.org/r/977788