Remove Akan support from MediaWiki, ULS, and Wikimedia servers
Open, In Progress, Needs Triage (Public)

Description

The Akan language (ak) was added to MediaWiki in the mid-2000s. It was a mistake: Akan is a language group, which includes the Twi and Fante languages. They are related, but distinct. Both of them are also supported in MediaWiki.

There was a whole Wikipedia in the Akan language, but it was recently closed after there was clear consensus that the language of its content is the same as in the Twi Wikipedia.

There are very few localizations in Akan in translatewiki, and they can also be merged into Twi (I've already checked with a native speaker).

Some links for reference:

To avoid confusion and unnecessary contribution to a wrong language, the support for the Akan language should be removed as much as possible. Existing useful content should be merged into Twi or Fante, and the rest should be deleted.

Some things I can think of:

  • remove ak from Wikimedia servers configuration (patch for ImportSources)
  • remove ak from cxserver configuration (patch)
  • remove ak from translatewiki configuration (patch)
  • move content in ak on Wikidata to tw or fat or delete it. Wikidata query requested here.
  • remove ak from language-data (or at least declare it as deprecated) (pull request)
  • remove ak from Names.php (or at least declare it as deprecated) (patch)
  • remove ak from the Wikimedia Portal
  • move Wikipedia Android app localizations from ak to tw (done in translatewiki by @Amire80; deleted automatically in export)
  • remove ak from mobile apps' configurations. "Both iOS and Android apps have scripts that run periodically and update our list of supported wikis. If a certain wiki has been sunset, it will be picked up automatically and removed." - dbrant (pull request)
  • remove ak as a language from jquery.ime (but maybe unify the code for Twi, Akan, and Nzema) (pull request)
  • remove ak from SiteMatrix language list configuration

If anyone can think of anything more, please add it.

Event Timeline

According to Glottolog, this is a single language, which it names Akan, encompassing the following dialects: ''Agona'', ''Ahafo'', ''Akyem Bosome'', ''Asen'', ''Dankyira'', ''Fante (or Fanti)'', ''Kwawu'', as well as the group of ''Twi'' dialects. (Glottolog, however, does not consider the former dialects to form a single "Fante" group.)

ISO 639 has encoded them as two separate languages, but in Glottolog's view Akan [ak][aka] should be a macrolanguage, and more likely a single language, encompassing its two main groups, ''Fante'' [fat] and ''Twi'' [tw][twi], both of which are seen as separate languages (rather than as dialect groups, as in Glottolog). And this is exactly what ISO 639-3 also indicates: https://iso639-3.sil.org/code/aka.

Macrolanguages are for languages that are highly mutually intelligible. For the Chinese languages this holds at least in their written form, though not in their oral forms (however, [zh/zho] in ISO 639-1/2 was more inclusive than needed, and in ISO 639-3 it is more precisely mapped to [cmn] for Mandarin, but not, for example, to Hakka [hak]). Generally, though, other macrolanguages ignore the written form (including the script used when there are several) and focus on the oral forms, where mutual understanding is possible and the languages mix easily when their respective speakers are in frequent contact, even if each has its own separately defined traditions, sometimes with conflicting views (Serbo-Croatian is a good example).

I don't know if Twi and Fanti speakers really conflict in defining their traditions, or if this apparent separation just has other social or geographical causes: contacts may be frequent but not enough to mix them, as the populations are not integrated enough for ''Akan'' to be viable as a mixed language (possibly with an emerging standard as a "hat language", just like what occurred in most modern European languages, which eroded all the other languages they encompassed, so that those survive only as regional dialects with less defined standards).

Serbo-Croatian, however, is a case where the macrolanguage standard is eroding in favor of its separate member languages, so separating them is a good option to avoid conflicts.

I'm not sure this is the case for Akan, which even with its definition as a macrolanguage remains a minority language within a large set of other Afro-Semitic and European languages, where official languages are recognized and actively supported. Have there been repeated cases of edit conflicts between Fanti and Twi speakers with no way to reconcile them, or are there instead efforts to join them and to stabilize an Akan "hat" standard?

Being a macrolanguage in ISO 639-3 is not enough to say that it is to be "deprecated". It is, however, a good reason to separate language families (like Rajasthani or Bihari in ISO 639-1/2, which should not even have been in ISO 639-3 but only in ISO 639-5 for Bihari, and nowhere for Rajasthani, as these geographical groups are both too ill-defined for use in ISO 639-3 for terminological purposes like translations; they may make sense only for bibliographic purposes in foreign libraries that use much weaker classification criteria than those of ISO 639-3 and ISO 639-2/T).

The real question, then, is: are Fanti and Twi diverging or converging into a shared Akan form?

Change 941372 had a related patch set uploaded (by Amire80; author: Amire80):

[operations/mediawiki-config@master] Remove ak from wgImportSources

https://gerrit.wikimedia.org/r/941372

Change 941372 merged by jenkins-bot:

[operations/mediawiki-config@master] Remove ak from wgImportSources

https://gerrit.wikimedia.org/r/941372

Mentioned in SAL (#wikimedia-operations) [2023-07-31T09:19:31Z] <ladsgroup@deploy1002> Started scap: Backport for [[gerrit:941372|Remove ak from wgImportSources (T333765)]]

Mentioned in SAL (#wikimedia-operations) [2023-07-31T09:20:56Z] <ladsgroup@deploy1002> amire80 and ladsgroup: Backport for [[gerrit:941372|Remove ak from wgImportSources (T333765)]] synced to the testservers mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)

Mentioned in SAL (#wikimedia-operations) [2023-07-31T09:27:42Z] <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:941372|Remove ak from wgImportSources (T333765)]] (duration: 08m 10s)

Change 951090 had a related patch set uploaded (by Srishakatux; author: Srishakatux):

[translatewiki@master] Remove ak from translatewiki

https://gerrit.wikimedia.org/r/951090

Change 951199 had a related patch set uploaded (by Srishakatux; author: Srishakatux):

[mediawiki/core@master] Remove ak from core

https://gerrit.wikimedia.org/r/951199

Change 951199 merged by jenkins-bot:

[mediawiki/core@master] Remove ak from core

https://gerrit.wikimedia.org/r/951199

Change 951090 merged by jenkins-bot:

[translatewiki@master] Remove ak from translatewiki

https://gerrit.wikimedia.org/r/951090

Two items still remain in this task:

  • move content in ak on Wikidata to tw or fat or delete it
  • remove ak from mobile apps' configurations

I am unsure how to approach them and which repositories/files to change. Thoughts?

Note that the name Akan doesn't show up in the SiteMatrix anymore, but it still has a row in there, with 3 closed projects and 5 missing (redlinked) projects. Surely this is not the desired final state of the SiteMatrix in relation to this task...?

Wikidata likely refers to updating labels and statements for Q-items on the wiki. Maybe also interlanguage links if those are still there.

SiteMatrix also lists closed wikis, so I don't see a problem with that. It would be different if those wikis were completely removed, which they have not been.

Even if we keep the language data in the table, how do we bring back the name in the language column on the left?

It's no longer possible to change ak values in Wikidata and Commons in any way because MediaWiki says the language code is invalid.

haslabel:ak now gives an error even though there are thousands of items with ak labels.

On bringing back the name in the language column: autonyms come from Names.php.

Although haslabel:ak now gives an error, SPARQL (and presumably Quarry) still seems able to report ak labels (https://w.wiki/7P28), so that's something.
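For anyone who wants to reproduce that kind of check, here is a minimal sketch of a label lookup against the public query service. The exact query behind the short link above is not shown in this thread, so the query text, the function names, and the LIMIT are assumptions for illustration; an unrestricted scan over all labels may well time out.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_label_query(language_code: str, limit: int = 10) -> str:
    """Return a SPARQL query listing items that carry a label in one language.

    Hypothetical helper: the query shape is an assumption, not the one
    used in the discussion above.
    """
    return (
        "SELECT ?item ?label WHERE {\n"
        "  ?item rdfs:label ?label .\n"
        f'  FILTER(LANG(?label) = "{language_code}")\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET URL that asks the service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(build_label_query("ak"))
print(url)
```

The script only builds the URL; fetching it is left to whichever HTTP client is at hand.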

Can confirm that the inability to change ak values also applies when using wikibase-cli: it returns an API error for both adding and removing ak labels (code: 'badvalue', info: 'Unrecognized value for parameter "language": ak.').
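Bot authors who want to detect this failure rather than crash on it could check for the error shape quoted above. This is a rough sketch: the helper name is hypothetical, and the sample response contains only the two fields reported in this thread (a real API response may carry more).

```python
def is_unrecognized_language_error(response: dict, language_code: str) -> bool:
    """Return True if an API response is the 'badvalue' error for this language.

    Hypothetical helper; it matches only on the code and info text
    quoted in the discussion.
    """
    error = response.get("error")
    if not isinstance(error, dict):
        return False
    return (
        error.get("code") == "badvalue"
        and f'"language": {language_code}' in error.get("info", "")
    )


# Sample built from the error text reported above.
sample = {
    "error": {
        "code": "badvalue",
        "info": 'Unrecognized value for parameter "language": ak.',
    }
}
print(is_unrecognized_language_error(sample, "ak"))
```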

In terms of sitelinks, there are currently 325 akwiki sitelinks on Wikidata (https://w.wiki/7N$e) - all to internal pages (templates, categories, etc). I am guessing these can be more straightforwardly deleted if desired.

Note that Wikidata allows sitelinks to closed wikis, so they shouldn't be removed if the wiki is only going to be closed and not deleted.

Just posted this onwiki, but to keep it all in one place: it looks like there are still 484k items with Akan labels (plus another 146k descriptions and 173k aliases, but I'm guessing those also have labels).

I don't see any obvious way of Wikidata being able to remove these unless ak is temporarily re-enabled as a language code, at which point it would be straightforward for a bot to remove them all, or else copy them to a new language code (tw?) if that's preferred.

Looks like some of the changes will have to be reverted so that other items can be completed. Can someone look into this please?

Autonyms come from Names.php, or from InitialiseSettings.php's wmgExtraLanguageNames; since the problem with ak missing is, as far as I can tell, specific only to Wikidata, I would suggest adding it there.

Change 955007 had a related patch set uploaded (by Srishakatux; author: Srishakatux):

[operations/mediawiki-config@master] Add Akan language

https://gerrit.wikimedia.org/r/955007

If I understood correctly, this Wikidata-specific change was needed in the operations/mediawiki-config repo.

The comment I was replying to with "Autonyms come from Names.php" was about making the name appear on https://meta.wikimedia.org/wiki/Special:SiteMatrix though.

{{#language:ak}} no longer displays an autonym now either (e.g. https://species.wikimedia.org/wiki/Diplopoda)

ak is a macrolanguage containing both tw and fat. Some things can be fixed by a bot, like the items where the same text was copied to all labels, but the rest need to be checked by someone who is able to distinguish the two to determine which one should be used. There are some which include both tw and fat which need to be split (e.g. https://www.wikidata.org/wiki/Q7802), some where the ak label does not match either of the tw or fat labels (e.g. https://www.wikidata.org/wiki/Q41487), some which are in a different language entirely (e.g. https://www.wikidata.org/wiki/Property:P585)...
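To make the split between bot-fixable and human-review cases concrete, here is a rough sketch of how a first-pass triage might look. The category names, the `min_copies` threshold, and the function itself are all hypothetical; it works on a plain mapping of language code to label text and cannot tell Twi from Fante, which still needs a speaker.

```python
def triage_ak_label(labels: dict[str, str], min_copies: int = 2) -> str:
    """Sort an item's ak label into a rough bucket (illustrative heuristic only)."""
    ak = labels.get("ak")
    if ak is None:
        return "no_ak_label"
    # The ak text already exists under tw or fat, so nothing would be lost.
    if ak == labels.get("tw") or ak == labels.get("fat"):
        return "duplicate_of_tw_or_fat"
    others = [text for code, text in labels.items() if code != "ak"]
    # The same string was copied to every label (typical for names).
    if len(others) >= min_copies and all(text == ak for text in others):
        return "copied_to_all_labels"
    # Mixed tw/fat text, mismatches, or another language entirely.
    return "needs_human_review"


print(triage_ak_label({"en": "X", "fr": "X", "de": "X", "ak": "X"}))
```

The first two buckets are the ones a bot could plausibly handle; everything else falls through to review.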

Hello,

Adding more context, the "ak" issue is also breaking some language-agnostic bot code (see: https://github.com/LeMyst/WikibaseIntegrator/issues/617).

Basically, I am unable to write to items that have an "ak" label using WikibaseIntegrator even if I am not touching the "ak" text.

The code fetches the item from the API (with an ak label), adds an Item-valued statement, and then is unable to write the content back because of the ak label.

Pywikibot only modifies objects for which you've provided a key. So you can omit item.labels['ak'] and it won't be changed.
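The same idea can be sketched without any particular bot framework: strip the rejected language from the payload before writing it back, so the API never sees it. This is an illustration on plain dicts shaped like entity JSON, under the assumption that terms omitted from an edit payload are left untouched on the server; the helper name is hypothetical, and whether WikibaseIntegrator offers a hook for this is not established in this thread.

```python
import copy

# Term sections of an entity that are keyed by language code.
TERM_SECTIONS = ("labels", "descriptions", "aliases")


def without_language(entity: dict, language_code: str) -> dict:
    """Return a copy of the entity data with one language's terms left out.

    The input is not modified, so the fetched data stays intact for reference.
    """
    payload = copy.deepcopy(entity)
    for section in TERM_SECTIONS:
        payload.get(section, {}).pop(language_code, None)
    return payload


# Minimal made-up entity data for illustration.
fetched = {
    "labels": {
        "ak": {"language": "ak", "value": "example"},
        "en": {"language": "en", "value": "example"},
    },
    "claims": {},
}
payload = without_language(fetched, "ak")
print(sorted(payload["labels"]))
```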

ABorbaWMF subscribed.

I don't see Akan on the iOS app and "Twi" and "Fante" are available. Tested on 7.4.2 (2657). Moving to signoff.

Interwiki links to akwiki are behaving strangely now. On https://meta.wikimedia.org/wiki/Meta:Administrators, it's displayed with the other links, but shows "ak:Wikipedia:Administrators" where the language name normally is. On https://incubator.wikimedia.org/wiki/Wp/sli/Wikipeedia:Administratoren, it's not recognised as an interwiki link and is displayed as an inline link at the end of the page instead.

Change 955007 merged by jenkins-bot:

[operations/mediawiki-config@master] Add Akan language

https://gerrit.wikimedia.org/r/955007

Mentioned in SAL (#wikimedia-operations) [2023-10-12T13:25:02Z] <kartik@deploy2002> Started scap: Backport for [[gerrit:955007|Add Akan language (T333765)]]

Mentioned in SAL (#wikimedia-operations) [2023-10-12T13:25:51Z] <kartik@deploy2002> kartik and srishakatux: Backport for [[gerrit:955007|Add Akan language (T333765)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2023-10-12T13:34:10Z] <kartik@deploy2002> Finished scap: Backport for [[gerrit:955007|Add Akan language (T333765)]] (duration: 09m 39s)

Limck620 removed Due Date.