We are getting repeated complaints about T123171: the first languages in the Latin alphabet, like ace and af, are stuck in the sidebar, and they appear to be chosen at random. The real reason is usually that there are not enough languages to suggest, so the software picks the first few languages alphabetically.
This is not useful to most readers, because the languages look random and are unlikely to be clicked. While there are several ways to suggest languages more cleverly, such as T70071 and T70077, they may take time to implement, and they are not urgent.
However, there is one simple thing that can be done to improve this: Create a list of 20 or so "universally common languages", which will be shown in the initial compact links list if the usual algorithm runs out of suggestions (for example, prioritizing other Scandinavian languages is useful in the Norwegian Wikipedia even when not browsing from Norway).
For example, if French doesn't appear in the CLDR suggestions for the reader's country, Aramaic may currently be shown there instead, even though French is far more likely than Aramaic to be useful to the reader.
The easiest thing to do is to combine these two lists:
- https://en.wikipedia.org/wiki/List_of_languages_by_total_number_of_speakers
- https://meta.wikimedia.org/wiki/List_of_Wikipedias
... and this will give us:
[ 'zh', 'en', 'hi', 'ur', 'es', 'ar', 'ru', 'id', 'ms', 'pt', 'fr', 'de', 'bn', 'ja', 'pnb', 'pa', 'jv', 'te', 'ta', 'ko', 'mr', 'tr', 'vi', 'it', 'fa', 'sv', 'nl', 'pl' ]
This should be a variable, named something like $wgUniversalLanguageSelectorFallbackCommonLanguages, because it might be useful to customize it for some sites.
To clarify, these languages must only be added to the list as the last fallback. If Afrikaans is actually one of the suggested languages, e.g. from CLDR, and French is not, then Afrikaans must take precedence over French.
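The precedence rule above can be sketched roughly as follows. This is only an illustration, not the actual ULS code: the function name `fillCompactList`, its parameters, and the `limit` argument are all hypothetical, and the array stands in for the proposed configuration variable.

```javascript
// Stand-in for the proposed $wgUniversalLanguageSelectorFallbackCommonLanguages
// (values copied from the list above).
const fallbackCommonLanguages = [
    'zh', 'en', 'hi', 'ur', 'es', 'ar', 'ru', 'id', 'ms', 'pt', 'fr', 'de',
    'bn', 'ja', 'pnb', 'pa', 'jv', 'te', 'ta', 'ko', 'mr', 'tr', 'vi', 'it',
    'fa', 'sv', 'nl', 'pl'
];

/**
 * Hypothetical helper: top up the suggested languages with common ones.
 *
 * @param {string[]} suggested Codes from the usual algorithm (e.g. CLDR), in priority order
 * @param {string[]} available Codes of the languages the page actually links to
 * @param {string[]} commonLanguages The fallback list, in priority order
 * @param {number} limit Maximum number of links in the compact list
 * @return {string[]} Codes to show, suggested languages first
 */
function fillCompactList( suggested, available, commonLanguages, limit ) {
    const availableSet = new Set( available );
    // Suggested languages always come first and are never displaced.
    const result = suggested
        .filter( ( code ) => availableSet.has( code ) )
        .slice( 0, limit );
    // Common languages only fill whatever room is left.
    for ( const code of commonLanguages ) {
        if ( result.length >= limit ) {
            break;
        }
        if ( availableSet.has( code ) && !result.includes( code ) ) {
            result.push( code );
        }
    }
    return result;
}

// Afrikaans is suggested, French is not: Afrikaans stays first.
console.log( fillCompactList( [ 'af' ], [ 'ace', 'af', 'arc', 'de', 'fr' ], fallbackCommonLanguages, 3 ) );
// [ 'af', 'fr', 'de' ]
```

If the common languages also run out, the list would presumably still fall back to the existing alphabetical behavior; that step is left out of the sketch.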
Criteria for inclusion
Languages with:
- Over 50 million speakers
  - "Lahnda" is converted to two varieties of Punjabi, which are its written versions.
- A Wikipedia with:
  - 20,000 articles
  - depth of at least 5
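As a rough sketch, the criteria could be applied to combined data from the two lists like this. The record shape (`code`, `speakers`, `articles`, `depth`) and the function names are hypothetical, the sample records are made-up placeholders rather than real statistics, and reading "20,000 articles" as "at least 20,000" is an assumption.

```javascript
// Thresholds taken from the criteria above.
const MIN_SPEAKERS = 50e6;
const MIN_ARTICLES = 20000; // assumed to mean "at least 20,000"
const MIN_DEPTH = 5;

// Hypothetical record shape: { code, speakers, articles, depth }.
function meetsCriteria( lang ) {
    return lang.speakers > MIN_SPEAKERS &&
        lang.articles >= MIN_ARTICLES &&
        lang.depth >= MIN_DEPTH;
}

// Keeps the input order, so the caller decides how the candidates are sorted.
function selectCommonLanguages( candidates ) {
    return candidates.filter( meetsCriteria ).map( ( lang ) => lang.code );
}

// Placeholder records with invented numbers, for illustration only.
const sample = [
    { code: 'x1', speakers: 60e6, articles: 25000, depth: 10 },
    { code: 'x2', speakers: 40e6, articles: 25000, depth: 10 }, // too few speakers
    { code: 'x3', speakers: 80e6, articles: 10000, depth: 10 }, // too few articles
    { code: 'x4', speakers: 80e6, articles: 30000, depth: 3 } // depth too low
];

console.log( selectCommonLanguages( sample ) );
// [ 'x1' ]
```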