
Show languages that a wiki community added to the top of the interlanguage list in the Compact list
Closed, Resolved · Public

Description

According to https://meta.wikimedia.org/wiki/Interwiki_sorting_order, some communities asked for particular languages to appear at the top of the interlanguage list. They appear in the Interwiki_config-sorting_order message. These are usually languages that are likely to be known to the speakers of the wiki's content language.

Examples:

They can be added to the compact list.

Comment: Given Niklas's comment at the bottom, this should be done by reading the sortPrepend configuration and not the MediaWiki:Interwiki_config-sorting_order message. This is more robust.

https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/InitialiseSettings.php#L16107-L16214

Details

Related Gerrit Patches:
mediawiki/extensions/UniversalLanguageSelector : master · CompactLinks: support sortPrepend from WikiBaseClient

Event Timeline

Amire80 created this task. Jun 29 2016, 7:09 PM
Restricted Application added subscribers: Zppix, Aklapper. Jun 29 2016, 7:09 PM
Amire80 updated the task description. Jun 29 2016, 7:09 PM

I will add such a page on ladwiki when this is ready to implement.

Restricted Application added a project: UniversalLanguageSelector. Jun 29 2016, 7:42 PM

For the Scottish Gaelic (gd) wiki, Irish Gaelic (ga) should certainly be top of the interlanguage list, probably followed by Manx Gaelic (gv). And vice-versa, for the Irish Gaelic wiki, Scottish Gaelic should be top of the Interlanguage list. What should we do to get this implemented? - Start a formal discussion and vote on the wiki, and then put in a formal request once we have consensus?

Technically, editing MediaWiki:Interwiki_config-sorting_order by a local admin should be enough, but I'm pinging @Lydia_Pintscher to check whether it's enough for Wikidata to pick it up. Also, please read https://meta.wikimedia.org/wiki/Interwiki_sorting_order .

How to make the community decision about this is up to your community :)

Thanks for the interest!

AFAIK Wikidata/Wikibase is doing nothing special there. It'd need adapting the compact language links feature?

daniel added a comment. Edited Jul 1 2016, 9:15 AM

Wikibase does apply sorting to language links, governed by the interwikiSortOrders and sort options in $wgWBClientSettings, and implemented by the InterwikiSorter class. As far as I know, this happens before the "compact-interwiki" feature is applied. I suppose compact-interwiki will sort again, ignoring the original order - but I could be wrong. It may be a good idea to investigate how Wikibase interacts with the compact-interwiki stuff. @aude may know more about this.

Making Wikibase aware of MediaWiki:Interwiki_config-sorting_order independently of the compact-interwiki feature would perhaps be good, though. This is particularly relevant in the context of Extension:Cognate: at Wikimania, I talked to @Nikola_Smolenski and @gabriel-wmde about having a separate extension for sorting interwiki links, so the same code can be used with Wikibase and with Cognate. I suppose we should also have the compact-interwiki feature in mind when thinking about this.

I suppose compact-interwiki will sort again, ignoring the original order

ULS-CompactLinks works by hiding the interwiki items that are not chosen as candidates by our compacting strategy. There is no reconstruction of interwiki links list. So any sorting order coming from server side will be respected. Here is an example:

https://hu.wikipedia.org/wiki/MediaWiki:Interwiki_config-sorting_order defines English as top order.

Here is the screenshot of compact links (after the beta feature was enabled)

Also the HTML

So, as long as we have a sorted list that respects the sort options, compact-links won't change that order. However, note that the language selector that opens from the "X More languages" button will show the languages sorted and arranged based on geographic regions and script groups.
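A minimal sketch of the hiding behaviour described above, written in Python rather than the extension's own JavaScript. The function name `compact` and the sample language codes are hypothetical. It only illustrates why a server-side order survives compaction: the list is filtered, never re-sorted.

```python
def compact(server_ordered_links, candidates):
    """Keep only the candidate languages, preserving the server-side order."""
    wanted = set(candidates)
    return [code for code in server_ordered_links if code in wanted]


# Hypothetical list in which the server has already put English first.
links = ["en", "de", "fr", "ro", "sk", "sr"]
print(compact(links, ["sk", "en", "ro"]))  # ['en', 'ro', 'sk']
```

The order in which candidates were chosen has no effect on the result; only the order of the incoming list does.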

Wikibase does apply sorting to language links, governed by the interwikiSortOrders and sort options in $wgWBClientSettings, and implemented by the InterwikiSorter class.

There is also the sortPrepend option, which is what is needed here.
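As a rough illustration of what a prepend option does, here is a Python sketch. It is not the InterwikiSorter implementation: the function name `apply_sort_prepend` is hypothetical, and skipping prepend languages that have no link on the page is an assumption.

```python
def apply_sort_prepend(sorted_codes, sort_prepend):
    """Move the sort_prepend languages to the front, in the configured order."""
    present = set(sorted_codes)
    # Assumption: a prepend language without a link on this page is skipped.
    head = [code for code in dict.fromkeys(sort_prepend) if code in present]
    head_set = set(head)
    tail = [code for code in sorted_codes if code not in head_set]
    return head + tail


print(apply_sort_prepend(["de", "en", "ga", "gv", "sv"], ["ga", "gv"]))
# ['ga', 'gv', 'de', 'en', 'sv']
```

The remaining languages keep whatever order the regular sort produced.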

Making Wikibase aware of MediaWiki:Interwiki_config-sorting_order independently of the compact-interwiki feature would perhaps be good, though.

A potential problem here is that whenever a new language wiki is added, sorting orders on all Wikipedias would have to be updated. I'm not sure if there ever was a way to do it well.

sortPrepend certainly sounds like the most promising facility mentioned so far. We could use it (after discussion and consensus) to put Irish Gaelic (ga) at the top of the interwiki links on the Scottish Gaelic (gd) Wikipedia. I see now from https://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php that various wikis are already using this facility.

At least, it would have been promising were it not that compact interwiki lists has just been enabled by default on gdwiki. And if I understand correctly what people have been saying, this means that Irish Gaelic is very unlikely to appear in the interwiki list at all, even if it is top of the sort order. For IP users, it will not appear in the interwiki list at all, except possibly for users in the Irish Republic. For logged-in users it will not appear unless they have selected Irish Gaelic as an accept-language in their browser options, which is extremely unlikely, or unless they have previously sought out and used the Irish Gaelic Wikipedia, which again is rather unlikely. The IP facility in compact interwiki lists is presently coming up with a very unhelpful selection of languages, which is what resparked my interest in this topic, which has been on my mind for some time.

What I think would be very useful is if the sortPrepend parameter was used to ensure the languages in question always appeared in the interwiki list, at the top, regardless of whether or not compact interwiki lists was in operation.

This is an issue for lots and lots of small languages, which may still have only scanty coverage of many topics, but which may be close enough to other languages, either major or minor, for mutual understanding. The Corsicans are asking for shortlist links to Sardu and Sicilianu, for example (https://www.mediawiki.org/wiki/User_talk:Runab_WMF). This issue arose in another project which I was involved in (http://multidict.net/) and caused me to add an alternate/cognate language facility (http://multidict.net/multidict/languages.php).

This is an issue for lots and lots of small languages, which may still have only scanty coverage of many topics, but which may be close enough to other languages, either major or minor, for mutual understanding.

For small languages there is an interesting issue: they may cover fewer topics, but many of those topics are likely to be covered by many other languages. In those cases, the status quo of a flat list of hundreds of languages does not help much. Once the current ticket is solved, languages considered relevant by the community will be much easier to find.

And if I understand correctly what people have been saying, this means that Irish Gaelic is very unlikely to appear in the interwiki list at all, even if it is top of the sort order. For IP users, it will not appear in the interwiki list at all, except possibly for users in the Irish Republic. For logged-in users it will not appear unless they have selected Irish Gaelic as an accept-language in their browser options, which is extremely unlikely, or unless they have previously sought out and used the Irish Gaelic Wikipedia, which again is rather unlikely.

I found it interesting that you considered it unlikely that users had navigated from the Scottish Gaelic Wikipedia to the Irish Gaelic Wikipedia. If they are related languages, I'd expect users to navigate across them, and I'd expect that to be more likely with a list they can search (as Compact language links provides) than with a really long list they need to scan (the usual interlanguage list).

Scottish Gaelic and Irish Gaelic are not so close that Scottish Gaelic speakers are likely to be immediately aware of the possibility of trying the Irish Gaelic wikipedia. Lots of users of the Scottish Gaelic wikipedia will not even be aware that the Irish Gaelic wikipedia exists, that it has a lot of articles on topics likely to be of interest to Scottish Gaelic speakers (and vice-versa), and that with a bit of effort it can be understood - especially with a bit of help from tools such as www.intergaelic.com which are starting to appear. Some languages are so close together that there is a very very high level of mutual understanding: written Norwegian (nb) and written Danish (da) for example, but Scottish Gaelic and Irish Gaelic are not that close, and Manx Gaelic is harder to understand because of the different spelling system.

Because the three native language names (endonyms) are so similar and close together in alphabetic order: Gaeilge, Gaelg and Gàidhlig, users were actually much more likely to spot the other languages in the long list than they are now with the Compact language links - which is why promoting closely related languages to the top of the list has become even more useful.

Thanks @Caoimhin for the details and additional context!

As I see it, the actual code change that is needed is that compact interwiki lists should have the feature to "force" some languages into the list - either by having a configuration parameter for that, and/or by recognizing Wikibase's sortPrepend.

sortPrepend seems ready made for the task. Some wiki communities have already decided what is important for them (https://noc.wikimedia.org/conf/highlight.php?file=InitialiseSettings.php), and it otherwise would become redundant when compact interwiki lists are introduced.

The only question which occurred to me is whether there ought to be two parameters: a longer list with up to half a dozen languages to be recommended to IP users, and a shorter list of one or two languages to be recommended to all users? Or perhaps allow for adding a weighting to languages in sortPrepend, to be combined with the weighting from the user's Wikipedia browsing history (if any), and the browser accept-languages, etc? We would then give a very high weighting to Irish Gaelic, lower to Manx Gaelic, and lower still to a few other languages.
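One way the weighting idea could be sketched, in Python. Everything here is hypothetical: the function name `rank_languages`, the signal dictionaries and the weight values are invented for illustration, and nothing in this thread says such a scheme was implemented.

```python
def rank_languages(available, signals, limit):
    """Sum per-language weights across signals and return the top `limit` codes.

    `signals` is a list of {language code: weight} dicts, for example one for
    community weights, one for browsing history and one for accept-languages.
    Ties keep the order of `available`; languages scoring zero are dropped.
    """
    scores = {code: 0.0 for code in available}
    for signal in signals:
        for code, weight in signal.items():
            if code in scores:  # ignore languages with no link on this page
                scores[code] += weight
    ranked = sorted(available, key=lambda code: -scores[code])
    return [code for code in ranked if scores[code] > 0][:limit]


community = {"ga": 10.0, "gv": 5.0, "en": 2.0}  # hypothetical gdwiki weights
history = {"en": 4.0, "fr": 1.0}                 # previously visited wikis
accept_languages = {"en": 3.0}                   # from the browser

available = ["cy", "de", "en", "fr", "ga", "gv"]
print(rank_languages(available, [community, history, accept_languages], 3))
# ['ga', 'en', 'gv']
```

With these made-up numbers Irish Gaelic stays on top, while a language the reader actually uses (English here) can still overtake a lower-weighted community pick such as Manx Gaelic.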

To clarify the scope:
The language codes that appear in the Interwiki_config-sorting_order message in the same wiki will be added as the languages shown in the Compact list. Explicit choices by the user will take precedence. Their priority will be before the geo-located languages.
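The precedence described in this scope statement can be sketched as follows, in Python. This is not the CompactLinks code: the function name `pick_compact_languages`, its parameter names and the default limit of 9 are assumptions made for the example.

```python
def pick_compact_languages(available, user_choices, site_picks, geo_languages, limit=9):
    """Fill the compact list from prioritized sources until `limit` is reached.

    Priority: the user's explicit choices, then the languages the community
    configured for the site, then geo-located languages.
    """
    on_page = set(available)
    chosen = []
    for source in (user_choices, site_picks, geo_languages):
        for code in source:
            if len(chosen) >= limit:
                return chosen
            # Only languages that have a link on this page can be shown.
            if code in on_page and code not in chosen:
                chosen.append(code)
    return chosen


available = ["cy", "de", "en", "fr", "ga", "gv"]
print(pick_compact_languages(
    available,
    user_choices=["fr"],
    site_picks=["ga", "gv"],
    geo_languages=["en", "cy", "ga"],
    limit=4,
))
# ['fr', 'ga', 'gv', 'en']
```

The result is the set of languages to keep visible, listed in priority order. The order in which they are displayed is still the server-side order, since the compact list only hides items.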

To clarify: I recommend sortPrepend to avoid the issues with sr and sv for example.

Amire80 triaged this task as High priority. Jul 20 2016, 6:00 AM
Amire80 updated the task description. Jul 20 2016, 7:12 AM
Cocu awarded a token. Jul 22 2016, 10:03 AM
Nikerabbit moved this task from Backlog to In Review on the Language-Q1-2016-17 Sprint 2 board.

Change 301113 had a related patch set uploaded (by Nikerabbit):
CompactLinks: support sortPrepend from WikiBaseClient

https://gerrit.wikimedia.org/r/301113

Change 301113 merged by jenkins-bot:
CompactLinks: support sortPrepend from WikiBaseClient

https://gerrit.wikimedia.org/r/301113

Nikerabbit updated the task description. Aug 4 2016, 7:51 AM
Amire80 closed this task as Resolved. Aug 4 2016, 9:19 AM

Verified in production in the Hebrew Wikipedia.

Amire80 moved this task from QA to Done on the Language-Q1-2016-17 Sprint 2 board. Aug 4 2016, 9:19 AM

Thanks for this! I guess now we should have a discussion over on gdwiki, and then submit a request for 'sortPrepend' => [ 'ga' ] or [ 'ga', 'gv' ] or whatever we agree? I am still a bit confused, though, because I had a look at pdcwiki and pflwiki and the languages appearing at the top of the compact language list seemed to be more influenced by 'wgImportSources' than by 'sortPrepend'.

I might be wrong, but wgImportSources isn't used anywhere in Compact Links, and I do see German and English at the top, which are the sortPrepend languages.

For Gaelic, and for all other languages, only a small number of languages should be added if at all, and only if the other compacting strategies don't give a result that is useful to readers.

Thanks for the advice!

For pdcwiki, I had been looking at the homepage and was surprised to see Pälzisch, a very closely related language, near the top of the list. But maybe the homepage has its own rules? When I look at other pages (e.g. Nei Yarrick Schtadt; Berlin), I do indeed see just the sortPrepend languages, Deutsch and English, at the top of the list.

It's because they limited the number of languages on the Main Page and used {{noexternallanglinks}} there.

Compact Links respects this setting and only uses the languages that are defined on the page, but the fact that Compact Links exists makes such an approach less necessary.

Thanks. I understand better now.

This mechanism has stopped working. See the comment I added to https://phabricator.wikimedia.org/T153900.