Interface language selection for unregistered users on Wikimedia projects
Closed, Duplicate (Public)

Assigned To
Authored By
Nikerabbit
Oct 28 2016, 12:58 PM
Referenced Files
F4884563: Spectacle.L19047.png
Nov 29 2016, 6:19 PM
F4884541: pasted_file
Nov 29 2016, 6:13 PM
Tokens
"Like" token, awarded by Steinsplitter. "Like" token, awarded by daniel. "Mountain of Wealth" token, awarded by Danmichaelo. "Mountain of Wealth" token, awarded by Nemo_bis.

Description

Type of activity: Unconference session
Main topic: none

The problem

Interface language selection is only available to registered, logged-in users on Wikimedia projects. This is especially a problem for multilingual projects, including Wikisource, Wikidata and Commons.

Some wikis use workarounds to simulate this feature, but those do not scale and increase our technical debt.

Language selection (both manual and automatic) has been implemented in UniversalLanguageSelector and is already in use (for both registered and unregistered users) on many third-party wikis, given that it is enabled by default.

This issue has been discussed before, but the discussion has stalled because ownership (language? editing? reading?) and the status of blockers are unclear. The main issue seems to be making some trade-offs to work within the current caching infrastructure.

There are two directions that allow gradually going towards the ideal end state.

By functionality:

  • Enable manual language selection
  • Enable automatic language selection

By scope:

  • Enable for multilingual wikis
  • Enable for all wikis

Expected outcomes

  • Look for consensus that this feature is something we want to do.
  • Identify remaining blockers (if any) and document them.
  • Agree on the steps to get towards the goal.

Input from anyone is helpful for making informed decisions, but especially from community representatives (for impact, use cases), operations (technical feasibility), performance (user experience), mobile, and even product.

Current status of the discussion

  • Unclear how to proceed

Draft proposal

  1. Add support for varying caches by value of language cookie.
  2. Enable manual language selection for multilingual wikis
  3. Evaluate feasibility of extending based on increase in resource usage and discuss again
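
Step 1 of the draft proposal could look roughly like the following sketch. It is not Varnish configuration and not an existing MediaWiki API: the cookie name `language`, the allowlist and the default are all hypothetical, chosen only to illustrate the idea of varying the cache on a normalized language value rather than on the raw Cookie header.

```python
from http.cookies import SimpleCookie
from typing import Tuple

# Hypothetical cookie name, allowlist and default; none of these come from the task.
LANGUAGE_COOKIE = "language"
SUPPORTED = {"en", "fi", "sv", "ru", "de"}
DEFAULT_LANGUAGE = "en"


def language_from_cookie(cookie_header: str) -> str:
    """Return the normalized language from a Cookie header, or the default."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(LANGUAGE_COOKIE)
    if morsel is None:
        return DEFAULT_LANGUAGE
    code = morsel.value.strip().lower()
    # Unknown values collapse to the default so they cannot fragment the cache.
    return code if code in SUPPORTED else DEFAULT_LANGUAGE


def cache_key(path: str, cookie_header: str) -> Tuple[str, str]:
    """Build a cache key that varies only by path and normalized language."""
    return (path, language_from_cookie(cookie_header))
```

Keying on the normalized value means two anonymous readers with different session cookies but the same language share one cached copy, so the number of variants per page is bounded by the size of the allowlist. That bound is the resource cost step 3 would need to evaluate.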

The way forward from here could be

  1. Stop expanding
  2. Expand to non-multilingual wikis
  3. Try out automatic language selection on a small multilingual wiki [1]

[1] Possibly using a different approach than reading the Accept-Language request header on the PHP side. For example, JavaScript could suggest a language change if we are confident; accepting the suggestion would then set the language cookie and work like manual language selection.
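
The "suggest only if we are confident" logic from the footnote can be sketched independently of where it runs. The example below is in Python purely to show the decision rule; the footnote envisions client-side JavaScript, where the browser's preferred languages would be the input. The allowlist and the definition of "confident" (the top-ranked preference is supported and differs from the current language) are assumptions, not something the task specifies.

```python
from typing import List, Optional, Tuple

# Hypothetical set of interface languages the wiki can serve.
SUPPORTED = {"en", "fi", "sv", "ru"}


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept-Language value into (primary language, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        q = 1.0  # a missing q parameter means full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                continue  # skip malformed entries
        # Keep only the primary subtag, e.g. "fi-FI" becomes "fi".
        prefs.append((tag.strip().lower().split("-")[0], q))
    # sorted() is stable, so entries with equal weight keep header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def suggest_language(header: str, current: str) -> Optional[str]:
    """Suggest a switch only when the top preference is supported and differs."""
    prefs = parse_accept_language(header)
    if not prefs:
        return None
    top, q = prefs[0]
    if q > 0 and top in SUPPORTED and top != current:
        return top
    return None
```

Because the function only returns a suggestion, the user still has to accept it, at which point the flow is the same as manual selection: the language cookie is set and the cached variant for that language is served.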

See also:

Similar feature requests:

Discussion of the technical background, Varnish caching: T233609: [SPIKE 4hrs] What is technically feasible in terms of logged-in/logged-out users?

Event Timeline

This would have been great in the "Multilingualism" track!

There's a broader question here about "anonymous readers", too -- I wonder if we could have a slightly more general question about how to improve the anonymous reading experience. Many sites will let you create a temporary account based on a cookie which stores your preferences without requiring you to select a username/login. Something like that would let users select their preferred reading language/script and have it persist, w/o requiring them to go through the registration/login process.

Qgil subscribed.

Note that "Building a sustainable user experience together" has been closed as a main topic. Specific session proposals still might be pre-scheduled, based on their own merit.

Serving different content from the same URL is not nice in general. At least for multi-lingual wikis with localized content, such as Commons and Wikidata, it would be important to use different URLs for the different language versions, to allow linking and bookmarking of specific renderings. Besides that, it is good practice to use different URLs for different content.

For this reason, I propose T114662: RFC: Per-language URLs for multilingual wiki pages as a solution for the caching aspect of this problem, instead of splitting the web cache based on cookies.

I did not see your comment here before I replied in T114662. There is an argument to be made about whether the content in Commons and Wikidata is different in different interface languages: my opinion is that it is the same content with slightly different representation (translated).

Linking and bookmarking is already possible, though cumbersome, but it can be made easier.

Your solution would also create pressure on caches. That has a cost and in this task, aside from the implementation details, I am seeking consensus for taking that cost.
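
For reference, the linking and bookmarking that is possible today goes through MediaWiki's `uselang` query parameter. A minimal sketch of building such a link is shown below; the helper name `with_uselang` is made up for this example, and the URLs are only illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_uselang(url: str, language: str) -> str:
    """Return url with its uselang query parameter set to language."""
    parts = urlsplit(url)
    # Drop any existing uselang so the result carries exactly one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", language))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_uselang("https://www.wikidata.org/wiki/Q42", "fi"))
```

Every language produces a distinct URL this way, which is what makes the link shareable, and it is also why any scheme that makes such URLs cacheable multiplies the number of cached objects per page.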

@Nikerabbit

my opinion is that it is the same content with slightly different representation (translated).

Looks like this is our fundamental disagreement. To me, the language is not a minor detail to be hidden, but a major feature to be exposed.

Linking and bookmarking is already possible, though cumbersome, but it can be made easier.

But uselang URLs are not cached, and (IIRC) not indexed.

Your solution would also create pressure on caches.

Yes it would, thanks for mentioning that. I failed to make this explicit in my RFC, since the reason I created it in the first place was the request to make the language versions cacheable.

@Nikerabbit

my opinion is that it is the same content with slightly different representation (translated).

Looks like this is our fundamental disagreement. To me, the language is not a minor detail to be hidden, but a major feature to be exposed.

For Wikidata it might make sense, but I am not sure how it would interact with the existing language selection mechanism. They seem already to use ULS when JavaScript is enabled to display some other languages besides English. Perhaps search engines should see it all?

For Commons, my understanding is that it will affect translated image descriptions, but not much more besides the license blurbs etc. Is there something else?

Looks like this is our fundamental disagreement. To me, the language is not a minor detail to be hidden, but a major feature to be exposed.

For Wikidata it might make sense, but I am not sure how it would interact with the existing language selection mechanism. They seem already to use ULS when JavaScript is enabled to display some other languages besides English.

But not for anons, because of caching! That's why I started all this! When you are logged out and click the ULS icon, it tells you to log in, nothing else.

We would of course continue to use ULS - it would just cause people to go to a different URL.

Perhaps search engines should see it all?

Of course! But how? I don't know of a way to tell Google to index different versions of a page that are served from the same URL.

For Commons, my understanding is that it will affect translated image descriptions, but not much more besides the license blurbs etc. Is there something else?

The license blurbs, but also the image descriptions, if they were localized.
And once we have structured media meta-data (aka MediaInfo), all the meta-data will be localized, just like on Wikidata.

For policy pages, help pages, etc that are using the Translate extension, this would not (yet) change anything.

also the image descriptions, if they were localized

Uh? Do you mean you also want to "command" LanguageSelect with this URL thing?

But not for anons, because of caching! That's why I started all this! When you are logged out and click the ULS icon, it tells you to log in, nothing else.

I mean this. This is using ULS's suggested languages to show Swedish, Finnish and Russian to me, even though I am not logged in.

pasted_file (555×972 px, 77 KB)

Of course, anonymous users cannot actually select interface language currently, that's what you are saying.

Uh? Do you mean you also want to "command" LanguageSelect with this URL thing?

It sets the effective user language, just like uselang. In fact, it sets uselang. It does everything uselang does.

@Nikerabbit Ah I see what you mean. But that only shows the translated labels, in a few languages.

When you change your user language, everything on the page changes language. And we want that to be seen by search engines. For every language.

When you change your user language, everything on the page changes language.

Not really. Only the interface. :) (Which for most people is that part on the three sides of the screen which they never look at.)

When you change your user language, everything on the page changes language.

Not really. Only the interface. :) (Which for most people is that part on the three sides of the screen which they never look at.)

We were talking about Wikidata. On Wikidata, everything on the page changes when you switch the user language.
This will also be the case on Commons in the future, and is already partially the case.
In both cases, the (effective) content language (or rather, the effective parser target language) is determined by the user's interface language (that is, by uselang).

I'd also like to make this happen for content that uses the Translate extension (Help pages, etc), but that needs to be discussed separately.

I'm proposing this as the definition of multilingual wiki: the content is (or tries to be) in the user interface language.

I'd also like to make this happen for content that uses the Translate extension

ULS already tries to switch users to the correct subpage for their language, since we set $wgTranslatePageTranslationULS to true.

I'm proposing this as the definition of multilingual wiki: the content is (or tries to be) in the user interface language.

Seems quite backwards.

@Nikerabbit Hey! As the developer summit is less than four weeks from now, we are working on a plan to incorporate the ‘unconference sessions’ that have been proposed so far and those that will be generated on the spot. Could you confirm whether you plan to facilitate this session at the summit? If your answer is 'YES', I would like to encourage you to update and arrange the task description fields to appear in the following format:

Session title
Main topic
Type of activity
Description Move ‘The Problem,' ‘Expected Outcome,' ‘Current status of the discussion’ and ‘Links’ to this section
Proposed by Your name linked to your MediaWiki URL, or profile elsewhere on the internet
Preferred group size
Any supplies that you would need to run the session e.g. post-its
Interested attendees (sign up below)

  1. Add your name here

We will be reaching out to the summit participants next week asking them to express their interest in unconference sessions by signing up.

To maintain consistency, please consider referring to the template in the following task description: https://phabricator.wikimedia.org/T149564.

AFAIK this event didn't happen (though I'm not sure why). Can the task be closed, or should we reuse it for some other discussion format (RfC? IRC session? office hours?)?