
Allow anonymous users to change interface language on Wikimedia wikis with ULS
Open, Low, Public

Description

Since the previous bugs that I thought would address this issue were closed (T35677#382858 and T44186#474966), I'm opening one specifically about the request to allow interface language selection by unregistered users (fix T5665) via UniversalLanguageSelector.

Context

On our biggest multilingual wikis, Commons and Wikidata, the (monolingual) cache gets bypassed by using a gadget which makes everyone use the uselang URL parameter. Hence, as discussed in T114662, there are no significant performance problems with making the cache more permissive.

Using Commons' AnonymousI18N.js globally is not feasible until we can have central gadgets to deploy it (T31272: Implement Gadgets 2.0).

Such a feature would help, for example, in this discussion: w:pt:Wikipédia:Esplanada/propostas/Uso do português de Portugal, pt-PT (4mar2012)

The problem

Interface language selection is only available to registered, logged-in users on Wikimedia projects. This is especially a problem for multilingual projects, including Wikisource, Wikidata and Commons.

Some wikis use workarounds to simulate this feature, but those do not scale and increase our technical debt.

Language selection (both manual and automatic) has been implemented in UniversalLanguageSelector and is already in use (by both registered and unregistered users) on many third-party wikis, since it is enabled by default.

This issue has been discussed before, but it has stalled because of unclear ownership (language? editing? reading?) and the unclear status of blockers. The main issue seems to be making some trade-offs to work within the current caching infrastructure.

There are two directions that allow gradually going towards the ideal end state.

By functionality:

  • Enable manual language selection
  • Enable automatic language selection

By scope:

  • Enable for multilingual wikis
  • Enable for all wikis

Draft proposal

  1. Add support for varying caches by value of language cookie.
  2. Enable manual language selection for multilingual wikis
  3. Evaluate feasibility of extending based on increase in resource usage and discuss again
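
As a rough illustration of step 1, varying the cache by the language cookie amounts to making the language part of the cache key. The sketch below is Python rather than actual Varnish configuration, and the cookie name, the allow-list and the default language are all hypothetical placeholders, not values taken from ULS or from production.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie name, allow-list and default; none of these come from the task.
LANG_COOKIE = "language"
SUPPORTED = {"en", "de", "fr", "pt", "pt-br"}
DEFAULT_LANG = "en"


def cache_key(url, cookie_header):
    """Return (url, language) so that each language gets its own cache variant."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get(LANG_COOKIE)
    lang = morsel.value.lower() if morsel else DEFAULT_LANG
    # Collapse unknown values into the default so that arbitrary cookie
    # values cannot create an unbounded number of variants.
    if lang not in SUPPORTED:
        lang = DEFAULT_LANG
    return (url, lang)
```

Unrelated cookies do not affect the key, so requests without a language cookie would keep sharing a single variant per page.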

The way forward from here could be

  1. Stop expanding
  2. Expand to non-multilingual wikis
  3. Try out automatic language selection on a small multilingual wiki [1]

[1] Possibly using a different approach than reading the Accept-Language request header on the PHP side. For example, JavaScript could suggest a language change when we are confident; accepting the suggestion would then set the language cookie and work like manual language selection.
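
One possible reading of "suggest a language change if we are confident" is sketched below. It is written in Python only to show the decision logic; in practice this would run in client-side JavaScript. The function name, its inputs and the rule itself (suggest only the reader's most-preferred supported language, and only when it differs from the current one) are assumptions, not existing ULS behaviour.

```python
def suggest_language(preferred, current, supported):
    """Return a language code to suggest, or None if no suggestion should be made.

    preferred: the browser's language tags, most preferred first.
    current:   the interface language currently shown.
    supported: the set of language codes the wiki offers.
    """
    for tag in preferred:
        tag = tag.lower()
        base = tag.split("-")[0]
        # Try the full tag first, then its base language (e.g. "en-us" -> "en").
        for candidate in (tag, base):
            if candidate == current:
                return None  # the reader already sees a language they prefer
            if candidate in supported:
                return candidate
    return None
```

For instance, under these assumptions a reader whose browser lists pt-BR first would be offered pt-br on an English-interface wiki, while a reader whose browser lists en-US first would not be prompted.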


See Also:
T44186: Anonymous users can't pick language on wikidata.org
T3135: Interface language on multilingual Wikimedia projects should default to browser language preferences

Details

Reference
bz56464


Event Timeline


Ok, I see that with the new structure this issue is scattered over three sub-departments (Reading, which serves the hundreds of millions of main stakeholders for the issue but has no resources; Editing, which has potential LangEng resources but is blocked; Operations, which is the only sub-department that could move something at this point): adding relevant people.

What does it take to get something moving here? AFAICS, the steps are:

  • Operations (or Infrastructure) picks a [small] production wiki on which to set $wgULSAnonCanChangeLanguage to true and tests it [this should take a few minutes of work];
  • if it doesn't work, Operations (or Infrastructure) removes the wiki from caching/Varnish to confirm that's the issue [this may take hours?];
  • if it is, Operations (or Infrastructure) finds out what should change in Varnish/whatever and fixes it, or determines what else is needed that they can't fix and talks with Language to define the way forward [this would probably take several days];
  • if ULS changes are needed, Editing sets some resources for Language to work on them at some point, potentially helped by Reading to assess impact and therefore priority;
  • following steps defined by the above.
Nemo_bis added a subscriber: Epic. May 3 2015, 8:45 AM
faidon added a subscriber: faidon. May 4 2015, 1:18 PM

Caching. Either the language cookie would be ignored, leading to cache poisoning, or it would bypass the cache. Nobody has tested whether a significant number of users would switch language and cause load issues. Though now with HHVM, it should be less of an issue (cache hits used to be much faster than uncached requests, which include all requests from logged-in users; now the difference should be smaller).

That's certainly not true. Caching isn't only about HHVM/upstream's performance; it's also about beating the speed of light with our caching PoP network. It's frontend performance, not backend performance, that we're more interested in.

faidon added a comment. May 4 2015, 1:21 PM

Ok, I see that with the new structure this issue is scattered over three sub-departments (Reading, which serves the hundreds of millions of main stakeholders for the issue but has no resources; Editing, which has potential LangEng resources but is blocked; Operations, which is the only sub-department that could move something at this point): adding relevant people.
What does it take to get something moving here? AFAICS, the steps are:

This is ULS, which still falls under the (intact) Language team, AFAIK. Right, @Nikerabbit?

We clearly have an opinion about caching etc. and we can/should coordinate on it, but deploying features like that is certainly not something ops would do alone (unless it's an emergency, obviously). This isn't an operations/infrastructure request, in my opinion.

Arrbee added a subscriber: Arrbee. May 6 2015, 12:53 PM
Restricted Application added subscribers: Matanya, Aklapper. Aug 22 2015, 6:27 PM
Nemo_bis renamed this task from "Anonymous users can't pick language on WMF wikis ($wgULSAnonCanChangeLanguage is set to false)" to "Anonymous users can't pick language on WMF wikis with ULS ($wgULSAnonCanChangeLanguage is set to false)". Nov 28 2015, 11:00 PM
555 added a subscriber: 555. Nov 29 2015, 9:15 PM
jayvdb added a subscriber: jayvdb. Nov 29 2015, 11:40 PM

... This isn't an operations/infrastructure request, in my opinion.

Who can/should perform step 1 identified by @Nemo_bis earlier at T58464#1255257? (Step 1 was: set $wgULSAnonCanChangeLanguage to true on a [small] production wiki and test it.)

Change 255953 had a related patch set uploaded (by TTO):
Enable $wgULSAnonCanChangeLanguage on testwiki

https://gerrit.wikimedia.org/r/255953

TTO added a subscriber: TTO. Nov 30 2015, 3:08 AM
Qgil added a comment. Nov 30 2015, 8:46 AM

This task has been referenced in a thread at the wikisource-l mailing list: http://thread.gmane.org/gmane.org.wikimedia.wikisource/2635

santhosh added a subscriber: santhosh. Edited Nov 30 2015, 9:18 AM

The detailed analysis of the feasibility, mainly from the Ops perspective, is at T43451: ULS causes pages to be cached with random user language. That was in 2012. If the infrastructure changes made since then, if any, are favorable, we should reassess.

TTO added a comment. Nov 30 2015, 9:30 AM

Since 2012, we have migrated from Squid to Varnish, so the situation might need to be reevaluated.

ori added a subscriber: ori. Edited Nov 30 2015, 9:40 AM

By design, testwiki responses are never cached in varnish, so it's not a great wiki to target if you're hoping to learn anything about how enabling $wgULSAnonCanChangeLanguage would interact with the caching layer. You could change your patch to target test2wiki instead, but even then I don't expect we would learn much that we don't already know.

@Nikerabbit succinctly explained the crux of the problem in T58464#994780. I'll try to expand on that comment a little.

The WMF's Varnish layer handles the Cookie header in the following way: unless the cookie header sent by the client includes a session token, the cookie header is not treated as significant for the purposes of a cache lookup.

Thus the following request is a cache miss:

$ curl --silent --head --cookie "session=$(date +%s)" "https://de.wikipedia.org/wiki/Wikipedia:Hauptseite" | grep X-Cache
X-Cache: cp1065 miss (0), cp2016 miss (0), cp2016 frontend miss (0)

Whereas this request is not:

$ curl --silent --head --cookie "xyz=$(date +%s)" "https://de.wikipedia.org/wiki/Wikipedia:Hauptseite" | grep X-Cache
X-Cache: cp1065 hit (2), cp2016 hit (2), cp2016 frontend hit (54)

Even though in both cases, the cookie value is novel.

Unless Varnish is configured to treat the language cookie set by ULS as cache-relevant, on every request, one of the following two scenarios will happen: either the page will be served from the cache, in which case its content will be oblivious to the visitor's language preference, or (if the page is not in the cache) it will be cached with HTML specific to the visitor's language preference and then served to subsequent visitors, regardless of whether or not they have the same language preference set. This is what would happen if you simply set $wgULSAnonCanChangeLanguage to true (for any wiki other than testwiki) and took no further action. (Edit: it is what happened; see T43451.)
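
The two scenarios can be reproduced with a toy model. This is a Python sketch, not WMF's actual Varnish configuration: the class, the `render` stand-in and the cookie names `session` and `language` are simplifications assumed for illustration.

```python
def render(url, lang):
    # Stand-in for MediaWiki rendering a page with the interface in `lang`.
    return f"{url} [interface: {lang}]"


class CookieBlindCache:
    """Cache keyed on the URL only; cookies matter only if a session is present."""

    def __init__(self):
        self.store = {}

    def fetch(self, url, cookies):
        lang = cookies.get("language", "en")
        if "session" in cookies:
            # Requests with a session token always go to the backend.
            return render(url, lang), "miss"
        if url in self.store:
            # Served from cache, whatever the visitor's language cookie says.
            return self.store[url], "hit"
        body = render(url, lang)
        self.store[url] = body  # cached with this visitor's language baked in
        return body, "miss"


cache = CookieBlindCache()
first = cache.fetch("/wiki/Foo", {"language": "de"})  # populates the cache
second = cache.fetch("/wiki/Foo", {})                 # no preference set
```

In this model the second visitor gets a cache hit containing the German interface that was rendered for the first visitor, which is the poisoning case described above.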

So, for this feature to work, we would need to configure Varnish to vary the cache based on the language preference. Increasing the number of variants that are cached for each page would increase the demand for memory on the Varnish hosts, causing Varnish to be more aggressive in evicting older cache entries, leading to a decrease in the cache hit-rate and a consequent increase in load on the application servers and in latency. It would represent a significant change to the WMF's site architecture, and it would require rigorous capacity planning, informed by metrics data from production and from experiments which simulate substantial load. Testing this by enabling it on test2wiki would be a bit like testing the capacity factor of the local power station by keeping your night-lamp on all night.
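
The mechanism (more variants per page, same memory, lower hit rate) can be seen in a small LRU simulation. This is only a toy sketch under strong assumptions: uniformly random traffic, equal-sized objects and evenly spread language choices, none of which match production, where traffic is heavily skewed and most anonymous readers would presumably keep the default language. It says nothing about the size of the real effect.

```python
import random
from collections import OrderedDict


def hit_rate(num_pages, num_langs, capacity, num_requests=20000, seed=0):
    """Simulate an LRU cache keyed on (page, language) and return its hit rate."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(num_requests):
        key = (rng.randrange(num_pages), rng.randrange(num_langs))
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / num_requests


single_variant = hit_rate(num_pages=200, num_langs=1, capacity=300)
five_variants = hit_rate(num_pages=200, num_langs=5, capacity=300)
```

With one variant per page everything fits in the cache, so almost every request is a hit; with five variants the same capacity covers only part of the key space and the hit rate drops accordingly.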

Qgil added a comment. Nov 30 2015, 9:57 AM

The description doesn't define whether the ultimate goal is to have this feature enabled in all Wikimedia wikis or only in the multilingual ones.

If the priority is the multilingual ones, we could start there. Somewhere we have data about how many anonymous users visit mediawiki.org, Meta, Commons... The impact on server use of changing this configuration on, e.g., mediawiki.org is probably a lot smaller than on the big wikis. In fact, why not start testing this change on mediawiki.org, being ready to revert if the change is problematic?

ori added a comment. Nov 30 2015, 10:02 AM

I started writing my response before Quim added a link to the thread on wikisource-l, so I did not know the context. It may be feasible to turn this on if it were scoped to wikisource.

Change 255953 abandoned by TTO:
Enable $wgULSAnonCanChangeLanguage on testwiki

Reason:

testwiki responses are not cached in varnish

Is that documented anywhere?

In any case, I don't think this is going to be very useful.

https://gerrit.wikimedia.org/r/255953

So suddenly this is again an operations request? ;)

Increasing the number of variants that are cached for each page would increase the demand for memory on the Varnish hosts

If it were just a matter of RAM, who can decide to buy more? :)

ori added a comment. Edited Nov 30 2015, 10:16 AM

Storing page content and user chrome separately and combining them either on the client (as recently proposed in T106099: RFC: Page composition using service workers and server-side JS fall-back) or on edge caching servers would be ideal, but it is at least two years away from full deployment on all platforms (in my estimation; others may disagree).

It may be feasible to turn this on if it were scoped to wikisource.

What is the difference in Wikisource? What does it take to start enabling language choice in Wikisource?

Restricted Application added a subscriber: JEumerus. Jan 5 2016, 5:25 PM
coren added a subscriber: coren. Jan 5 2016, 5:29 PM

FWIW, excepting commons (for obvious reasons), most wikis where this feature would be desirable/necessary seem to me to be the smaller wikis and it's not immediately clear that the performance impact would be all that catastrophic if the caches did vary on a language selection cookie for them.

Use case for me is the Wikimania2017 wiki, where it is very important that visitors be able to switch to (at least) French for the interface and contents without having to register an account (and - in an ideal world - according to browser language prefs if only to originally set the cookie).

coren awarded a token. Jan 5 2016, 5:30 PM

As a user spending some time translating help content on Wikidata, I would much prefer that new users could actually find the translated content.

It’s also kinda hard to promote Wikidata as a multilingual site, when it’s not so until you register.

Danny_B removed subscribers: Epic, wikibugs-l-list.
Nikerabbit renamed this task from "Anonymous users can't pick language on WMF wikis with ULS ($wgULSAnonCanChangeLanguage is set to false)" to "Allow anonymous users to change interface language on Wikimedia wikis with ULS". Jul 12 2016, 9:17 AM
Nikerabbit edited projects, added Commons; removed Patch-For-Review.
Nikerabbit updated the task description.
Restricted Application added subscribers: Poyekhali, Steinsplitter. Jul 12 2016, 9:17 AM

This task is not blocked by T58292: Make ULS more lightweight. ULS is already deployed to all users; it is just that some features are not enabled for anonymous users.

He7d3r updated the task description. Oct 22 2016, 1:26 PM

Commons has its own custom language selector enabled for anonymous users. It also comes with a banner at the top advising the user to change their language, depending on their browser language:

After you choose a language other than English, ?uselang=foo is automatically added to the URL of any links you click.

(source code: https://commons.wikimedia.org/wiki/MediaWiki:AnonymousI18N.js)

And all page views with ?uselang= are uncached. So, at least for Commons, I think just enabling this (and making pages with the ULS cookie set not cached) would be just fine. This is already the case, I don't think we can do any worse. :)
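
For readers unfamiliar with the uselang approach, the link rewriting boils down to something like the following. This is a Python sketch of the idea only, not a port of AnonymousI18N.js, and the helper name is made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_uselang(url, lang):
    """Return `url` with its uselang query parameter set to `lang`."""
    parts = urlsplit(url)
    # Drop any existing uselang value, keep every other parameter as it is.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", lang))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because every rewritten link carries a query string, each such page view is a distinct, uncached request, which is why the cookie-based approach would be no worse for Commons.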

Steinsplitter added a comment. Edited Mar 26 2017, 8:17 AM

So, at least for Commons, I think just enabling this (and making pages with the ULS cookie set not cached) would be just fine.

Sounds reasonable, then we can remove the ja hack from commons.

The situation at Commons has been mostly discussed on the parent task T5665.

Krinkle removed a subscriber: Krinkle. Mar 27 2017, 10:28 PM
Nemo_bis updated the task description. Mar 28 2017, 6:16 PM
He7d3r updated the task description. Apr 13 2017, 5:22 PM
He7d3r removed a subscriber: wikibugs-l-list.
Restricted Application added a subscriber: PokestarFan. Aug 1 2017, 11:02 PM