Consider increasing the number of previous languages that ULS stores
Closed, Resolved · Public

Description

The current storage of previous languages is limited to 5 languages:

	mw.uls.setPreviousLanguages = function ( previousLanguages ) {
		try {
			localStorage.setItem(
				mw.uls.previousLanguagesStorageKey,
				JSON.stringify( previousLanguages.slice( -5 ) )
			);
		} catch ( e ) {}
	};

Should we increase it? See T123171 for some discussion that prompted this ticket.

Event Timeline

Thanks for creating the ticket.

I think previous choices are the best mechanism for predicting which languages the user is interested in. The rest of the heuristics act as a fallback: they try to guess when we don't know. But if we know the languages the user actually uses, we should take advantage of that.

Do you anticipate any problems with a higher limit?

Nothing severe, except that people will keep seeing a language that they once visited but that no longer interests them. This should be addressed more broadly by fixing T122001 or something like it (this would also fix T96547, for example).

As a reminder, this is what we currently use for compact language links (definition 1):

	// Add user-defined assistant languages on wikis with Translate extension.
	this.filterByAssistantLanguages( languages ),

	// Add previously selected languages.
	// Previous languages are always the better suggestion
	// because the user has explicitly chosen them.
	this.filterByPreviousLanguages( languages ),

	// Add all common languages to the beginning of array.
	// These are the most probable languages predicted by ULS.
	this.getCommonLanguages( languages ),

	// Finally add the whole languages array too.
	// We will remove duplicates and cut down to required size.
	languages

Most other code uses frequent languages (the same as "common" above), which are defined as follows (definition 2):

	mw.config.get( 'wgUserLanguage' ),
	mw.config.get( 'wgContentLanguage' ),
	mw.uls.getBrowserLanguage(),
	mw.uls.getPreviousLanguages(),
	mw.uls.getAcceptLanguageList(),
	$.uls.data.getLanguagesInTerritory( countryCode )

For compact language links we display 7-9 languages in the sidebar according to definition 1. Then we display up to 16 languages at the beginning of the list as "common languages" according to definition 2.
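
The merge described by definition 1 can be sketched roughly as follows. This is only an illustration of "concatenate in priority order, remove duplicates, cut down to size"; the helper name pickLanguages and the sample language codes are made up and are not taken from the ULS source.

```javascript
// Hypothetical helper: merge candidate lists in priority order,
// drop duplicates (first occurrence wins), and cut to `limit` entries.
function pickLanguages( candidateLists, limit ) {
	var seen = {},
		result = [];

	candidateLists.forEach( function ( list ) {
		list.forEach( function ( code ) {
			if ( !Object.prototype.hasOwnProperty.call( seen, code ) ) {
				seen[ code ] = true;
				result.push( code );
			}
		} );
	} );

	return result.slice( 0, limit );
}

// Made-up sample data, listed in the priority order of definition 1.
var allLanguages = [ 'ar', 'de', 'es', 'fi', 'fr', 'he', 'ja', 'ru', 'zh' ];
var compactList = pickLanguages( [
	[ 'fi' ],             // assistant languages
	[ 'he', 'fi' ],       // previously selected languages
	[ 'es', 'fr', 'he' ], // common languages
	allLanguages          // everything else, used as filler
], 7 );
```

Because earlier lists win, a previously selected language always outranks one that was only guessed, which is the behaviour argued for above.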

Then there is the issue that clicking a language in the sidebar does not add it to the frequently used languages; that only happens via the language selector. There is also the issue that the list of frequent languages is not shared across Wikipedias. Because of these issues, the majority of people won't even reach the limit of 5. For power users this is different, of course, and I see no technical issue with raising the limit to some reasonable number, but what should that number be? 7? 10? 15?

One more thing: it seems we are not removing duplicates from the previous languages, so one language can easily take three spots out of the current five.
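
A small simulation of that effect, under the assumption that each selection is simply appended before the list is trimmed. Only the slice( -5 ) comes from the snippet in the description; the append step and the helper name rememberWithoutDedup are assumptions made for illustration.

```javascript
// Assumption: each selection is appended as-is, then the list is trimmed
// the same way as in setPreviousLanguages, i.e. with slice( -5 ).
function rememberWithoutDedup( stored, code ) {
	return stored.concat( code ).slice( -5 );
}

var stored = [];
[ 'fi', 'sv', 'de', 'fr', 'de', 'de' ].forEach( function ( code ) {
	stored = rememberWithoutDedup( stored, code );
} );

// Keep only the first occurrence of each code to count distinct languages.
var distinct = stored.filter( function ( code, index ) {
	return stored.indexOf( code ) === index;
} );
```

In this run 'de' ends up in three of the five slots and 'fi' is pushed out, even though only four different languages were ever selected.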

In case it was not mentioned previously:

  • We should not show repeated results.
  • We should not show more languages to the user than the ones we were showing before this change. We are giving more priority to "previous choices" if they exist, but not increasing the compact list of languages.

I'd propose targeting at least the maximum number of links that we show at a time with compact language links, that is, 9, so that a user can "customise" them if they want to.

Sorry, I was imprecise. The missing duplicate removal applies only to storage; on display we do remove duplicates. I can make a patch or file a task to remove duplicates before saving, so that all the available spots get used.

Thanks for the clarification. So the problem is that if we keep 9 slots, duplicates would leave only a subset of them usable, making the effective total smaller than intended. In that case, I think it makes sense to eliminate duplicates so that all 9 slots are relevant.
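
One possible shape for removing duplicates before saving, as a minimal sketch. The constant MAX_PREVIOUS_LANGUAGES and the function addPreviousLanguage are hypothetical names used for illustration, and this is not necessarily how the merged patch does it.

```javascript
// Hypothetical constant and helper names; not taken from the ULS source.
var MAX_PREVIOUS_LANGUAGES = 9;

function addPreviousLanguage( stored, code ) {
	// Remove any earlier occurrence so each language uses a single slot.
	var withoutCode = stored.filter( function ( existing ) {
		return existing !== code;
	} );

	// Most recent selection goes last; keep only the newest entries.
	return withoutCode.concat( code ).slice( -MAX_PREVIOUS_LANGUAGES );
}

var previous = [];
[ 'fi', 'sv', 'de', 'fr', 'de', 'de' ].forEach( function ( code ) {
	previous = addPreviousLanguage( previous, code );
} );
// previous now holds each selected language once, most recent last.
```

The resulting array could then be handed to mw.uls.setPreviousLanguages for storage, so every stored slot holds a different language.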

Change 289853 had a related patch set uploaded (by Nikerabbit):
Increase the number of stored previously selected languages to 9

https://gerrit.wikimedia.org/r/289853

Change 289853 merged by jenkins-bot:
Increase the number of stored previously selected languages to 9

https://gerrit.wikimedia.org/r/289853

Arrbee moved this task from QA to Done on the Language-Q4-2016-Sprint 3 board.