For the Translation recommendation type, when the user does not provide a seed, we currently rank on pageviews in the source. Let's add an option to rank by the number of different languages the article exists in. Randomly shuffle tied articles on each request.
I'm against using randomness as a tie-breaker when we could just use pageviews to break ties - thoughts?
- Get most popular articles for last N days
- Sort based on number of wikis articles are present in, falling back to some tiebreaker
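As a rough sketch of the two steps above (not the actual recommendation code), the ranking could look something like this. The `Candidate` class, the `rank_by_sitelinks` function, and the `tiebreak` parameter are all hypothetical names made up for illustration, and the candidates are assumed to be already fetched from the most-popular list for the last N days. It covers both tie-breakers under discussion: pageviews and a random shuffle.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one candidate article (illustrative only)."""
    title: str
    pageviews: int  # views in the source wiki over the last N days
    sitelinks: int  # number of wikis the article is present in


def rank_by_sitelinks(candidates, tiebreak="pageviews", rng=None):
    """Sort candidates by sitelink count, highest first.

    tiebreak="pageviews": ties are ordered by pageviews, highest first.
    tiebreak="random":    ties are shuffled on each call.
    """
    if tiebreak == "pageviews":
        return sorted(
            candidates,
            key=lambda c: (c.sitelinks, c.pageviews),
            reverse=True,
        )
    if tiebreak == "random":
        rng = rng or random.Random()
        shuffled = list(candidates)
        rng.shuffle(shuffled)
        # sorted() is stable, so the shuffled order survives among
        # candidates that share the same sitelink count.
        return sorted(shuffled, key=lambda c: c.sitelinks, reverse=True)
    raise ValueError(f"unknown tiebreak: {tiebreak!r}")
```

With the pageviews tie-breaker the output is deterministic for a given input, which makes results easier to compare across requests; the random variant gives tied articles equal exposure over time.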
A few examples to compare:
@schana the results look more useful with rank_method=sitelinks for a few other examples I tried from en to fa.
@Pginer-WMF we're testing ranking relevant articles not by their pageviews but by the number of Wikipedias that have the article. Check https://recommend.wmflabs.org/?s=en&t=de&seed=Agriculture&search=related_articles&rank_method=sitelinks for an example. My question for you is: do you think we should change the "incentive" on each card in GapFinder from "xxk recent views" (where xx is the number of pageviews) to a different metric based on the rank_method we use? For example, should we change it to "xx Wikipedias have this article" in this case?
The motivation information does not need to be the same as the information used for selection, but both alternatives could work well. Some considerations:
- Regardless of the criteria for selecting articles, if we think the number of views is the best motivator, we can keep it. In this case, we may want to show it only for articles that cross a certain threshold of views. That way it could serve as an additional criterion for selecting the articles with the most impact (in terms of views) from among those that are more relevant (based on their presence in different wikis).
- Showing the number of Wikipedias (or languages) that already have the article helps as both a motivator and an explanation on why the articles are suggested.
It would be great to do more research on what motivates people most. From my interaction with translators, I think both pieces of information would be understood as a measure of how "popular" the article is, so I don't expect much difference. It may be good to make the two changes in separate steps, so we can measure the impact of the algorithm change first and the motivator change later.
@schana will this have an impact on what CX is surfacing in Suggestions? If yes, let's wait a couple more weeks to see if we can observe the result of the readmore change.
@Pginer-WMF Do you suggest that we give users of recommend.wmflabs.org an option to switch between different ranking methods? This could be desirable for some editors, as sometimes they are looking for high-pageview articles and sometimes for those that are more likely to be present in other Wikipedias. If we should give them such an option, what is your design recommendation?