
Finalize strategy for automatic generation of top-read articles list
Closed, Resolved (Public)

Description

Now that I have made an initial attempt at pulling an automated list of top-read Indian articles (T271310), we may want to adjust the strategy for better results before implementing the list generation (T271312).

Event Timeline

AMuigai moved this task from Watching to Analyst on the Inuka-Team board.

We now have a clear method for selecting trending articles:

  • Get the most-viewed articles in the country, counting traffic to redirects toward the article they point to.
  • Remove articles where less than 10% or more than 90% of the traffic came from computers (as opposed to mobile devices).
  • For each day, remove any articles that were shown in the previous 7 days.
  • Manually remove a set of bad recommendations (e.g. "Main_Page", "News", "Pornography", "XXX_(film_series)") if they show up.

I've backtested this method from 1 February to 14 March and the results look reasonably good.
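
For reference, the selection steps above can be sketched roughly as follows. This is only an illustration, not the code used for the backtest: the function name `select_trending`, the `list_size` parameter, and the input shape (one dict per day mapping title to desktop and mobile view counts, with redirect traffic already folded into the target title) are all assumptions made for the example. The blocklist holds just the four examples named above.

```python
from collections import deque

# The four bad recommendations named in the task; a real list would likely be longer.
BLOCKLIST = {"Main_Page", "News", "Pornography", "XXX_(film_series)"}


def select_trending(daily_views, list_size=5, window_days=7,
                    min_desktop_share=0.10, max_desktop_share=0.90):
    """Return one list of article titles per day.

    daily_views is a list ordered by day. Each entry is a dict mapping an
    article title to a (desktop_views, mobile_views) tuple. Redirect traffic
    is assumed to be merged into the target title before this function runs.
    """
    recent = deque(maxlen=window_days)  # lists shown on the previous days
    results = []
    for day in daily_views:
        already_shown = set().union(*recent)
        candidates = []
        for title, (desktop, mobile) in day.items():
            total = desktop + mobile
            if total == 0 or title in BLOCKLIST or title in already_shown:
                continue
            desktop_share = desktop / total
            # Drop articles whose traffic is almost entirely one platform.
            if desktop_share < min_desktop_share or desktop_share > max_desktop_share:
                continue
            candidates.append((total, title))
        # Most-viewed first; ties broken alphabetically for stable output.
        candidates.sort(key=lambda c: (-c[0], c[1]))
        picked = [title for _, title in candidates[:list_size]]
        recent.append(picked)
        results.append(picked)
    return results
```

Only articles that actually appeared in a previous day's list count as "shown"; an article that was filtered out or fell below the cut-off stays eligible for later days.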