It seems older articles are being dropped from the trending results, presumably because of the purging strategy.
We should take a day's dump of events from Wikipedia and work out locally which notable articles are being lost during purging.
Currently, when the list of articles exceeds max_pages, we purge the older articles.
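
A minimal sketch of how the local replay could work. It assumes events are (timestamp, title) pairs in chronological order and that "older" means first seen earliest; neither detail is confirmed here, and `replay` is a hypothetical helper, not the service's actual code. It reports pages that kept receiving edits after they were purged, which are the candidates for "lost" trending articles.

```python
from collections import OrderedDict


def replay(events, max_pages):
    """Replay edit events and count edits that arrive after a page's first purge.

    events: iterable of (timestamp, title), assumed sorted by timestamp.
    Returns a dict mapping title -> number of edits seen after its first purge.
    """
    pages = OrderedDict()  # title -> edit count, kept in first-seen order
    purged = set()         # titles that have been purged at least once
    lost = {}              # title -> edits received after the first purge

    for _timestamp, title in events:
        if title in purged:
            lost[title] = lost.get(title, 0) + 1
        # A new title is appended at the end; an existing one keeps its position.
        pages[title] = pages.get(title, 0) + 1
        # Assumed purge rule: drop the oldest (first-seen) pages while over the limit.
        while len(pages) > max_pages:
            old_title, _count = pages.popitem(last=False)
            purged.add(old_title)
    return lost


events = [(1, "A"), (2, "B"), (3, "C"), (4, "A"), (5, "A")]
print(replay(events, max_pages=2))
```

Running the replay with several values of max_pages over the same dump would show how quickly the set of lost pages shrinks as the limit grows.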
Open questions
- Are certain trending pages getting purged? (Use the result of T159967 to debug)
- How many pages are typically tracked over the period of a day, i.e. what max_pages would be needed to avoid purging?
- What is an acceptable/performant value of max_pages?
- What strategies could we use to ensure max_pages is rarely exceeded? Can we purge under other criteria?
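
As one possible answer to the last question, here is a sketch of purging by activity instead of age. It is an illustration only: the `(edit_count, last_edit_ts)` record and the `purge_least_active` name are assumptions, not existing fields or functions. Pages with the fewest edits are purged first, and ties are broken by purging the least recently edited page.

```python
def purge_least_active(pages, max_pages):
    """Keep the max_pages most active pages.

    pages: dict mapping title -> (edit_count, last_edit_ts) (hypothetical shape).
    Returns (kept, purged): kept is a dict, purged is a list of titles.
    """
    if len(pages) <= max_pages:
        return dict(pages), []
    # Tuples compare by edit_count first, then by last_edit_ts,
    # so the most edited and most recently edited pages rank first.
    ranked = sorted(pages, key=lambda title: pages[title], reverse=True)
    kept = {title: pages[title] for title in ranked[:max_pages]}
    return kept, ranked[max_pages:]


pages = {"Old hot": (50, 100), "New quiet": (1, 900), "Mid": (5, 500)}
kept, purged = purge_least_active(pages, max_pages=2)
print(sorted(kept), purged)
```

With an age-based purge, "Old hot" would be the first page dropped despite having the most edits; ranking by activity keeps it. Whether edit count is the right signal is an open question for this task.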