Our breaking news algorithm delivers too many false positives. Based on a quick analysis of the weekly logs, we deliver ~200 results/week, and in my opinion only about 10% of those are accurate. We're taking the backwards approach to cutting them down: rather than tuning what gets included, we exclude results that cannot be breaking news.
ToDo
- Exclude all results from simplewiki
- Create an exclude list of templates. In other words, if an article uses any template on the list, it cannot be breaking news
- Exclude an article if any of its categories/templates includes a death year other than the current year. See here for an example. Please include all other language versions.
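
The three exclusion rules above could be sketched roughly as follows. This is a minimal illustration in Python, not the service's actual code. The `Candidate` shape, its field names, and the placeholder template name are all hypothetical. The regular expression is also an assumption: it only matches the English-style "1987 deaths" naming, so other language versions would need their own patterns.

```python
import re
from dataclasses import dataclass, field
from datetime import date

# Hypothetical exclude list; the real template names are still to be decided.
EXCLUDED_TEMPLATES = {"Example excluded template"}

# Assumed English-style naming such as "1987 deaths"; other language
# versions would need their own patterns.
DEATH_YEAR_RE = re.compile(r"\b(\d{4}) deaths\b", re.IGNORECASE)


@dataclass
class Candidate:
    """Hypothetical shape of one breaking-news candidate."""
    wiki: str
    title: str
    templates: set = field(default_factory=set)
    categories: set = field(default_factory=set)


def is_excluded(candidate, today=None):
    """Return True if the candidate cannot be breaking news."""
    current_year = (today or date.today()).year

    # Rule 1: drop everything from simplewiki.
    if candidate.wiki == "simplewiki":
        return True

    # Rule 2: drop articles that use any template on the exclude list.
    if candidate.templates & EXCLUDED_TEMPLATES:
        return True

    # Rule 3: drop articles whose categories or templates name a death
    # year other than the current one.
    for name in candidate.categories | candidate.templates:
        match = DEATH_YEAR_RE.search(name)
        if match and int(match.group(1)) != current_year:
            return True

    return False
```

Passing `today` explicitly keeps the death-year check testable without depending on the system clock.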
Acceptance criteria
- Create the ignore/exclude list of templates in code so that it is malleable: we can easily add and remove templates as we iterate
- Update internal documentation to reflect the change
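
One possible way to keep the list malleable is to hold it as data rather than logic, so that adding or removing a template is a one-line change. The sketch below, in Python, assumes the list is stored as a JSON array of template names. The function names and the storage format are hypothetical. The normalization (underscores treated as spaces, first letter capitalized) is an assumption about how template titles should be compared.

```python
import json


def normalize(name):
    # Assumption: underscores and spaces are equivalent in template titles,
    # and the first letter is case-insensitive.
    name = name.strip().replace("_", " ")
    return name[:1].upper() + name[1:]


def load_excluded_templates(text):
    """Parse a JSON array of template names into a normalized set."""
    return {normalize(name) for name in json.loads(text)}
```

With this shape, iterating on the list means editing the JSON array, not the filtering code.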
Test Strategy
- Francisco will keep monitoring the weekly logs and request further changes as the results dictate