TL;DR — Using a spike detection model to detect breaking news articles on Wikipedia.
Our objective is to add a Breaking News boolean signal to our API schema as part of our Credibility Signals feature. The signal will reveal in the payload, particularly for users of our Realtime: Stream API, whether editing activity on Wikipedia indicates that entries are being written concurrently with breaking news happening around the world.
Breaking News is a category of anomaly signal. Because of the nature of breaking news and Wikipedia, explained above, the Breaking News (BN) Credibility Signal will grow and change over time. At first, it will operate as a 24-hour boolean for new entries.
Done is
- Dark Launch signed off https://phabricator.wikimedia.org/T343742
- Public demo released https://phabricator.wikimedia.org/T344340
- Marketing work complete
- Community announcements complete
KPIs
- Detection of all "big unforeseen" news in English.
- Lower ingestion latency of new entries for reusers.