Project Page https://github.com/notconfusing/WIGI
Talk Page https://wikimania2015.wikimedia.org/wiki/Submissions/The_Pleasures_and_Pains_of_Analyzing_All_the_Wikis_in_Realtime
Tutorial Abstract
What started as the problem of tracking citations eventually led us to develop a much more general solution: a tool to track all the edits of all wikis in real time. This opens up a new world of possibilities, such as tracking trends in what people are writing about and letting users receive alerts on edits based on custom queries over article and edit content. These ideas are still far off, but we can bring them closer by building the platform together. This introduction is a tutorial on what exists so far for drinking from and filtering the Recent Changes (RC) stream.
Technologies we will cover:
- RCStream and WebSockets.
- Wikimedia Labs.
- MediaWiki diff API.
- Wikitext parsing.
- Stream rebroadcasting.
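As a rough illustration of how the filtering and diff steps might fit together, the sketch below filters recent-change events and builds a request URL for the MediaWiki `action=compare` API. It is not taken from the WIGI code. It assumes each event is a dictionary with the fields `type`, `bot`, `namespace`, `wiki`, `server_name`, and `revision` (with `old` and `new` revision IDs), and that the wiki serves its API at `/w/api.php`. The function names and the sample event are hypothetical.

```python
from urllib.parse import urlencode


def is_article_edit(event, wiki=None):
    """Return True for non-bot edits to main-namespace pages.

    If `wiki` is given (e.g. "enwiki"), only events from that wiki match.
    """
    if event.get("type") != "edit":
        return False
    if event.get("bot") or event.get("namespace") != 0:
        return False
    return wiki is None or event.get("wiki") == wiki


def compare_url(event):
    """Build an action=compare URL for the two revisions of an edit event.

    Assumes the wiki exposes its API at /w/api.php.
    """
    revision = event["revision"]
    params = {
        "action": "compare",
        "fromrev": revision["old"],
        "torev": revision["new"],
        "format": "json",
    }
    return "https://{}/w/api.php?{}".format(event["server_name"], urlencode(params))


# Hypothetical event, shaped like a recent-change message.
sample_event = {
    "type": "edit",
    "bot": False,
    "namespace": 0,
    "wiki": "enwiki",
    "server_name": "en.wikipedia.org",
    "title": "Wikimania",
    "revision": {"old": 100, "new": 101},
}

if is_article_edit(sample_event, wiki="enwiki"):
    print(compare_url(sample_event))
```

Keeping the filter and the URL builder as plain functions over dictionaries means they can be tested without a live stream connection and reused whichever transport delivers the events.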
We also hope to brainstorm and organize future uses and development of a community platform.
Our Future Uses Brainstorm:
- Using the changes queue directly
- Trend tracking with dynamic topic modeling
- Real-time Wikimedia analytics in the style of social media analytics and search
- Alerts based on stream queries
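One way the alerts idea could work is to treat each stream query as a named predicate over events and report every match. The sketch below is only a brainstorming aid, not an existing feature. It assumes events are dictionaries with `wiki`, `title`, and `comment` fields. The helper names `make_query` and `alerts` and the sample events are hypothetical.

```python
def make_query(wiki=None, title_contains=None, comment_contains=None):
    """Build a predicate that matches events on the given criteria.

    Criteria left as None are ignored; text matches are case-insensitive.
    """
    def matches(event):
        if wiki is not None and event.get("wiki") != wiki:
            return False
        if title_contains is not None:
            if title_contains.lower() not in event.get("title", "").lower():
                return False
        if comment_contains is not None:
            if comment_contains.lower() not in event.get("comment", "").lower():
                return False
        return True
    return matches


def alerts(events, queries):
    """Yield (query_name, page_title) for every event a query matches."""
    for event in events:
        for name, query in queries.items():
            if query(event):
                yield name, event.get("title")


# Hypothetical events and user-defined queries.
sample_events = [
    {"wiki": "enwiki", "title": "Wikimania 2015", "comment": "added citation"},
    {"wiki": "dewiki", "title": "Mexiko-Stadt", "comment": "Tippfehler"},
]
queries = {
    "wikimania": make_query(title_contains="wikimania"),
    "citations": make_query(wiki="enwiki", comment_contains="citation"),
}

for name, title in alerts(sample_events, queries):
    print(name, title)
```

Because `alerts` is a generator, it can consume an unbounded stream of events and emit matches as they arrive rather than waiting for the stream to end.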