In T185319 the Analytics team took ownership of future development of IRCRecentChanges. This task tracks the work to be done.
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | Nuria | T185319 IRC RecentChanges feed: code stewardship request |
| Duplicate | | None | T232483 Port IRCRecentChanges to Kafka |
Event Timeline
Yesterday I had a chat with @faidon about this project and this is what I gathered:
- we currently run a patched ircd daemon on kraz (role::mw_rc_irc in site.pp) that serves irc.wikimedia.org
- bots join channels (#enwiki, etc.) and listen for updates posted by rc-pmtpa (the recent changes feed), then act on them. There are ~264 clients using irc.wikimedia.org.
Outstanding issues:
- kraz runs Debian Jessie, so it needs to be upgraded to Stretch/Buster in the coming months, before Jessie's LTS support ends.
- maintaining a patched ircd daemon is cumbersome and not scalable
Proposal from Faidon:
- write a custom stateless daemon to run on Kubernetes based on https://gist.github.com/paravoid/3419e0b5ae1f24b6ea21906a142f2f47
- the daemon should be stateless: instances should not share any state with each other
- the daemon should offer a "sandbox" to each client/bot that joins, i.e. a "private"-like IRC channel in which only rc-pmtpa writes updates. This way, running the daemon on multiple pods in Kubernetes wouldn't require sharing state (like the list of connected clients, etc.)
- the daemon should pull recent changes from Event Streams and feed them to every channel/client.
The above is not mandatory but only a suggestion about how to proceed :)
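To make the last two points a bit more concrete, here is a minimal sketch of the per-event step, assuming the daemon reads Event Streams as server-sent events and that each event carries `wiki`, `title`, `type`, `user` and `comment` fields. The function names, the `#<wiki>` channel naming and the message layout are hypothetical, chosen only for illustration; this is not the format the current ircd emits and not what the linked gist does.

```python
import json
from typing import Optional

FEED_NICK = "rc-pmtpa"   # nick that writes updates, as named in this task
MAX_IRC_LINE = 512       # IRC limit in bytes, including the trailing CRLF


def parse_sse_data(line: str) -> Optional[dict]:
    """Return the JSON payload of an SSE 'data:' line, or None for any other line."""
    if not line.startswith("data:"):
        return None
    return json.loads(line[len("data:"):].strip())


def to_irc_line(event: dict) -> bytes:
    """Format one recent-change event as a PRIVMSG from the feed nick.

    Uses only the event itself, so any pod can serve any client
    without sharing state with other pods.
    """
    channel = "#" + event["wiki"]  # assumed naming, e.g. "#enwiki"
    text = "[[{}]] {} * {} * {}".format(
        event.get("title", ""),
        event.get("type", ""),
        event.get("user", ""),
        event.get("comment", ""),
    )
    # Strip line breaks so user-supplied text cannot inject extra IRC commands.
    text = text.replace("\r", " ").replace("\n", " ")
    prefix = f":{FEED_NICK} PRIVMSG {channel} :".encode("utf-8")
    room = MAX_IRC_LINE - len(prefix) - 2  # leave space for CRLF
    # Truncate on a byte budget, dropping any partially cut UTF-8 character.
    body = text.encode("utf-8")[:room].decode("utf-8", errors="ignore").encode("utf-8")
    return prefix + body + b"\r\n"
```

In a real daemon each client connection would run its own read loop over the stream and write the result of `to_irc_line` to its socket, which is what would keep the pods independent of each other.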
As a first step, we (Analytics) are going to create a quick design doc / one-pager in T234234 about what the architecture should look like.