
Wikibugs does not rejoin channels automatically following a BNC restart
Open, High | Public | BUG REPORT

Description

[14:59:24] <taavi>	 !log tools.wikibugs $ toolforge jobs restart irc # to get it to join channels after the bouncer pod was moved to a different node
[14:59:26] <stashbot>	 Logged the message at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools.wikibugs/SAL
[14:59:49] <taavi>	 ^ bd808: seems like wikibugs's auto channel joining logic gets broken after a znc restart

The ZNC container recreates its config on startup, so there is currently no way to use the stickychan module to manage the rejoins on the ZNC side.

Event Timeline

bd808 moved this task from Backlog to Ready to Go on the Wikibugs board.

Not sure what the fix is here yet, but having the bot drop offline any time the ZNC pod gets rescheduled is not good.

If the python client can detect the BNC restart, a reasonable solution might be clearing self.joined_channels on that event. The current message send logic short circuits when the target channel is in the joined_channels list and would otherwise await a join command before proceeding.

If the python client can detect the rejected message send ([2026-03-20 01:14:38.903176] (wikibugs/libera) IRC -> ZNC [:tantalum.libera.chat 404 wikibugs #wikimedia-dev :Cannot send to nick/channel]) one reasonable solution would be a trap on that error that does a channel join and then resends.

> If the python client can detect the BNC restart, a reasonable solution might be clearing self.joined_channels on that event. The current message send logic short circuits when the target channel is in the joined_channels list and would otherwise await a join command before proceeding.

According to the examples at https://github.com/gawel/irc3/blob/main/examples/mybot.py, this seems to be doable in the server_ready (or even connection_lost?) methods.
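A minimal sketch of that idea, not the actual wikibugs code. The hook names server_ready and connection_lost are taken from the irc3 example linked above; FakeBot, ChannelJoiner, and the async join/privmsg methods are hypothetical stand-ins so the sketch runs without irc3 installed. It also assumes that a ZNC restart shows up to the client as a lost connection, which would need to be verified.

```python
import asyncio


class FakeBot:
    """Hypothetical stand-in for the IRC client; records what it is asked to send."""

    def __init__(self):
        self.sent = []

    async def join(self, channel):
        self.sent.append(f"JOIN {channel}")

    async def privmsg(self, channel, text):
        self.sent.append(f"PRIVMSG {channel} :{text}")


class ChannelJoiner:
    """Hypothetical plugin-like class; a real irc3 plugin would be registered with the bot."""

    def __init__(self, bot):
        self.bot = bot
        self.joined_channels = set()

    def connection_lost(self):
        # Assumption: the bouncer going away surfaces here, so cached
        # channel membership can no longer be trusted.
        self.joined_channels.clear()

    def server_ready(self):
        # Clearing again on (re)connect covers the case where the
        # connection_lost hook did not fire.
        self.joined_channels.clear()

    async def send(self, channel, text):
        # Same short circuit as described in the task: only JOIN when
        # the channel is not already recorded as joined.
        if channel not in self.joined_channels:
            await self.bot.join(channel)
            self.joined_channels.add(channel)
        await self.bot.privmsg(channel, text)
```

With this shape, the first send after a reconnect pays for one extra JOIN per channel, and later sends keep the short circuit.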

> If the python client can detect the rejected message send ([2026-03-20 01:14:38.903176] (wikibugs/libera) IRC -> ZNC [:tantalum.libera.chat 404 wikibugs #wikimedia-dev :Cannot send to nick/channel]) one reasonable solution would be a trap on that error that does a channel join and then resends.

Detecting the ERR_CANNOTSENDTOCHAN numeric is trivial, but the plumbing required to access the specific message text in the handler would not be. Also note that the numeric can be triggered for reasons other than the bot not being in the channel, such as the channel being moderated.
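To illustrate the plumbing problem, here is a rough sketch that parses the 404 line from the log above with a hand-written regex. The numeric names the channel but not the rejected text, so a resend needs some buffer of recently sent messages. CANNOT_SEND, rejected_target, and ResendBuffer are hypothetical names for illustration, not irc3 APIs, and keeping only the last message per channel is a deliberate simplification.

```python
import re

# Matches ":<server> 404 <nick> <target> :<reason>" (ERR_CANNOTSENDTOCHAN).
CANNOT_SEND = re.compile(
    r"^:(?P<server>\S+) 404 (?P<nick>\S+) (?P<target>\S+) :(?P<reason>.*)$"
)


def rejected_target(line):
    """Return the channel named in a 404 reply, or None for any other line."""
    match = CANNOT_SEND.match(line)
    return match.group("target") if match else None


class ResendBuffer:
    """Remembers the last message sent per channel so a 404 handler can retry it."""

    def __init__(self):
        self.last = {}

    def record(self, channel, text):
        self.last[channel] = text

    def pop_for(self, line):
        """Return (channel, text) to resend for a 404 line, or None."""
        target = rejected_target(line)
        if target is None:
            return None
        # pop() so the same message is retried at most once; a moderated
        # channel would otherwise cause an endless join/resend loop.
        text = self.last.pop(target, None)
        return None if text is None else (target, text)
```

A handler built on this would join the returned channel and resend the text once. It cannot tell a moderated channel from a missing join, which is the caveat raised above.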

> > If the python client can detect the BNC restart, a reasonable solution might be clearing self.joined_channels on that event. The current message send logic short circuits when the target channel is in the joined_channels list and would otherwise await a join command before proceeding.
>
> According to the examples at https://github.com/gawel/irc3/blob/main/examples/mybot.py, this seems to be doable in the server_ready (or even connection_lost?) methods.

I wonder actually if with the bouncer in place it wouldn't just be simplest to always send the JOIN. I would expect the BNC to ack that nearly instantly. The JOIN short circuit may be an optimization that we can get along fine without.
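A sketch of what dropping the short circuit could look like, assuming a redundant JOIN for a channel the bot is already in is harmless and acknowledged quickly by the bouncer (not verified here). RecordingBot and send_with_join are hypothetical names, and the async join/privmsg methods are stand-ins rather than the real client API.

```python
import asyncio


class RecordingBot:
    """Hypothetical stand-in for the IRC client; records what it is asked to send."""

    def __init__(self):
        self.sent = []

    async def join(self, channel):
        self.sent.append(f"JOIN {channel}")

    async def privmsg(self, channel, text):
        self.sent.append(f"PRIVMSG {channel} :{text}")


async def send_with_join(bot, channel, text):
    # No joined_channels cache at all: every message is preceded by a JOIN,
    # so there is no state that can go stale when the bouncer restarts.
    await bot.join(channel)
    await bot.privmsg(channel, text)
```

The trade-off is one extra command per message in exchange for having no membership state to invalidate.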