
wikibugs - throttle output, don't get kicked for flooding
Closed, Resolved (Public)

Description

wikibugs keeps quitting with Excess Flood every once in a while. Throttle output somehow to avoid that.

16:44 -!- wikibugs [tools.wiki@wikimedia/bot/pywikibugs] has quit [Excess Flood]
16:45  * greg-g keeps killing wikibugs today by bulk editing tasks :)
16:51 < mutante> i wonder if we can ask freenode for an exemption so that it is allowed to flood more


16:51 -!- Irssi: Join to #freenode was synced in 1 secs
16:52 < mutante> is there a process to ask for an exemption for a friendly bot, so that it is allowed a bit more flooding before "quit (excess flood)"?

16:53 < CoJaBo> mutante: Have the bot throttle output?
16:54 < mniip> mutante, unfortunately no
16:54 <+jayne> mutante: there's no way change that on a per-connection (much less per-account) basis. freenode's limits are already more generous than most. You need to teach the bot to be better-behaved.

Event Timeline

Dzahn raised the priority of this task to Needs Triage.
Dzahn updated the task description.
Dzahn added a project: Wikibugs.
Dzahn added subscribers: Dzahn, greg.

We could do this by overriding IrcBot.send_line, although I'm not completely sure how to do this in an asyncio-friendly way. Obvious options are:

  • a simple yield from asyncio.sleep(time), or maybe BaseEventLoop.call_at. This might cause issues with ordering and late PONG replies
  • using a queue and getting a value from there every now and then is cleaner. We can even use a priority queue to make sure PONGs have priority. But I'm not immediately sure how to write the part that reads from the queue and hook that into the event loop.
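A minimal sketch of the queue option, in case it helps the discussion. It assumes a plain `send_line` callable is passed in (how this hooks into IrcBot is not shown here), uses modern `async`/`await` rather than `yield from`, and the names `ThrottledSender`, `PRIORITY_PONG` and `PRIORITY_NORMAL` are made up for illustration. The "part that reads from the queue" is just a long-running task on the event loop:

```python
import asyncio
import itertools

PRIORITY_PONG = 0    # hypothetical: lower number is sent first
PRIORITY_NORMAL = 1


class ThrottledSender:
    """Queue outgoing lines and send at most one per `interval` seconds."""

    def __init__(self, send_line, interval=1.0):
        self._send_line = send_line        # the real, unthrottled send function
        self._interval = interval
        self._queue = asyncio.PriorityQueue()
        # Tie-breaker so lines with equal priority keep their FIFO order.
        self._counter = itertools.count()

    def enqueue(self, line):
        """Call this instead of sending directly."""
        priority = PRIORITY_PONG if line.startswith("PONG") else PRIORITY_NORMAL
        self._queue.put_nowait((priority, next(self._counter), line))

    async def run(self):
        """Consumer task: take the most urgent line, send it, then pause."""
        while True:
            _, _, line = await self._queue.get()
            self._send_line(line)
            await asyncio.sleep(self._interval)
```

Create the sender from inside a running event loop and start the consumer once with something like `asyncio.ensure_future(sender.run())`. A PONG that is queued while normal lines are waiting jumps ahead of them, but it can still be delayed by up to one `interval`.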

We could implement the rate limit on the wikibugs.py side and only push something into redis every second? Yay hacks.
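Roughly what that hack could look like, as a sketch only. The `push` callable stands in for whatever wikibugs.py actually uses to write into redis (not shown in this task), and `MinIntervalPusher` is a made-up name. The clock and sleep functions are injectable so the logic can be tested without really waiting:

```python
import time


class MinIntervalPusher:
    """Ensure at least `min_interval` seconds pass between two pushes."""

    def __init__(self, push, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._push = push                # hypothetical: wraps the redis write
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_push = None           # time of the previous push, if any

    def push(self, message):
        if self._last_push is not None:
            elapsed = self._clock() - self._last_push
            wait = self._min_interval - elapsed
            if wait > 0:
                # Too soon after the previous push: block until allowed.
                self._sleep(wait)
        self._push(message)
        self._last_push = self._clock()
```

This only spaces messages out evenly. It does nothing about a longer-term limit, so a big enough batch can still add up.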

Change 238792 had a related patch set uploaded (by Legoktm):
Wait at least 1 second before pushing into redis

https://gerrit.wikimedia.org/r/238792

Change 238792 merged by jenkins-bot:
Wait at least 1 second before pushing into redis

https://gerrit.wikimedia.org/r/238792

So now someone needs to try a batch edit and we'll see if it floods off? :D

On Sept 17 (PDT):
[06:55:14] <-- wikibugs (tools.wiki@wikimedia/bot/pywikibugs) has quit (Excess Flood)

:|

13:04 < wikibugs> Release-Engineering-Team, User-greg: Publish WMF code-hosting exception policy - https://phabricator.wikimedia.org/T109919#1849975 (greg)
13:04 < wikibugs> Release-Engineering-Team, User-greg: Write draft/strawman code-hosting exception guideline - https://phabricator.wikimedia.org/T109920#1849974 (greg)
13:04 < wikibugs> Release-Engineering-Team, User-greg, WorkType-NewFunctionality: Tag some portion of RelEng team tasks with "New" or "Maint" - https://phabricator.wikimedia.org/T109375#1849977 (greg)
13:04 -!- wikibugs [tools.wiki@wikimedia/bot/pywikibugs] has quit [Excess Flood]
13:06 <+greg-g> heh, sorry wikibugs

That was for 19 tasks being edited, btw.

greg triaged this task as Medium priority. Dec 3 2015, 9:26 PM

Worth bumping it up to two seconds and trying that out? Flooded off quite a few times today

Change 263931 had a related patch set uploaded (by Samtar):
Redis delay to 2 seconds

https://gerrit.wikimedia.org/r/263931

In T112032#1932995, @Samtar wrote:

Flooded off quite a few times today

My bad :) T123302

Change 263931 merged by jenkins-bot:
Bump redis delay to 2 seconds to avoid flooding

https://gerrit.wikimedia.org/r/263931

I think it's working - wikibugs is going mad on -dev at the moment but the rate looks good and it's not being flooded off :D

@Samtar looks like you fixed it indeed. I saw mass edits that would have always kicked the bot in the past.. but it stayed online. very cool :)

Dzahn claimed this task.

i'll say yes.. we can always reopen it if needed

[Meme: tech barnstar, for samtar]

T119829:

17:59 -!- wikibugs [tools.wiki@wikimedia/bot/pywikibugs] has quit [Excess Flood]

He left after a big job...

I think freenode probably has two sets of limits, e.g. max x messages per 10 seconds, and max 5x messages per 5 minutes. Your bulk change might cause it to hit the latter.

@Luke081515 I think it's currently working the best it's going to - the current redis delay prevents all but absolutely massive jobs from causing freenode to kick wikibugs. Perhaps the issue is now what wikibugs actually notifies the channel about?

I think we can solve the actual issue: figure out how many messages are allowed in 5 minutes, then count how many messages were sent in the last few minutes. The problem is not what wikibugs sends. This happened because Herald was active (for example, if you remove a CC), so you cannot solve it by changing what wikibugs shows.
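To make the counting idea concrete, here is a rough sketch of a sliding-window limiter that tracks send times and checks them against several limits at once (a short window and a long one, as guessed above). freenode's real limits are not known in this task, so any numbers passed in are placeholders, and `SlidingWindowLimiter` is a hypothetical name:

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Track send times and report how long to wait before the next send."""

    def __init__(self, limits, clock=time.monotonic):
        # limits: list of (max_messages, window_seconds) pairs (placeholders).
        self._limits = list(limits)
        self._clock = clock
        self._sent = deque()             # timestamps of recent sends, oldest first
        self._longest = max(window for _, window in self._limits)

    def delay_needed(self):
        """Seconds to wait so that sending stays within every limit."""
        now = self._clock()
        # Forget sends that are older than the longest window.
        while self._sent and now - self._sent[0] >= self._longest:
            self._sent.popleft()
        delay = 0.0
        for max_messages, window in self._limits:
            recent = [t for t in self._sent if now - t < window]
            if len(recent) >= max_messages:
                # Room opens up once this send falls out of the window.
                blocking = recent[-max_messages]
                delay = max(delay, blocking + window - now)
        return delay

    def record(self):
        """Call right after a message has actually been sent."""
        self._sent.append(self._clock())
```

The sender would call `delay_needed()` before each message, sleep for that long if it is non-zero, send, then call `record()`. The limits would still have to be found by experiment or by asking the network.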

This is still an issue nowadays. wikibugs just got kicked again for flooding after a mass edit on Phabricator.

See discussion in T237109:

So I think the rate limiting is largely working as intended; being disconnected in times of large numbers of messages is not a huge issue, and actually potentially beneficial (as messages remaining queued before and during the reconnect get dropped -- this prevents the bot from building up a potentially hours-long backlog).

16:36 -!- wikibugs [~wikibugs2@wikimedia/bot/pywikibugs] has quit [Excess Flood]