Libera Chat may throttle bot connections from tools
Closed, ResolvedPublic

Description

Right now we have public IPs assigned to some but not all exec nodes. Freenode accepts bots running on the nodes with public IPs, and rejects most or all bots that connect from the nodes without public IPs.

In the short run we just need public IPs on all exec nodes. In the long run we should get freenode to lift that throttle for us if possible.

Event Timeline

One of the WMFGCs for IRC:

<AlexZ> If someone tells me what IPs or range need a whitelist I can email the folks who deal with that at freenode.

We're going to fix it both ways. I added floating IPs to the remaining 10 exec nodes. I've also emailed ilines@freenode.net to ask for a lift on the connection limit. In most cases, traffic from Labs comes from a single IP: 208.80.155.255

In a perfect world we would ask IRC to lift the throttle from all public Labs IPs as well, but that might be a big ask. The exact set of public IPs assigned only to tools exec nodes is hard to predict.

With most of the IPs in the labs public /25 it's at least possible to determine which underlying instance is the source. I would expect the .255 NAT IP to be hard to get whitelisted.

Andrew changed the task status from Open to Stalled. Dec 20 2016, 3:26 PM
Andrew removed Andrew as the assignee of this task.

No response from freenode

scfc triaged this task as Low priority. Feb 16 2017, 9:30 PM
scfc moved this task from Backlog to Ready to be worked on on the Toolforge board.

Nope, I never heard anything back.

After speaking with a staffer today, there is no issue adding an iline, but the box needs to run an ident daemon so that each individual user with access has a unique identity for themselves or their bots. If staff see refusals they will readily raise the limit for the host, but with a workaround in place they aren't likely to see any.

Luke081515 changed the task status from Stalled to Open. Jul 10 2017, 10:36 PM

We had some chats about this task in irc today after looking at public IP usage generally in Cloud Services related to our Neutron SDN networking plans. It looks like there may be a couple of identd services that are NAT aware:

In theory, running either of these services on our outbound NAT host(s) and all of the grid engine exec nodes would allow an ident request to the NAT'ed ip to find its way to the appropriate grid node to determine the tool account that is actually opening the irc connection.
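For reference, the ident exchange that such a service would have to answer is small. The following is a minimal Python sketch of the RFC 1413 query and reply format only, not of either identd implementation or of any NAT forwarding logic. The function names, the `IdentReply` type, and the example ports are hypothetical, and stripping whitespace around the user ID is a simplification.

```python
from typing import NamedTuple, Optional


class IdentReply(NamedTuple):
    ident_host_port: int     # port on the host running identd (the bot's side)
    querier_port: int        # port on the host asking (the IRC server's side)
    user_id: Optional[str]   # set for USERID replies
    error: Optional[str]     # set for ERROR replies, e.g. "NO-USER"


def build_ident_query(ident_host_port: int, querier_port: int) -> bytes:
    """Format the one-line query that is sent to TCP port 113."""
    return f"{ident_host_port}, {querier_port}\r\n".encode("ascii")


def parse_ident_reply(line: str) -> IdentReply:
    """Parse '<ports> : USERID : <opsys> : <user>' or '<ports> : ERROR : <type>'."""
    fields = [f.strip() for f in line.rstrip("\r\n").split(":", 3)]
    if len(fields) < 3:
        raise ValueError(f"malformed ident reply: {line!r}")
    ident_port, querier_port = (int(p) for p in fields[0].split(","))
    kind = fields[1].upper()
    if kind == "USERID" and len(fields) == 4:
        return IdentReply(ident_port, querier_port, fields[3], None)
    if kind == "ERROR":
        return IdentReply(ident_port, querier_port, None, fields[2])
    raise ValueError(f"unexpected ident reply type: {line!r}")


if __name__ == "__main__":
    print(build_ident_query(40000, 6697))
    print(parse_ident_reply("40000, 6697 : USERID : UNIX : tools.stashbot"))
```

In the NAT case the query arrives at the gateway rather than at the grid node that owns the connection, so the gateway would need to pass the question on to that node and relay its answer.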

I am more familiar with oident but I believe either would be suitable

If connecting clients start showing up without the ~ in the username field, it's working and hopefully we shouldn't see any further connection errors. If so, we can push Freenode again for that iline.

So is this still an ongoing issue for anyone?

Figured it's been a year since the last response here, so I'd give it a poke and see!

We have not had any new complaints that I am aware of, but we do still have the public IPv4 addresses in place on the grid exec nodes from T151704#2827832, which I believe largely solved the problem by spreading the IRC bots run from Toolforge across a larger pool of IPv4 addresses as seen by Freenode. We have not done any work towards the NAT-aware identd service idea, which would in theory let us re-apply for the iline change and remove the public IPv4 usage across the grid. This may be something that we try to work on in the coming months as a preparation step for other changes that we will be making to Toolforge, including migrating the exec nodes to a new network.

T216370: IP address list for grid nodes / Freenode iline request has put a band-aid over this problem for now, but I'm going to work on getting oidentd set up so that a public service runs on the network gateway nodes that handle our public IPs and clients run on all of the Toolforge grid engine nodes. This should make it easier to discuss and adjust iline limits with Freenode staff/admins.

Change 493767 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/puppet@production] wmcs: Add profiles for oidentd proxy and client modes

https://gerrit.wikimedia.org/r/493767

Change 493767 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Bryan Davis):
[operations/puppet@production] wmcs: Add profiles for oidentd proxy and client modes

https://gerrit.wikimedia.org/r/493767

Change 493767 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] wmcs: Add profiles for oidentd proxy and client modes

https://gerrit.wikimedia.org/r/493767

@Az1568 we have deployed oidentd in proxy mode for the Toolforge job grid nodes. Can you check and see if Freenode can properly get ident lookup responses now?

@charitwo can you check with Freenode staff to see if they are getting proper ident responses from Toolforge irc bots now? They should be seeing something like tools.stashbot as the response for connections from the ~stashbot@wikimedia/bot/stashbot userhost for example.

1559347684 00:08:04 [card] -!- stashbot [~stashbot@wikimedia/bot/stashbot]

the ~ means no response

I thought the ~ just meant that the response does not match the registered account name. Hmm... I'll see what debugging I can do to find out where things are breaking down.

the IRC account name on freenode doesn't matter, the ssh user is what must match that field
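The tilde check discussed above can be sketched in a few lines of Python, assuming the convention stated in this thread that a leading ~ in the username field marks a connection for which the server got no ident response. The function name `ident_answered` is hypothetical, and the second sample value below is only an illustration of what a successful lookup might look like.

```python
def ident_answered(userhost: str) -> bool:
    """Return True if the username part of 'user@host' has no leading '~'."""
    user, sep, _host = userhost.partition("@")
    if not sep or not user:
        raise ValueError(f"not a user@host string: {userhost!r}")
    return not user.startswith("~")


if __name__ == "__main__":
    for sample in (
        "~stashbot@wikimedia/bot/stashbot",       # as seen in the whois line above
        "tools.stashbot@wikimedia/bot/stashbot",  # illustrative only
    ):
        print(sample, ident_answered(sample))
```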

aborrero moved this task from Doing to Soon! on the cloud-services-team (Kanban) board.
aborrero raised the priority of this task from Low to Medium. Feb 11 2021, 5:36 PM

A bit too early to ask, however: Does that still happen on libera.chat? :P

I'm sure it will. The limitations are based on abuse prevention on the IRC network side and not really specific to the network per se. I think that T278584: Promote use of SASL for Cloud VPS/Toolforge hosted Libera.chat / Freenode IRC bots is really our only way forward that is going to work well.
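For bot maintainers who have not used SASL before, here is a rough Python sketch of what a SASL PLAIN login sends during IRC connection registration. It only builds the client-side lines. A real client has to wait for the server's replies between them (the capability acknowledgement, the `AUTHENTICATE +` prompt, and the success numeric), and has to split payloads of 400 bytes or more into chunks. The function names, account name, and password are hypothetical.

```python
import base64
from typing import List


def sasl_plain_payload(account: str, password: str) -> str:
    """Base64 of 'authzid NUL authcid NUL password', using the account for both IDs."""
    raw = f"{account}\0{account}\0{password}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def sasl_plain_lines(account: str, password: str) -> List[str]:
    """Client lines for a SASL PLAIN login, in the order they are sent.

    This sketch does not handle server replies or long-payload chunking.
    """
    return [
        "CAP REQ :sasl",
        "AUTHENTICATE PLAIN",
        f"AUTHENTICATE {sasl_plain_payload(account, password)}",
        "CAP END",
    ]


if __name__ == "__main__":
    for line in sasl_plain_lines("stashbot", "not-a-real-password"):
        print(line)
```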

Krinkle renamed this task from Freenode sometimes throttles bot connections from tools to Libera Chat may throttle bot connections from tools. Sep 3 2021, 11:10 PM
dcaro claimed this task.
dcaro subscribed.

I think we can close this in favor of the one @bd808 mentioned (T278584: Promote use of SASL for Cloud VPS/Toolforge hosted Libera.chat / Freenode IRC bots), and continue the work there (no need for two tasks addressing the same issue xd).