
mirror hitting MaxRequestWorkers
Closed, Resolved (Public)

Description

Over the last few weeks we have had sporadic alerts for:

(ProbeDown) firing: (2) Service mirror1001:443 has failed probes  (http_mirrors_wikimedia_org_ip4) -  https://wikitech.wikimedia.org/wiki/Runbook#mirror1001:443 -

In the Apache logs we see the following:

[Fri Nov 10 10:20:29.712676 2023] [mpm_event:error] [pid 3244005:tid 140056102948160] AH00484: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
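
To get a feel for how often the limit is hit, the error log can be scanned for AH00484 entries. The following is a minimal Python sketch, assuming the timestamp layout of the log line above and an English/C locale for day and month names. The function name and the per-hour bucketing are illustrative choices, not something that was run as part of this task.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the leading "[Fri Nov 10 10:20:29.712676 2023]" timestamp of an
# Apache error log line; the fractional seconds are matched but discarded.
TIMESTAMP_RE = re.compile(
    r"^\[(\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2})\.\d+ (\d{4})\]"
)


def count_ah00484_per_hour(lines):
    """Count AH00484 (MaxRequestWorkers reached) events per hour."""
    counts = Counter()
    for line in lines:
        if "AH00484" not in line:
            continue
        match = TIMESTAMP_RE.match(line)
        if match is None:
            continue  # skip lines whose timestamp we cannot parse
        stamp = datetime.strptime(
            f"{match.group(1)} {match.group(2)}", "%a %b %d %H:%M:%S %Y"
        )
        counts[stamp.strftime("%Y-%m-%d %H:00")] += 1
    return counts


sample = [
    "[Fri Nov 10 10:20:29.712676 2023] [mpm_event:error] "
    "[pid 3244005:tid 140056102948160] AH00484: server reached "
    "MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting",
]
for hour, hits in sorted(count_ah00484_per_hour(sample).items()):
    print(hour, hits)
```

In practice the lines would come from the Apache error log file rather than an inline sample.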

Looking at netstat, we can see hundreds of IPs from ByteDance address space, e.g. bytespider-110-249-202-187.crawl.bytedance.com.
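
A rough way to summarise this kind of netstat output is to group the remote addresses by /24. The sketch below assumes the usual six-column `netstat -tn` layout (proto, recv-q, send-q, local, foreign, state). Apart from 110.249.202.187, which is taken from the hostname above, the sample lines are made up, and 192.0.2.10 is a documentation address standing in for the mirror.

```python
import ipaddress
from collections import Counter


def count_remote_prefixes(netstat_lines, v4_prefix_len=24):
    """Count TCP connections per remote IPv4 prefix."""
    counts = Counter()
    for line in netstat_lines:
        fields = line.split()
        # Skip headers and anything that is not a tcp/tcp6 connection row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # The foreign address is "host:port"; split on the last colon.
        host, _, _port = fields[4].rpartition(":")
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            continue
        if addr.version != 4:
            continue  # this sketch only groups IPv4 peers
        network = ipaddress.ip_network(f"{addr}/{v4_prefix_len}", strict=False)
        counts[str(network)] += 1
    return counts


# Hypothetical sample rows in `netstat -tn` format.
sample_netstat = [
    "Proto Recv-Q Send-Q Local Address Foreign Address State",
    "tcp 0 0 192.0.2.10:443 110.249.202.187:51234 ESTABLISHED",
    "tcp 0 0 192.0.2.10:443 110.249.202.5:40022 ESTABLISHED",
    "tcp 0 0 192.0.2.10:443 111.225.148.9:38110 ESTABLISHED",
]
for prefix, connections in count_remote_prefixes(sample_netstat).most_common():
    print(prefix, connections)
```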

As a side note, and possibly a red herring: I notice that IPv6 seems to continue to accept connections, which suggests to me that Apache may have dedicated threads for v6 and v4, with all v4 threads at max but some free threads available for v6?

As a fix we can increase MaxRequestWorkers, but I wonder if we can also add something to ask ByteDance to be a bit nicer. When I checked, they were crawling us from multiple (100+) IPs from at least the following prefixes: 111.225.148.0/24, 111.225.149.0/24, 110.249.201.0/24, 110.249.202.0/24.
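
One common way to ask a crawler to slow down is a robots.txt rule. Whether Bytespider honours `Crawl-delay` is an assumption here, not something this task confirms, and the 10-second value is purely illustrative. The sketch uses Python's standard `urllib.robotparser` to check that a candidate rule parses the way we intend before it is deployed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the delay value is an illustrative guess.
CANDIDATE_ROBOTS_TXT = """\
User-agent: Bytespider
Crawl-delay: 10

User-agent: *
Allow: /
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


print(crawl_delay_for(CANDIDATE_ROBOTS_TXT, "Bytespider"))
print(crawl_delay_for(CANDIDATE_ROBOTS_TXT, "SomeOtherBot"))
```

Note that `urllib.robotparser` matches on the product token, so the check uses "Bytespider" rather than a full user agent string.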

Event Timeline

We also see a bunch of AWS IPs, and at least one random internet thread suggests this could be related.

I sent a mail to spider-feedback@bytedance.com (the address given in their user agent), asking them to throttle.

MoritzMuehlenhoff claimed this task.

We reached out and they fixed their spider; this can be closed.