Both wmcloud.org and toolforge.org are suffering from traffic overload.
- Obviously this is caused by crawlers and bots.
- Unlike a Wikipedia, these domains offer nothing worth exploring for search engines or archives.
Requests whose User-Agent contains any of the following strings should not receive content:
archive.org_bot AwarioBot Amazonbot bingbot Brightbot CCBot ClaudeBot DataForSeoBot DotBot DuckDuckBot Googlebot GPTBot IABot libwww-perl MojeekBot OAI-SearchBot PerplexityBot PetalBot PriEcoBot SemanticScholarBot SemrushBot SeznamBot Thinkbot TelegramBot Twitterbot YandexBot
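
The filter described above could be sketched roughly as follows. This is a minimal illustration only, not the code used by any existing tool: the names `BLOCKED_UA_SUBSTRINGS`, `is_blocked_user_agent` and `status_for` are hypothetical, and case-insensitive matching and the 403 response are assumptions.

```python
# Substrings copied from the list above.
BLOCKED_UA_SUBSTRINGS = (
    "archive.org_bot", "AwarioBot", "Amazonbot", "bingbot", "Brightbot",
    "CCBot", "ClaudeBot", "DataForSeoBot", "DotBot", "DuckDuckBot",
    "Googlebot", "GPTBot", "IABot", "libwww-perl", "MojeekBot",
    "OAI-SearchBot", "PerplexityBot", "PetalBot", "PriEcoBot",
    "SemanticScholarBot", "SemrushBot", "SeznamBot", "Thinkbot",
    "TelegramBot", "Twitterbot", "YandexBot",
)

# Assumption: compare case-insensitively, so lower-case the list once.
_BLOCKED_LOWER = tuple(s.lower() for s in BLOCKED_UA_SUBSTRINGS)


def is_blocked_user_agent(user_agent):
    """Return True if the User-Agent header contains any blocked substring."""
    if not user_agent:
        # Assumption: a missing or empty header is not treated as a listed bot.
        return False
    ua = user_agent.lower()
    return any(s in ua for s in _BLOCKED_LOWER)


def status_for(user_agent):
    """Hypothetical helper: HTTP status a front end might answer with."""
    return 403 if is_blocked_user_agent(user_agent) else 200
```

Applied once at a shared front end, a check of this kind would cover every tool behind it, whereas the same function copied into each tool would have to be maintained separately.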
A thread on the German technical village pump gives more detail.
- The list was collected from a Toolforge log file covering a 24 h period one week ago.
- The tool could no longer answer any query.
- PetalBot (Huawei) in particular caused the overload.
- After filtering as described, the tool responded quickly, faster than ever.
Rather than implementing individual defensive measures in every single tool, wmcloud.org and toolforge.org should maintain a common solution applied to both domains.
- xtools@wmcloud is also suffering from overload.
- In T393487#11024836 it is claimed that “these wikis all have robots.txt files that tell all crawlers to ignore the sites”.
- Well, obviously not. Otherwise those requests would not have appeared in a recent log file.
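
For reference, this is what a robots.txt that tells all crawlers to stay away amounts to, checked with Python's standard `urllib.robotparser`. The robots.txt content and the URL are illustrative assumptions, not copied from any of the sites, and `allowed_by_robots` is a hypothetical helper. Note that the check runs on the crawler's side: robots.txt is advisory, so a bot that never consults it will still show up in the logs.

```python
from urllib.robotparser import RobotFileParser

# Assumed content: a robots.txt that disallows everything for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed_by_robots(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return whether a well-behaved crawler would consider `url` fetchable."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL, for illustration only.
print(allowed_by_robots("PetalBot", "https://example.org/some-tool?query=1"))
```

A crawler that honours the file gets `False` here and stops; nothing on the server side prevents one that ignores the file from fetching the page anyway.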
On the other hand, the IP blocking at BETA should be ended as soon as possible. IP ranges are not a reliable way to distinguish bots from human beings over a period of months.