
Decide whether QBA-bot's proxy blocker is allowed
Open, Needs Triage, Public

Description

QBA-bot runs on ruwiki and ruwikiquote and has inserted 2.2 million ipblocks rows on each wiki. This constitutes a majority of cluster-wide ipblocks rows -- enwiki has 1.3 million blocks, and of those, only 95502 are of IP addresses and ranges. Commons has only 83788 blocks in total.

Google Translate tells me that it is "a bot to block open anonymizing proxies that should not be used to edit Wikimedia Foundation projects." Inserting 2M rows into the database of every wiki and regularly updating those rows seems like an inefficient way to achieve that goal. We could have an extension that maintains a single shared table, and sets appropriate permissions in the style of the TorBlock extension. This assumes that the addresses are indeed open proxies -- the method used by QBA-bot to detect proxies is apparently not disclosed.
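To make the shared-table idea concrete, here is a minimal sketch of a single proxy list that every wiki would consult instead of holding its own copy of the rows. The class name `SharedProxyBlocklist`, its methods, and the example ranges (reserved documentation ranges) are all hypothetical; this is not how TorBlock or any existing extension is implemented, and a real version would live in MediaWiki and use an indexed lookup rather than a linear scan.

```python
import ipaddress


class SharedProxyBlocklist:
    """Hypothetical single list of proxy ranges shared by all wikis."""

    def __init__(self, cidr_ranges):
        # Parse each CIDR string once; IPv4 and IPv6 ranges can be mixed.
        self._networks = [ipaddress.ip_network(cidr) for cidr in cidr_ranges]

    def is_blocked(self, ip):
        # An address matches if it falls inside any listed range.
        # Membership tests across IP versions simply return False.
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self._networks)


# Example ranges are documentation-only networks, not real proxies.
shared_list = SharedProxyBlocklist(
    ["192.0.2.0/24", "198.51.100.0/24", "2001:db8::/32"]
)

# Every wiki would query the same object (or table) instead of its own rows.
for wiki in ("ruwiki", "ruwikiquote"):
    print(wiki, shared_list.is_blocked("192.0.2.15"))
```

The point of the sketch is only that the list is stored once and queried by each wiki, so adding or expiring a proxy touches one place rather than one row per wiki.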

1538 blocks on ruwikiquote are of /16 ranges, representing about 100 million IP addresses. It seems unlikely to me that these ranges are filled with open proxies. Perhaps QBA-bot is blocking unallocated ranges. This seems unnecessary and heavy-handed to me.
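The "about 100 million" figure can be checked with a short calculation. This sketch assumes all 1538 ranges are IPv4 and do not overlap, which the task does not state; the network `10.0.0.0/16` is only an arbitrary example used to obtain the size of a /16.

```python
import ipaddress

# Figure taken from the task description.
NUM_BLOCKS = 1538

# Every IPv4 /16 has the same size; the specific network is arbitrary.
addresses_per_block = ipaddress.ip_network("10.0.0.0/16").num_addresses

# Assumes the blocked ranges do not overlap.
total_addresses = NUM_BLOCKS * addresses_per_block

# Share of the whole 32-bit IPv4 address space.
ipv4_share = total_addresses / 2**32

print(addresses_per_block)  # 65536
print(total_addresses)      # 100794368
print(f"{ipv4_share:.2%}")
```

Under those assumptions the /16 blocks alone cover a little over 2% of the entire IPv4 address space.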

I am suggesting that someone review QBA-bot's proxy blocker and decide whether it should be allowed.

Event Timeline

Hello,

I would gladly shut down the bot once we get global proxy protection that works.

Proxies were being blocked on ruwiki long before my arrival in the project. The current version of the bot was created in 2016 to stop a persistent, project-breaking wave of vandal attacks over proxies, which could not be stopped by any other means. The bot didn't stop the attacks completely (it could not find every proxy), but at least their severity was greatly reduced.

I could provide you with more information about the attacks and the methods used, but not in this open ticket – I don't want to provide a "vandalism how-to" to the general public.

has inserted 2.2 million ipblocks rows on each wiki. This constitutes a majority of cluster-wide ipblocks rows -- enwiki has 1.3 million blocks, and of those, only 95502 are of IP addresses and ranges. Commons has only 83788 blocks in total

The number of open proxies worldwide doesn't correspond to the size of a Wikipedia project.

the method used by QBA-bot to detect proxies is apparently not disclosed

The method is disclosed, but not to the general public. This is why the sysops and bureaucrats of ruwiki, who are aware of how the bot works, have no objections.

P.S.: Saw that you've added the Anti-Harassment group to this ticket. Is my assumption correct, that there was a complaint about the bot? One of our LTAs, who is not able to evade the bot, has now switched tactics and tries to combat the bot by filing complaints everywhere. So I assume this ticket could be the result of his latest complaint, after all the previous ones were rejected. I guess his next move will be a petition to Jimbo Wales...

P.S.: Saw that you've added the Anti-Harassment group to this ticket. Is my assumption correct, that there was a complaint about the bot?

No, there is no complaint. Anti-Harassment is actually an engineering team that maintains the blocking code. So monitoring usage of the IP blocks feature and potentially replacing your bot with a MediaWiki extension are both within their area of responsibility. The team responsible for responding to complaints of harassment is called Trust and Safety.

LSobanski subscribed.

I don't believe there is an explicit action for DBA here so I'm untagging us from this task. Please add us back if there are specific questions we can help with.