AbuseFilter should allow filtering based on the use of a specific OAuth application
Closed, Resolved, Public

Description

It is not practical to expect tool developers to re-implement all of MediaWiki's anti-abuse abilities because that's just a ton of work, and a lot of those abilities require access to private material (e.g. Toolforge tools do not have access to users' IP addresses, so no way to check against IP blocks).

As some tools are repeatedly being abused for vandalism, we intend to expose the OAuth consumer ID as an AbuseFilter variable (oauth_consumer). This will allow wiki admins more flexibility in designing and writing rules depending on the tool in question. Please be careful when writing these kinds of rules, as many tools are intended to be used by newcomers at workshops, hackathons, etc. Blocking a tool outright, while it might stop abuse in the short term, is more likely to be detrimental to your wiki in the long run.
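To make the idea of a narrow, tool-specific rule more concrete, here is a rough Python model of the kind of logic such a filter might express. It is only a sketch: a real rule would be written in AbuseFilter's own rule language, not Python. The consumer ID, the thresholds, the `Edit` and `ConsumerThrottle` names, and the use of 0 for "not made through OAuth" are all assumptions made up for illustration; only `oauth_consumer` comes from this task.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

# Hypothetical values, chosen only for illustration.
TOOL_CONSUMER_ID = 1234   # OAuth consumer ID of the tool being abused
NEWCOMER_EDITCOUNT = 10   # below this, the user is treated as a newcomer
MAX_EDITS = 3             # edits allowed per user within one window
WINDOW_SECONDS = 60       # length of the throttle window


@dataclass
class Edit:
    user: str
    user_editcount: int
    oauth_consumer: int   # assumption: 0 means "not made through OAuth"
    timestamp: float      # seconds


class ConsumerThrottle:
    """Flags rapid edits by newcomers made through one specific consumer."""

    def __init__(self) -> None:
        # Per-user timestamps of recent edits made through the tool.
        self._recent = defaultdict(deque)

    def should_flag(self, edit: Edit) -> bool:
        # Edits made through other tools, or without OAuth, are ignored.
        if edit.oauth_consumer != TOOL_CONSUMER_ID:
            return False
        # Experienced users are left alone.
        if edit.user_editcount >= NEWCOMER_EDITCOUNT:
            return False
        window = self._recent[edit.user]
        # Drop timestamps that have fallen out of the window.
        while window and edit.timestamp - window[0] >= WINDOW_SECONDS:
            window.popleft()
        window.append(edit.timestamp)
        return len(window) > MAX_EDITS
```

The point of the sketch is that the rule combines the consumer ID with other conditions (newcomer status, edit rate) instead of matching on the consumer alone, so ordinary use of the tool at a workshop or hackathon would not be affected.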

Event Timeline

Change 748784 had a related patch set uploaded (by Legoktm; author: Majavah):

[mediawiki/extensions/OAuth@master] Add AbuseFilter variable for used OAuth consumer

https://gerrit.wikimedia.org/r/748784

Change 748784 merged by jenkins-bot:

[mediawiki/extensions/OAuth@master] Add AbuseFilter variable for used OAuth consumer

https://gerrit.wikimedia.org/r/748784

Not grok'ing this exactly, I will just view it in action. We should write good instructions, with all the appropriate caveats on the mediawiki guidance page. Thanks.

I had kept the original description vague because it was kind of BEANsy but this was actually discussed in a public channel so it's probably OK. You can see the logs at https://wm-bot.wmflabs.org/libera_logs/%23wikimedia-cloud/20211220.txt.

The gist is that people are vandalizing through PAWS, which (just like literally every other server-side tool) hides the users' IPs from CheckUser or hard blocks. Adding the AbuseFilter variable will allow writing more specific rules targeting just PAWS vandalism, whether via rate limits or observed patterns that might not be suitable for all edits. @MarcoAurelio might have some other ideas. I do think a bit of experimentation will be necessary to figure out how to best use this.

@Legoktm I wonder if WMCS proxies could just encrypt the user's IP/XFF and push it into a header to be forwarded to MediaWiki, which could then decrypt it and provide accurate CheckUser/block information?

> @Legoktm I wonder if WMCS proxies could just encrypt the user's IP/XFF and push it into a header to be forwarded to MediaWiki, which could then decrypt it and provide accurate CheckUser/block information?

I suppose it could, but that would also require teaching OAuth libraries and/or tools about this new header, and even then it doesn't help if the tool doesn't pass it along or is outside WMCS. I'm also a bit behind on the latest about IP masking, but in general it would be nice if we could start moving away from relying on IPs for abuse prevention...

> I suppose it could, but that would also require teaching OAuth libraries and/or tools about this new header, and even then it doesn't help if the tool doesn't pass it along or is outside WMCS.

In general I think all external tools with an abuse risk should be expected to set XFF headers. WMCS tools are only special in that they should not have access to the request IP, so there would have to be some way to pass it through to MediaWiki without making it available to the tool.

> I'm also a bit behind on the latest about IP masking

It won't change anything for logged-in users, so it won't change anything for OAuth.

> in general it would be nice if we could start moving away from relying on IPs for abuse prevention...

It would be nice, sure, but I don't see any moving-away yet, or even any ideas on what direction to move in; and I'm not sure how much help exposing the OAuth ID to AbuseFilter is going to be.

> @Legoktm I wonder if WMCS proxies could just encrypt the user's IP/XFF and push it into a header to be forwarded to MediaWiki, which could then decrypt it and provide accurate CheckUser/block information?

> I suppose it could, but that would also require teaching OAuth libraries and/or tools about this new header, and even then it doesn't help if the tool doesn't pass it along or is outside WMCS. I'm also a bit behind on the latest about IP masking, but in general it would be nice if we could start moving away from relying on IPs for abuse prevention...

That'd be very helpful and greatly appreciated :). But... as @Tgr mentioned, there aren't any activities in that direction (in fact, during the 6 years of my adminship, the way patrolling and anti-abuse work is done has stayed pretty much the same). So we should at least not give up the only thing we have available ATM.