
Allow blocking open proxies based on X-Forwarded-For header (XFF)
Closed, Resolved, Public

Description

Would it make sense if blocking an IP also blocked edits from open proxies that have the blocked IP in their X-Forwarded-For header, independently of whether the proxy is on the trusted XFF list?

Use case: Open proxies are currently not blocked on a pre-emptive basis at dewiki. There is one banned user who uses random open proxies for attacks, without caring about XFF. On 28/29 April 2010, for example, he used 12 open proxies within 3 1/2 hours, 6 of which were transparent. His real IP range is known. If blocking that range also applied to transparent proxies, we could prevent them from being abused and wouldn't have to block them individually.

The X-Forwarded-For header may be forged, but I don't see how that could be a problem in this scenario. Even if it might be possible to "escape" a range block by forging the XFF header, the proxy would then be blocked as having been abused.


Version: unspecified
Severity: enhancement
URL: https://gerrit.wikimedia.org/r/33971
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=35542
https://bugzilla.wikimedia.org/show_bug.cgi?id=15259
https://bugzilla.wikimedia.org/show_bug.cgi?id=34288
https://bugzilla.wikimedia.org/show_bug.cgi?id=42438

Details

Reference
bz23343

Event Timeline

bzimport raised the priority of this task to High. Nov 21 2014, 11:02 PM
bzimport set Reference to bz23343.

nakor.wp wrote:

This would be useful. Lately at frwiki we have had to block dozens of proxies a day because they were used by a banned user. His IP address range is already blocked, and for most of the proxies he used we could trace the edits back to him using the XFF headers.

If we could block him using the XFF headers, that would prevent a lot of disruptive edits and save the time lost reverting and blocking.

I also agree. We've wasted a great deal of time blocking transparent proxies (even though they provide the XFF information) at frwiki. They don't let anyone edit Wikipedia anonymously, but they can still be used to bypass IP blocks completely.

If they are reputable proxies then perhaps they could be added to the trusted XFF list.

nakor.wp wrote:

Would this prevent blocked IPs from editing through them?

And anyway it still looks like a burden: it would just shift the problem from blocking the proxies at fr.wiki to adding them to the trusted list.

(In reply to comment #4)

Would this prevent blocked IPs from editing through them?

And anyway it still looks like a burden: it would just shift the problem from blocking the proxies at fr.wiki to adding them to the trusted list.

MediaWiki would treat the source IP in the XFF as their actual IP, allowing normal blocks to work on users behind the proxy IP.

nakor.wp wrote:

Where is this list so I can check what is already in? That would partially solve the problem.

But there is still the issue that this is a post-vandalism solution. We need to prevent this from happening in the first place.

The way I understand this bug is that XFF IP addresses should be checked for blocking *irrespective* of the trusted XFF list.

In August 2010 checkusers saw an example of an email spammer who used 19 different IP addresses plus 4 /24 networks and 1 /network full of proxies. But he only had 4 different IP addresses in his XFF headers, all from a single /16 range of the dynamic IP provider.

Many of those proxies were located on dynamic DSL connections of unknown provenance, so marking them as "trusted" would be a recipe for disaster. We don't know whether they allow private IP addresses, we don't know whether they allow a fake XFF to be passed through, etc.

Besides I don't believe that TrustedXFF list could (and should) be edited and deployed so quickly to fight abuse like this.

This feature could possibly be enabled by default - there is little damage if someone fakes their XFF header and happens to use a blocked IP address, so be it.

It could become problematic with RFC1918 addresses - for example, a non-Wikimedia wiki that wishes to block some local users on 192.168.0.0/24 would effectively be blocking all poor RFC1918 souls behind some proxies on the whole Internet.
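One way to picture the RFC1918 concern is to drop private addresses from the XFF chain before any block lookup. The following is only a minimal sketch in Python, not what the MediaWiki patch does; the helper names `is_rfc1918` and `public_xff_hops` are hypothetical, and the sketch assumes every entry in the header is a syntactically valid IP address.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip):
    """Return True if ip (a string) is a private IPv4 address per RFC 1918."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return addr.version == 4 and any(addr in net for net in RFC1918_NETWORKS)


def public_xff_hops(xff_header):
    """Hypothetical helper: list the XFF hops that are not RFC 1918 addresses."""
    hops = [part.strip() for part in xff_header.split(",") if part.strip()]
    return [hop for hop in hops if not is_rfc1918(hop)]


if __name__ == "__main__":
    # XFF chain taken from the checkuser data quoted later in this thread.
    print(public_xff_hops("84.109.73.220, 177.87.193.134, 10.64.0.123"))
```

With such a filter, a local block on 192.168.0.0/24 would only ever match the real source address of a request, never a private address reported by an unrelated proxy elsewhere on the Internet.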

nakor.wp wrote:

We have a VERY disruptive user alone in his IP range, and the IP range is already blocked. He vandalizes fr.wp using transparent proxies. If the block of his IP range also carried through XFF headers, this would save contributors, sysops and checkusers a lot of time.

I wouldn't bother changing priority fields on BZ :)

ayg wrote:

There's no obvious reason we shouldn't use untrusted XFF to block people. Untrusted XFF is usually going to be correct, it's just subject to manipulation by malicious users. As long as the most you can do is manipulate yourself into being blocked, it's not a problem to trust it.

The only catch would be if some broken proxies stick random IP addresses in the XFF header and we happen to block them. I'd be totally unsurprised if an appreciable percentage of proxies stick random IP addresses in the XFF header, but it's unlikely they'd happen to be blocked, so it's not too likely to cause false positives.

This doesn't sound like a hard feature to implement. The error message should be distinct from the regular IP block message and it should provide the XFF string and matched IP address, for debugging purposes. It would make open proxy blocking a lot more effective, since many open proxies do send correct XFF headers AFAIK.

I do support being able to block XFF headers.

Admins & CUs know that many vandals and spammers abuse open proxies to attack our projects. Some of these proxies are transparent, so we know the real range, but since the abusers edit through a proxy, which can be changed any time they wish, blocking the underlying *real* range has no effect.

We can start blocking random proxies but that's mostly wasting our time. There are more than enough open proxies on the web. So the abuse continues.

Providing us with such a feature would be very good and time-saving.

MediaWiki software should perhaps also check edits against an OP list (automatically updated daily) and, if there is a match, prevent the edit.

At the moment there is a spike of spambots utilising XFF, by doubling up on a chain of open proxy servers to attack our sites. So this ability to detect the abuse seems worthwhile to be configured to also prevent the spam.

*** Bug 39980 has been marked as a duplicate of this bug. ***

If I can get confirmation on the desired functionality, I think we can get this implemented.

I think the proposal is that we examine the list of IPs in the XFF header, and if any of them are blocked, we apply that to the current request. Please correct me if I'm wrong about that.
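The check described above can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not the code in the Gerrit change: the function names `parse_xff` and `first_blocked` are hypothetical, blocks are modelled as a plain list of CIDR strings, and unparseable header entries are simply ignored.

```python
import ipaddress


def parse_xff(header):
    """Split an X-Forwarded-For header into IP objects, skipping invalid entries."""
    addresses = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        try:
            addresses.append(ipaddress.ip_address(part))
        except ValueError:
            continue  # forged or malformed entry: ignore it
    return addresses


def first_blocked(source_ip, xff_header, blocked_ranges):
    """Return (ip, range) for the first address in the chain that is blocked.

    The chain is every XFF entry followed by the real source IP.
    Returns None when no address in the chain matches a block.
    """
    networks = [ipaddress.ip_network(r, strict=False) for r in blocked_ranges]
    chain = parse_xff(xff_header) + [ipaddress.ip_address(source_ip)]
    for ip in chain:
        for net in networks:
            if ip.version == net.version and ip in net:
                return str(ip), str(net)
    return None


if __name__ == "__main__":
    # Addresses borrowed from the test cases proposed later in this thread.
    print(first_blocked("80.1.2.3", "90.4.5.6", ["90.4.0.0/16"]))
```

This version stops at the first match; which block should win when several match is the open question raised in the next comment.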

If that's the case, then when we apply the blocks, do we just stop at the first IP address that has a block, and use that block? Or should we look at all blocks that might be applied to the chain, and apply the most restrictive?

Examine the list of IPs - yes.

Regarding the order and the application of the block: I would normally expect that the most specific rule (not necessarily the most restrictive) wins (I am sometimes confused by the current blocking logic, but that's what I think it is now).

Unfortunately that means that things like "ipexempt" should be taken into account - especially to help registered users who are hopelessly stuck behind "bad" proxies.

I got bitten recently by the practice of blocking whole ranges of /16 (for example hosting farms) and getting an exception for some IP address when a whole range is blocked might be tricky.

It is an interesting question what to do if, say, a specific IP address from the XFF header is softblocked, but the IP source address (the one we are normally using) is, say, within a hardblocked range.

Looking at the use case we have seen recently, the feature is meant to avoid chasing many IP source addresses or rangeblocking a large ISP in cases where some specific address is shared in the XFF. But if the IP source address or its range is blocked, I think that block (even if "softer") should prevail, to avoid situations where someone fakes the XFF in order to get around some limitation, for example to avoid a hardblock or an account creation ban and make us enforce only a soft block.

There is an increasing number of XFF-type spambot accounts being created on WMF wikis at this time. Increasingly they are going through South American HTTP-configured services, an example from today being 177.87.193.134. I don't believe that this is simply coincidence; I see it as a planned means of attack.

As the traffic usually comes from another insecure proxy, it is true that this does not give them a more open means of attack. However, these are all serious spambots intent on breaking into our wikis in their fruitless attempts to spam. It is ceaseless; we have already seen previous attacks, and often we have already blocked the IP due to direct attacks. We know that they will be back, our existing defences are not able to prevent this, and this is now a known hole that has had a request for a fix open for 18 months.

I would even suggest that we may wish to adapt the title of this bug to something simple.

BLOCK XFF EDITING BASED ON ABUSED IP ADDRESS IN (GLOBALLY) BLOCKED LISTS

I don't care a tuppence whether it is an open proxy or not, evil/abused source IP is evil.

Nemo said that my lucidity was not completely there. Here is some CU data of some spambots

YLoretta (talk | contribs | block) (Check) (11:31, 15 November 2012) [2]
(Locked)

177.87.193.134   XFF: 84.109.73.220, 177.87.193.134, 10.64.0.123
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)

KeiraCkf (talk | contribs | block) (Check) (03:04, 14 November 2012) [2]
(Locked)

190.221.146.75   XFF: 84.109.73.220, 190.221.146.75, 10.64.0.130
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1

Lisette21 (talk | contribs | block) (Check) (11:52, 13 November 2012) [2]
(Locked)

190.220.8.214   XFF: 84.109.73.220, 76.12.1.222, 190.220.8.214, 10.64.0.127
Mozilla/4.0 <a href=>kill spam</a> (Windows NT 5.1; U; en) Presto/2.10.229 Version/11.60

Note that these spambots span three days, and this is only a snippet of a recent set of attacks from just one server.

I've changed it to high priority as this is definitely an important enhancement. I know several long-term vandals who could be stopped more easily if we had the ability to block open proxies/IPs based on the XFF header.

Patch needs rework, help welcome I guess.

skizzerz wrote:

I plan on releasing a change set addressing the current issues on Monday or Tuesday. I've been busy over the weekend so I did not have time to work on it then. Also marking this as assigned.

*** Bug 42438 has been marked as a duplicate of this bug. ***

I'd like to add (from the above dup) that it would be useful, especially for networks using carrier-grade NAT, to be able to base blocks on both the public-facing IP and the (private) IPs behind it, such as blocking "206.34.7.1/16/xff:10.6.0.0/16" or "206.6.1.8/xff:192.168.2.0/24".

I think Bug 42438 would be good to keep as a separate bug. I'd rather get the basic logic out asap, and then add something like that in later, if it makes sense for the usage that we're seeing.

skizzerz wrote:

The most recent patch set is still awaiting review. If you would like to see this bug fixed and have some knowledge of MediaWiki/PHP, please chip in with your feedback and comments at https://gerrit.wikimedia.org/r/#/c/33971/

I really, really would like to have this reviewed and in place. It would make for a very happy steward. When you get four for four in checks and can do nothing but block the relay, or you have a known vandal at it for months and all we can do is scream mutely, the frustration is palpable. Thanks.

I have a question: how does this integrate with global blocking? From time to time we have issues with global blocks vs. local stuff.

(In reply to comment #29)

I have a question: how does this integrate with global blocking?

I don't think it does...

The current patch is for local blocks only. We would need a patch for GlobalBlocking also to make GlobalBlocking check XFF.

Not global? Oh dear that is going to be a PITA as the abuse via proxy that we are seeing is definitely global. Do we need a separate bugzilla?

Please, no. Sorry for spoiling the party. The patch is local though and the logic needs to be tested (local blocks, global blocks, exempts, etc.)

To everyone wanting to have this implemented quickly:
it would make sense to prepare a set of test cases to check the code against.

Test cases should take into account:

  1. soft- and hardblocking
  2. IPv4 and IPv6 addresses
  3. Single IP and range blocks
  4. Real source IP address and XFF
  5. IP block exemption
  6. Potentially also local and global blocks

Something like,

Case 1. There is a softblock range on "90.4.0.0/16"
Logged-in user "A" from "80.1.2.3" with XFF "90.4.5.6" tries to edit.
Case 2. There is a hardblock range on "90.4.0.0/16"
Logged-in user "A" from "80.1.2.3" with XFF "90.4.5.6" tries to edit.
Case 3. There is a hardblock range on "90.4.0.0/16"
Logged-in user "A" from "80.1.2.3" with XFF "90.4.5.6" tries to edit. The user has "ipblock-exempt"

etc..
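The three cases above could be written down as data and run against a reference decision function. The sketch below is in Python and rests on assumptions that the thread does not spell out: a softblock only stops logged-out editors, a hardblock also stops logged-in users, and "ipblock-exempt" lets a logged-in user bypass any IP block. The `Case` type and `is_blocked` function are hypothetical names, and the expected outcomes follow from those assumptions rather than from a tested MediaWiki build.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    block_range: str   # blocked CIDR range
    hard: bool         # True = hardblock, False = softblock
    source_ip: str     # real source address of the request
    xff: str           # X-Forwarded-For header value
    logged_in: bool
    exempt: bool       # user holds "ipblock-exempt"


def is_blocked(case):
    """Assumed reference logic: does the block stop this edit?"""
    net = ipaddress.ip_network(case.block_range)
    chain = [p.strip() for p in case.xff.split(",") if p.strip()]
    chain.append(case.source_ip)
    hit = any(ipaddress.ip_address(ip) in net for ip in chain)
    if not hit:
        return False
    if case.logged_in and case.exempt:
        return False   # assumption: exemption bypasses IP blocks
    if case.logged_in and not case.hard:
        return False   # assumption: softblocks affect logged-out users only
    return True


CASES = [
    Case("Case 1", "90.4.0.0/16", False, "80.1.2.3", "90.4.5.6", True, False),
    Case("Case 2", "90.4.0.0/16", True, "80.1.2.3", "90.4.5.6", True, False),
    Case("Case 3", "90.4.0.0/16", True, "80.1.2.3", "90.4.5.6", True, True),
]

if __name__ == "__main__":
    for case in CASES:
        print(case.name, "blocked" if is_blocked(case) else "allowed")
```

Extending the `CASES` list along the six dimensions listed above (IPv6, single-IP blocks, global blocks and so on) would give the decision tree asked for here.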

We have also stuff like https://www.mediawiki.org/wiki/Manual:$wgBlockCIDRLimit
and maybe other affecting settings.

I would love to have a pretty complete decision tree to test the implementation against. Maybe such a tree should be published on www.mediawiki.org later.

I'd like to get input about comment #11, and the messaging to the users. If one of the stewards could comment, that would be especially helpful!

Local blocking message is currently: "An IP address present in the X-Forwarded-For header, either yours or that of a proxy server you are using, has been blocked. The original block reason was: $1"

  • Do we need to include the full XFF header too?
  • Do we need to include the IP, or list of IPs, that triggered the block?
  • If more than one IP triggers the block, should we include the block reason?

GlobalBlocking message is currently: "One or more proxy servers that your request used has been blocked on all wikis"

  • Both Alex and Siebrand thought "all wikis" was misleading. Should we use "across wikis in this organization" or just say it's been "globally blocked", despite the meaning that unfamiliar users may put into "globally"?
  • Again, do we need to include the full XFF header? And the IP(s) that triggered the actual block?

quentinv57 wrote:

Dear Chris,

I'm not sure we need to state explicitly that the block was made based on XFF headers. As the header is not hard to change, some long-term vandals could easily bypass this block if they knew its nature. It is the same reasoning as if we blocked people based on their UA...

On the other hand, showing the IP that triggered the block would be useful. Sometimes people report to us that they have been blocked without a valid reason, which is good to know. They couldn't report it usefully without knowing which IP triggered the block.

But this is only my opinion and does not reflect what other stewards can think. I forwarded the issue to the whole team so they can also comment.

Quentinv57

One more question related to reporting as mentioned by Chris and Quentinv57 above - how can we search for XFF blocks? Should XFF blocks appear in the normal list of blocked IP addresses? (Probably yes.) We should probably show only the information needed to locate the address on the list of blocks and its history.

Marcin, if I understand your question right, the actual blocks are just normal local or global IP blocks, so there will not be any special list of them. Also, if an autoblock is created for a user, it will only be their actual IP (which may be the last hop in a series of XFF hops).

Although speaking of autoblocks, we may see a weird issue where a user with an unblocked IP address may get auto-blocked because of an IP in their XFF header. The logging for that may make it difficult to figure out why they were blocked... but I would have to test that to see what it actually looked like.

(In reply to comment #36)

I'd like to get input about comment #11, and the messaging to the users. If one of the stewards could comment, that would be especially helpful!

Local blocking message is currently: "An IP address present in the X-Forwarded-For header, either yours or that of a proxy server you are using, has been blocked. The original block reason was: $1"

  • Do we need to include the full XFF header too?
  • Do we need to include the IP, or list of IPs, that triggered the block?
  • If more than one IP triggers the block, should we include the block reason?

GlobalBlocking message is currently: "One or more proxy servers that your request used has been blocked on all wikis"

  • Both Alex and Siebrand thought "all wikis" was misleading. Should we use "across wikis in this organization" or just say it's been "globally blocked", despite the meaning that unfamiliar users may put into "globally"?
  • Again, do we need to include the full XFF header? And the IP(s) that triggered the actual block?

[another steward opinion]

I thought about it, and imho you should mention that it's an XFF block and you should mention the IP address. Otherwise we don't know which block he's affected by. And if someone uses a proxy on purpose he probably already knows about XFF headers as well, so that shouldn't make a difference.

I'd use solely "globally blocked". And yes, the same goes for the global block as I advised for the local block. We should know what causes the block if someone complains about it. It's our "job" to help people.

Going back to comment #16, and what to do when multiple blocks apply to the same request, I added some logic to gerrit I3e38b94d so that:

  • Blocks whose target exactly matches an IP are preferred over blocks on a range containing it
  • Hardblocks are chosen over softblocks that prevent account creation
  • Softblocks that prevent account creation are chosen over other softblocks
  • Other softblocks are chosen over autoblocks
  • If there are multiple exact or range blocks at the same level, the one closer (fewer hops) to the server is chosen

So for some examples, I have:

  • 50.1.1.1 - softblock
  • 50.2.0.0/16 - softblock
  • 60.2.0.0/16 - softblock with account creation disabled
  • 70.2.0.0/16 - hardblock

XFF: 1.2.3.4, 70.2.1.1, 60.2.1.1, 2.3.4.5 => Hardblock
XFF: 1.2.3.4, 50.2.1.1, 60.2.1.1, 2.3.4.5 => Softblock w/ AC disabled
XFF: 1.2.3.4, 70.2.1.1, 50.1.1.1, 2.3.4.5 => Softblock
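The preference rules and the three example chains above can be reproduced with a small ranking function. This is a Python sketch of the rules as listed in this comment, not the logic in gerrit I3e38b94d itself; the `Block` type, the `kind` labels and `choose_block` are hypothetical names, and the sketch assumes the rightmost XFF entry is the hop closest to the server.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    target: str  # single IP or CIDR range
    kind: str    # "hard", "soft_nocreate", "soft" or "auto"


# Lower number = preferred, following the order given in the comment above.
SEVERITY = {"hard": 0, "soft_nocreate": 1, "soft": 2, "auto": 3}


def choose_block(xff, blocks):
    """Pick the block to apply to a request with the given XFF chain, or None."""
    hops = [ipaddress.ip_address(p.strip()) for p in xff.split(",") if p.strip()]
    candidates = []
    for index, ip in enumerate(hops):
        # Assumption: the last entry in the header is nearest to the server.
        distance = len(hops) - 1 - index
        for block in blocks:
            net = ipaddress.ip_network(block.target, strict=False)
            if ip.version != net.version or ip not in net:
                continue
            exact_rank = 0 if net.num_addresses == 1 else 1
            rank = (exact_rank, SEVERITY[block.kind], distance)
            candidates.append((rank, block))
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[0])[1]


# The example blocks from the comment above.
BLOCKS = [
    Block("50.1.1.1", "soft"),
    Block("50.2.0.0/16", "soft"),
    Block("60.2.0.0/16", "soft_nocreate"),
    Block("70.2.0.0/16", "hard"),
]

if __name__ == "__main__":
    for chain in (
        "1.2.3.4, 70.2.1.1, 60.2.1.1, 2.3.4.5",
        "1.2.3.4, 50.2.1.1, 60.2.1.1, 2.3.4.5",
        "1.2.3.4, 70.2.1.1, 50.1.1.1, 2.3.4.5",
    ):
        print(chain, "=>", choose_block(chain, BLOCKS))
```

Sorting on the tuple (exact match, severity, distance) means an exact-match softblock beats a hardblocked range, which is what the third example shows.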

Does this seem reasonable? I'm happy to update the logic, that just seemed like it would give the strongest block, while still allowing a Steward to target an IP (with an exact match instead of a range) if a single IP in a range block is behaving more appropriately than others.

I just updated the GlobalBlocking piece of this (gerrit Ib9fb31ad) to show the following message when a user is blocked by a global block on an IP in their XFF:
"'''One or more proxy servers used by your request is globally blocked'''

The blocked IP address is: $1

  • Start of block: $2
  • Expiry of block: $3"

It's easy to update if that wording seems inappropriate to anyone.

I propose changing it to being as close as possible to the non-XFF global block:

"'''One or more of your proxy servers has been blocked on all wikis.'''

The block was made by $1 ($2).
The reason given is ''$3''.

  • Start of block: $4
  • Expiry of block: $5

You can contact $1 to discuss the block.
You cannot use the \"{{int:emailuser}}\" feature unless a valid e-mail address is specified in your [[Special:Preferences|account preferences]] and you have not been blocked from using it.
Your current IP address is $6.
Please include all above details in any queries you make."

Does this work?

Comment: and it aligns with [[bugzilla:42231]] which is about global block messages.

Since some of the blocks that stewards place could be longer term, it may be wiser to point people to a place, or to a generic address, rather than to the blocker, remembering that we all leave at some point. Stewards placing blocks may do better to point people to OTRS; if it is a hard block, they may only be able to edit at meta.

For the GlobalBlocking (gerrit 42884) change we have:

"'''One or more proxy servers used by your request is globally blocked'''

The block was made by $1 ($2).
The reason given is ''$3''.

  • Start of block: $4
  • Expiry of block: $5

You can contact $1 to discuss the block. You cannot use the \"{{int:emailuser}}\" feature unless a valid e-mail address is specified in your [[Special:Preferences|account preferences]] and you have not been blocked from using it. The blocked proxy address was $6.
Please include all above details in any queries you make."

For local blocks, the message is exactly the same as for current blocks, except that it mentions the XFF header in the reason, and the "Intended blockee" lists the address of the block, which will be the offending IP in their XFF header.

Should we merge with these messages, and then submit a change for the messages when the work for bug 42231 is finished?

Sounds like a marvellous idea. Getting it into place with a message that says STUFF YOU SPAMMER would have been okay with me, though yours is a lot more professional. :-)

What would need to be done in bugzilla to progress with 42231? I am completely uncertain which would link where, with whom, etc. New bugzilla? Dependencies which way? etc.

Thanks for your work. I am so looking forward to my silly grin when I can slam the door to more of the vandals and spammers that like to prance through the XFF door for evil purposes.

(In reply to comment #45)

Should we merge with these messages, and then submit a change for the messages when the work for bug 42231 is finished?

Are we still waiting for comments on this?
I think merging the messages can wait, for now it's ok to have two separate messages as always. Such a tweak could be done in a fix for bug 15259, for instance (which perhaps is already fixed by the change above, perhaps not).

Ahem. This is now 3 years after the original bug was filed, and it continues to be a major drain on the time and work of stewards and local checkusers throughout the WMF world. This problem is being exploited on a daily basis, because those who should be affected can read this bugzilla just like we can, and they see we're worrying about what the block message should say.

The block message(s) can be tweaked later if necessary. Can we please just get this in place?

(In reply to comment #48)

Ahem. This is now 3 years after the original bug was filed, and it continues to be a major drain on the time and work of stewards and local checkusers throughout the WMF world. This problem is being exploited on a daily basis, because those who should be affected can read this bugzilla just like we can, and they see we're worrying about what the block message should say.

The block message(s) can be tweaked later if necessary. Can we please just get this in place?

+1 - I totally agree with Risker, please activate this change.

Chris' change approved by Aaron. :) Thanks!

The feature is controlled by the new configuration variable $wgGlobalBlockingBlockXFF, true by default.

REOPENING.

This bug is for normal blocking, not for global blocking (which is provided by an extension that is not used on normal standalone wikis).

Normal blocking ( gerrit change 33971 ) is still pending review as of today.

Global blocking ( gerrit change 42884 ) is what has been merged.

(In reply to comment #51)

This bug is for normal blocking, not on Global blocking

As far as I can see it's now for both, but sorry for the confusion I made with the two patches...

(In reply to comment #51)

Normal blocking ( Gerrit change #33971 ) is still pending review as of today.

It was merged but then reverted because some tests were failing.

Gerrit change 56715 is the new one (once a changeset has been merged, even if it is later reverted, it cannot be updated).

Gerrit 56715 is now merged, and will get deployed starting on Monday.

The GlobalBlocking change will also be deployed starting Monday.

This patch adds an entry to RELEASE-NOTES-1.21 yet is not on the REL1_21 branch. Should it be backported, or should the entry go into RELEASE-NOTES-1.22 instead?

(In reply to comment #55)

This patch adds an entry to RELEASE-NOTES-1.21 yet is not on the REL1_21 branch. Should it be backported, or should the entry go into RELEASE-NOTES-1.22 instead?

The latter, I guess. Too bad.

Related URL: https://gerrit.wikimedia.org/r/59620 (Gerrit Change I6a51e3ee07fe7622b9c708c78563795d7a1118fc)

Related URL: https://gerrit.wikimedia.org/r/63141 (Gerrit Change Iba5cfdff2f91434f94e546ead0062ff3d016dba8)