Send a cookie with each block
Closed, Resolved; Public; 8 Story Points

Description

I would like to propose an extension of the autoblocker. The goal is to limit an attacker's
ability to continue their bad behavior by obtaining a new IP address (i.e. by redialing an ISP
or resetting a DSL connection).

Whenever someone logs into a MediaWiki wiki from an account that has been blocked, in addition
to autoblocking their IP address, send their browser a cookie identifying the block in
question. Then every time someone tries to edit anonymously or create a new account, look for
this cookie and if it exists, see whether the block it references is still active and if so
block the new IP as well.
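The proposed flow can be sketched roughly as follows. This is an illustration in Python rather than MediaWiki's PHP, and every name in it (`Block`, `BLOCKS`, `AUTOBLOCKED_IPS`, `COOKIE_NAME`, both functions) is hypothetical. A plain dict stands in for the browser's cookie jar, and the cookie's own lifetime and autoblock expiry are not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Set

COOKIE_NAME = "BlockID"  # hypothetical cookie name


@dataclass
class Block:
    block_id: int
    expiry: datetime  # when the account block ends


# Hypothetical in-memory stand-ins for the wiki's block storage.
BLOCKS: Dict[int, Block] = {}
AUTOBLOCKED_IPS: Set[str] = set()


def on_blocked_login(block: Block, ip: str, cookies: Dict[str, str]) -> None:
    """A blocked account logged in: autoblock the IP and tag the browser."""
    BLOCKS[block.block_id] = block
    AUTOBLOCKED_IPS.add(ip)
    cookies[COOKIE_NAME] = str(block.block_id)


def may_act_anonymously(ip: str, cookies: Dict[str, str], now: datetime) -> bool:
    """Run before an anonymous edit or account creation."""
    raw = cookies.get(COOKIE_NAME)
    if raw is not None:
        block = BLOCKS.get(int(raw)) if raw.isdecimal() else None
        if block is not None and block.expiry > now:
            AUTOBLOCKED_IPS.add(ip)   # block still active: block the new IP too
        else:
            del cookies[COOKIE_NAME]  # stale or invalid cookie: discard it
    return ip not in AUTOBLOCKED_IPS


now = datetime(2017, 1, 1)
jar: Dict[str, str] = {}
on_blocked_login(Block(1, now + timedelta(days=3)), "10.0.0.1", jar)
print(may_act_anonymously("10.0.0.2", jar, now))  # False: same browser, new IP
```

The key point is that the cookie only carries a reference to the block; whether the block is still active is decided server-side on each check.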

Such cookies would need to be temporary, just as autoblocks are (is it 24 hours?),
to avoid catching good people on public computers.

Of course, some black hats would be able to find ways around such cookies (it's not all that
hard), but I would be willing to wager that most of the people engaged in juvenile vandalism
are not all that computer savvy.

Two possible conflicts come to my mind. One, if someone is blocked for having an
inappropriate username, it is not obvious that we would necessarily want their IP blocked from
creating a new account. (How is this handled in the present autoblocker?) Though for most
inappropriate usernames, being forced to sit down for 24 hours might not be a bad thing.

The other possible conflict is when a user has multiple accounts. If a sockpuppet account is
blocked, should that necessarily impose a short block on the other accounts as well? Maybe
so. This could be avoided by only checking for the cookie when someone is acting anonymously,
or we could just decide that a block on one should lead to a short block of all (is this the
effect of the current autoblocker?).

Anyway, anything to slow down returning persistent vandals is in my mind a good thing.

-DF

Details

Reference
bz3233

Tgr added a comment. Oct 28 2016, 9:25 PM

An issue came up during CR that I'm unsure about and would like more opinions on.

Imagine the scenario where

  • user A gets blocked (with the autoblock flag)
  • user A logs in on a public machine (library, school classroom etc); the browser gets tagged
  • later, user B uses the same machine

B is now blocked, which in itself is fine (autoblocks always carry the risk of some collateral damage). But B can find out whom the block was meant for, either by looking up the block message in the list of blocks or by reading the block ID from the cookie and looking that up. B thus learns that A uses the same machine (and, from the cookie expiration time, exactly when) and can potentially deanonymize A.

Is that acceptable? It's not that far from autoblock behavior (where B learns that A uses the same IP) but learning e.g. which class a user goes to seems worse than learning who their ISP is.
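To make the timing concern concrete, here is a minimal Python sketch. It assumes the cookie is set with a fixed, publicly known lifetime, so anyone who can read the cookie's expiration time can subtract that lifetime to recover when the cookie was set. The function name and the 24-hour lifetime are illustrative assumptions, not details of the patch.

```python
from datetime import datetime, timedelta

# Assumption: the cookie lifetime is a fixed, publicly known value (24 h here).
ASSUMED_COOKIE_LIFETIME = timedelta(hours=24)


def infer_login_time(cookie_expires_at: datetime,
                     lifetime: timedelta = ASSUMED_COOKIE_LIFETIME) -> datetime:
    """Recover when the cookie was set, i.e. when the blocked user logged in."""
    return cookie_expires_at - lifetime


# User B inspects the cookie left behind on the shared machine.
expires = datetime(2016, 10, 29, 14, 30)
print(infer_login_time(expires))  # 2016-10-28 14:30:00
```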

@Tgr: I can't think of many situations in which a public machine doesn't have the browser reset between users, but obviously there's more potential to share private information on a public machine. I'm not sure there's any way that we can completely prevent that. If you're letting other people share your browser session, you're also potentially sharing your history, saved passwords, cookies for other sites, etc. We can't be expected to completely protect someone's privacy if they aren't using a private session.

Tgr added a comment. Oct 29 2016, 12:13 AM

Silently recording a user identifier seems worse to me than the browser recording passwords when you explicitly ask it to. Fair point about the history though. Also, I just realized that we record the username in a cookie straight away and show it at the next login. That should probably be fixed (filed T149465), but the fact that no one has complained about it so far shows that you are right about the privacy expectations.

A couple of questions:

  1. @Tgr suggested that the cookie expiry should never be longer than 1 day (or, I'm assuming, whatever the autoblock expiry is; it's 1 day on WMF sites). Alternatively, the cookie could expire whenever the block does, except for infinite blocks, whose cookies would expire at the autoblock-expiry time. Is there consensus about this? My reading of it until now was to go with the longer cookie expiry.
  2. If a user gets blocked and a cookie is set, then they log out, and then another user comes along to that computer and tries to log in, they'll also be blocked (that's the point; in most cases this would be the same person). That means there won't be a situation where the second user is independently blocked and we want to record a second block ID on that computer. Or would there? (Or, equally, a situation where we worry about overwriting the existing cookie value?)

@Tgr suggested that the cookie expiry should never be longer than 1 day (or, I'm assuming, whatever the autoblock expiry is; it's 1 day on WMF sites). Alternatively, the cookie could expire whenever the block does, except for infinite blocks, whose cookies would expire at the autoblock-expiry time. Is there consensus about this? My reading of it until now was to go with the longer cookie expiry.

Personally, I think either system is fine. I'd love to hear other people's thoughts on it, though.

If a user gets blocked and a cookie is set, then they log out, and then another user comes along to that computer and tries to log in, they'll also be blocked (that's the point; in most cases this would be the same person). That means there won't be a situation where the second user is independently blocked and we want to record a second block ID on that computer. Or would there? (Or, equally, a situation where we worry about overwriting the existing cookie value?)

I suppose it's technically possible for 2 users on the same computer (most likely sock-puppets of the same person) to be blocked separately under different IPs. In that case, it's also possible for the 2nd block value to overwrite the 1st in the cookie. Either way, that browser is still blocked, which is the important part. Worst case scenario is that they only get subjected to the shorter block period, but this is a very minor shortcoming of a rare edge case. I don't think it would be worth it to create a system of setting and reading multiple potential cookies to deal with this edge case.
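The overwrite case can be illustrated with a small Python sketch. It assumes a single cookie slot holding one block ID; the cookie name, the block IDs, and the durations are all made up for the example.

```python
from typing import Dict

COOKIE_NAME = "BlockID"  # hypothetical cookie name

# Hypothetical remaining block durations in days, keyed by block ID.
REMAINING_DAYS: Dict[int, int] = {101: 30, 102: 3}


def tag_browser(cookies: Dict[str, str], block_id: int) -> None:
    """Store the block ID in the single cookie slot, replacing any earlier one."""
    cookies[COOKIE_NAME] = str(block_id)


def days_browser_stays_blocked(cookies: Dict[str, str]) -> int:
    """The browser is blocked only for as long as the referenced block lasts."""
    raw = cookies.get(COOKIE_NAME)
    return REMAINING_DAYS.get(int(raw), 0) if raw is not None else 0


jar: Dict[str, str] = {}
tag_browser(jar, 101)  # first sock puppet, blocked for 30 days
tag_browser(jar, 102)  # second one, blocked for 3 days, overwrites the cookie
print(days_browser_stays_blocked(jar))  # 3
```

The browser stays blocked either way; the only loss is that the cookie now follows the shorter block.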

Tgr added a comment. Nov 8 2016, 10:52 AM

Users sometimes get blocked for years; setting years-long cookies with personally identifiable data is problematic (we probably claim in the privacy policy that we don't do it). If 1 day is too short, we could use the default cookie expiration time (which is 30 days by default).

Re: two blocks, it might be possible to avoid restrictions by running into a lighter block (e.g. can send mail, can write to user talk pages). Can't think of a scenario though where that would be a real problem as long as user/IP blocks always take precedence over IP ones.

Anecdotally, and from non-Wikimedia experience, most sockpuppeteers (try to) evade their blocks rather soon after being blocked, usually within a month and almost always within 3 months.

Yes, it sounds sensible that the cookie shouldn't live longer than $wgCookieExpiration (I was going to suggest that WebResponse::setCookie() should prevent anyone setting expiration times greater than that, but actually that var is just the *default* expiration time; there should be something like $wgCookieMaxExpiration that'd prevent any cookie living longer... but that's another issue; ignore me).

So, if it's an autoblock, the cookie will expire after the autoblock-expiry; if it's not, it'll expire after the earlier of the block expiry or $wgCookieExpiration. How's that sound?

@Samwilson: That sounds reasonable to me. We'll just need to add lots of code comments to make sure that is clear in the code and no one accidentally changes it.

Apologies for dragging this out forever, but now that I see the expirations clearly laid out, I'm not sure they are the best solution:

  • If less than $wgCookieExpiration, the Block expiry will be used.
  • If between $wgCookieExpiration and 'infinity', $wgCookieExpiration will be used (defaults to 180 days).
  • If set to 'infinity', then $wgAutoblockExpiry will be used (defaults to 1 day).

A block expiration of 1 year would result in a cookie-block of 180 days, but a block expiration of infinity would result in a cookie-block of only 1 day. In other words, you can get a shorter cookie-block from a longer actual block, which seems awkward. I think it might be better if we settled on either using $wgCookieExpiration or $wgAutoblockExpiry, but not both. My preference would be to use $wgCookieExpiration. So basically, we would match the Block expiration up to $wgCookieExpiration. That should make things less confusing. Thoughts?
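The awkwardness of the three-case rule is easy to see when it is written out. The following Python sketch uses the default values quoted in the list above (180 days and 1 day); the constant and function names are hypothetical, and `math.inf` stands in for an 'infinity' block.

```python
import math

COOKIE_EXPIRATION_DAYS = 180  # stands in for $wgCookieExpiration (default quoted above)
AUTOBLOCK_EXPIRY_DAYS = 1     # stands in for $wgAutoblockExpiry (default quoted above)


def cookie_days_three_case(block_days: float) -> float:
    """Cookie lifetime in days under the three-case rule; math.inf = infinite block."""
    if math.isinf(block_days):
        return AUTOBLOCK_EXPIRY_DAYS
    return min(block_days, COOKIE_EXPIRATION_DAYS)


for block_days in (7, 365, math.inf):
    print(block_days, "->", cookie_days_three_case(block_days))
# 7 -> 7
# 365 -> 180
# inf -> 1
```

The lifetime rises with the block length up to the cap and then drops to 1 day for the longest possible block.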

Yes, makes sense @kaldari. I agree with you about sticking to just using $wgCookieExpiration whenever a block's expiry is over that (including infinity).

@Tgr: What do you think of that proposal? i.e. cookie expiration will match the block expiration unless it is longer than $wgCookieExpiration, in which case it will be set to $wgCookieExpiration. Since the cookie always points back to the original block and the original block's expiration is always checked when applying the cookie block, I think this should be OK. Also, there is far less potential for collateral damage here than a regular IP autoblock, so I don't think we need to be as conservative.
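As a minimal sketch of that rule (in Python for illustration; MediaWiki itself is PHP, and the function and constant names here are hypothetical), the cookie expiry is the block expiry capped at the current time plus $wgCookieExpiration, with `None` representing an infinite block:

```python
from datetime import datetime, timedelta
from typing import Optional

COOKIE_EXPIRATION = timedelta(days=180)  # stands in for $wgCookieExpiration


def cookie_expiry(block_expiry: Optional[datetime], now: datetime,
                  cap: timedelta = COOKIE_EXPIRATION) -> datetime:
    """Match the block expiry, but never exceed now + cap. None = infinite block."""
    latest = now + cap
    if block_expiry is None:
        return latest
    return min(block_expiry, latest)


now = datetime(2016, 11, 14)
print(cookie_expiry(now + timedelta(days=7), now))    # 2016-11-21 00:00:00
print(cookie_expiry(None, now) == now + COOKIE_EXPIRATION)  # True
```

With this rule the cookie lifetime never decreases as the block gets longer, which removes the inversion seen with infinite blocks.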

Tgr added a comment. Nov 15 2016, 12:58 AM

Sounds fine to me, although T5233#2779033 sounds fine too; it just doesn't match the current patch.

$wgCookieExpiration is 30 days btw, the login cookie uses $wgExtendedLoginCookieExpiration.

What is $wgExtendedLoginCookieExpiration set to on WMF wikis? (It's null by default, meaning it uses $wgCookieExpiration).

I think it makes sense to stick with $wgCookieExpiration rather than $wgExtendedLoginCookieExpiration, as the latter is specifically about the login cookie.

I'll update the patch now.

Tgr added a comment. Nov 15 2016, 1:45 AM

Huh, apparently default MediaWiki has 180 days for cookie expiration and the login cookie uses that; Wikimedia has 30 days for cookie expiration and 365 days for login. That seems wrong. Probably the huge default cookie expiration time was set before there was a separate variable for login cookies.

Anyway, using $wgCookieExpiration for a cookie is the natural thing to do. The MediaWiki default should probably be fixed.

Change 48029 merged by jenkins-bot:
Send a cookie with autoblocks to prevent vandalism.

https://gerrit.wikimedia.org/r/48029

kaldari closed this task as "Resolved". Nov 16 2016, 7:15 PM
kaldari moved this task from Needs Review/Feedback to Done on the Community-Tech-Sprint board.
Niharika moved this task from Backlog to Archive on the Community-Tech board. Nov 28 2016, 11:00 AM
DannyH reopened this task as "Open". Jan 30 2017, 7:22 PM

Reopening so that this can be a tracking ticket as we continue to work on refinements...

kaldari closed subtask Restricted Task as "Resolved". Feb 13 2017, 11:45 PM
Samwilson removed Samwilson as the assignee of this task. Feb 14 2017, 10:56 PM
Samwilson added a subscriber: Samwilson.
MusikAnimal closed this task as "Resolved". Thu, Mar 30, 3:54 PM
MusikAnimal claimed this task.

Deployed to English Wikipedia and working beautifully :) Those with access to EventLogging data can monitor eventlogging_CookieBlock to see how often blocks are being enforced by the cookie. We've also been keeping a close eye on the autoblock list and all seems well. We'll deploy to other wikis soon. Huge thanks and congrats to @Samwilson, and all who helped with this!!

Xeno added a subscriber: Xeno. Thu, Mar 30, 6:15 PM

This has been made live on English Wikipedia. It has been suggested to provide the ability to control whether a cookie block is placed separate from the autoblock option due to potential personal liability that could be acquired in countries with "cookie laws".

Where can I find the eventLogging data? I would like to look at the cookieBlock extension.

I looked at the eventlogging data. As of right now, there have been 1134 cookie-based blocks on enwiki. Woah!

Where can I find the eventLogging data? I would like to look at the cookieBlock extension.

It's private data so not everyone can access it. To look at the code you don't need to access the eventlogging data. CookieBlock is not an extension but a part of MediaWiki core. See https://gerrit.wikimedia.org/r/#/c/48029/

Yoshi24517 added a comment. Edited Fri, Mar 31, 3:20 AM

I didn't mean the code. I meant the eventLogging data. (Scratch that comment.)

Ah never mind. I see that I have to be a Wikimedia Foundation employee. Thanks anyway. I think it would be really cool to watch though. Anyway thanks for telling me.

TheDJ added a subscriber: TheDJ. Fri, Mar 31, 8:01 AM

due to potential personal liability that could be acquired in countries with "cookie laws".

Can you link those discussions? It would be rather exceptional if there were personal liability for that. But maybe WMF-legal could weigh in.

I looked at the eventlogging data. As of right now, there have been 1134 cookie-based blocks on enwiki. Woah!

Yeah... I'm questioning whether that was set up correctly. If that figure were true we should see a large influx of active autoblocks, which I have not noticed. I shall investigate!

In T5233#3144570, @Xeno wrote:

This has been made live on English Wikipedia. It has been suggested to provide the ability to control whether a cookie block is placed separate from the autoblock option due to potential personal liability that could be acquired in countries with "cookie laws".

I've spoken with Legal and they should comment on the discussion soon.

jrbs added a subscriber: jrbs. Sat, Apr 1, 9:26 AM
jeblad added a subscriber: jeblad. Edited Mon, Apr 3, 6:23 PM

In some countries we have short dynamic leases of IP addresses, which means a block will start to propagate. We also have some admins who believe blocking IP ranges is a good thing. That means blocking parts of the network for ISPs. When you add this on top of the IP block, you effectively block the whole IP range for that ISP.

Sorry, but this idea is utterly stupid! I'm just waiting for articles about "the new internet virus spread by Wikipedia"!

Tgr added a comment. Mon, Apr 3, 7:21 PM

The EU data protection working group advisory WP-194 section 3.3 ("cookies set for the specific task of increasing the security of the service that has been explicitly requested by the user") clearly applies here, as long as the cookie is only set on edit attempts (IIRC not the case with the current implementation, but would be a trivial change).

Kjetil added a subscriber: Kjetil. Mon, Apr 3, 7:22 PM
jeblad added a comment. Mon, Apr 3, 7:41 PM

I doubt that interpretation holds for this task, but it surely will not hold at all for T152462, and definitely not for T152953.

jeblad added a comment. Mon, Apr 3, 7:48 PM

Given the statement in T5233#980836 and how this will behave, I cannot see how it can be claimed that this does not interfere with a multiuser environment. It is used for blocking IP addresses where the blocked user may no longer be active but other users might be. In T152462, is there not even an attempt to check whether there is a single user? In T152953, multiple users will be involved, possibly an unlimited number?

In T5233#3151976, @Tgr wrote:

... as long as the cookie is only set on edit attempts (IIRC not the case with the current implementation, but would be a trivial change).

One problem with only setting the cookie on an edit attempt is that it relies on the person being blocked then attempting to edit after being blocked but before changing their IP address. That might be quite okay though; I guess most blocked people try to edit after being blocked?
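The trade-off can be sketched as follows, in Python for illustration. The handler, its parameters, and the cookie name are hypothetical, and the model is deliberately simplified: a blocked user may still read pages, and the browser is tagged only when a blocked edit is attempted.

```python
from typing import Dict, Optional

COOKIE_NAME = "BlockID"  # hypothetical cookie name


def handle_request(action: str, active_block_id: Optional[int],
                   cookies: Dict[str, str]) -> bool:
    """Return True if the request is allowed; tag the browser only on a blocked edit."""
    if active_block_id is None:
        return True
    if action == "edit":
        cookies[COOKIE_NAME] = str(active_block_id)
        return False
    return True  # reading stays allowed and sets no cookie


# Blocked user only reads pages, then changes IP: the browser was never tagged.
jar: Dict[str, str] = {}
handle_request("view", 42, jar)
print(COOKIE_NAME in jar)  # False

# Blocked user tries to edit before changing IP: the browser is tagged.
handle_request("edit", 42, jar)
print(jar[COOKIE_NAME])  # 42
```

Under this variant the cookie block only takes effect for users who attempt an edit while the original block still matches them.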

Akeron added a subscriber: Akeron. Tue, Apr 4, 8:13 AM