Configure WMF wikis to log login attempts in CheckUser
Open, MediumPublic

Description

Ever since T174492 was done, we have had the ability to log login attempts in CheckUser. This task proposes that we should enable this feature in WMF wikis.

Justification

For successful login attempts, the case is clear: it is one additional data point to use for CU purposes.

For failed login attempts, there is still a case to be made. Some users (especially those who hold advanced permissions such as sysop but do not use two-factor authentication) have repeatedly reported that their accounts were the target of several failed login attempts; presumably, at least some of these come from malicious users trying to gain access to the accounts. Logging the failed attempts would allow CheckUsers to investigate these incidents and possibly identify another editor who appears to be behind the malicious attempts.

Considerations

One possible question that may come up is: would this be compatible with the WMF CheckUser Policy? That policy indicates that "logged actions" are within the scope of CU on WMF. In practice, this includes some of the publicly logged actions (e.g. account creation, page deletion, AbuseFilter logs) but not all of them (e.g. Thanks logs are not stored in CheckUser yet, see T252226). It has also included some activities that are not publicly logged (e.g. revision suppression) and some activities whose logs are not accessible anywhere else on MediaWiki's web interface or API (e.g. sending emails is logged by CheckUser, and so are password resets). Given that failed login attempts are already logged by MediaWiki-extensions-LoginNotify and shown (currently only to the user), storing them in CU logs should be within scope. The other related policies are the Access to Nonpublic Personal Data Policy and the Privacy Policy, both of which are written generically and do not get into the details of which actions are or are not logged.

Another possible question: should we do an RFC about it first? The answer, IMHO, is no; it is not needed. For all of the aforementioned items that are currently logged, no RFC was held first to get community consensus on whether to include them in the CU logs.

Screenshot

If enabled, the log entries would look like below in a "get edits" query by the IP address. If you query by the username instead, only the first row will be returned.

Further Notes
As discussed further below, it turns out that some bots log hundreds of successful logins per hour (sometimes even per minute), which could inflate the CU tables significantly. Therefore, we will exclude successful bot login attempts from the CU logs kept on WMF wikis.

Action Items

  • Introduce $wgCheckUserLogSuccessfulBotLogins (r/605301)
  • Create a patch for operations/mediawiki-config that sets the two global variables.
  • Get approval from Legal
  • Get acknowledgement from T&S
  • Get approval from DBA (@Marostegui - with the conditions agreed during the task's discussion)
  • Enable this for a few wikis, and monitor the growth of DB table size; also get feedback on usefulness of the new data
    • Identify pilot wikis (fawiki and cswiki were selected)
    • Deploy it for the pilot wikis (Done on 2020-09-03)
    • Create a task for monitoring DB growth (see T261999)
  • Enable this for all wikis but a few large ones
    • Identify large wikis to be excluded (see T253802#6536344)
    • Deploy it for all wikis minus the large wikis
    • Monitor DB growth (see T265344)
  • Upon DBA satisfaction, enable at all wikis but loginwiki

Related Objects

Event Timeline

Huji updated the task description. (Show Details)Jun 15 2020, 2:35 PM
Huji updated the task description. (Show Details)Jun 15 2020, 2:40 PM
Huji added a comment.Jun 15 2020, 2:54 PM

Excellent - thanks for the clarification.
Let's not add bot logins and let's enable this slowly to make sure nothing gets out of hand?

I modified the WMF config patch so that it would be enabled only for fawiki (where Ladsgroup and I are CUs, and where we actively use the CU tool) so that we can provide feedback about its effectiveness. We can enable it for a handful of other wikis too, but I would rather enable it for those wikis in which CUs are active and engaged in this kind of discussion. If anyone here has a specific wiki in mind, we can add that. If not, I can send an inquiry on the private CU listserv.

The config patch now globally excludes successful bot logins (and depends on the other patch, for $wgCheckUserLogSuccessfulBotLogins, being merged and deployed). By my count, we are at least a month away from both of these being deployed and some initial feedback on usefulness becoming available.
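
The rule described in this comment can be sketched as a small decision function. This is an illustration only, written in Python rather than taken from the CheckUser extension: the class and field names are hypothetical stand-ins for $wgCheckUserLogLogins and $wgCheckUserLogSuccessfulBotLogins, and the assumption is that only successful logins by bot accounts are skipped when the second setting is off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginLoggingConfig:
    # Hypothetical stand-ins for the two settings named in this task.
    log_logins: bool                 # models $wgCheckUserLogLogins
    log_successful_bot_logins: bool  # models $wgCheckUserLogSuccessfulBotLogins


def should_log_login(config: LoginLoggingConfig, *, success: bool, is_bot: bool) -> bool:
    """Return True if a login attempt would be written to the CU log."""
    if not config.log_logins:
        # Login logging is switched off entirely for this wiki.
        return False
    if success and is_bot and not config.log_successful_bot_logins:
        # Assumed behaviour: only *successful* bot logins are skipped.
        return False
    return True


# The combination proposed for the pilot: log logins, skip successful bot logins.
pilot = LoginLoggingConfig(log_logins=True, log_successful_bot_logins=False)
```

Under these assumptions, a failed bot login is still logged, which keeps failed attempts against bot accounts visible to CheckUsers.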

Huji renamed this task from Configure WMF wikis to log successful and unsuccessful login attempts in CheckUser to Configure WMF wikis to log login attempts in CheckUser.Jun 15 2020, 4:48 PM
Huji added a comment.Jun 25 2020, 5:52 PM

@DannyS712 when you get the chance, can I ask you to please review https://gerrit.wikimedia.org/r/605301/ ?

I am going to follow up with T&S team to see if they have any input on this. Don't want to lose momentum here.

Huji added a comment.Jun 26 2020, 2:57 PM

@Ladsgroup I understand that due to T256395, all users were logged out and had to log in again. This gives us a once-in-a-long-time opportunity to see how far login volume can surge. Can you pull the numbers from Grafana again, and tell us if and by how much they changed? You may want to do it a few days from now (since not everyone logs in daily, and those who chose "remember me" will be logging in again). This data would be useful in case something like this happens again and we need to estimate its potential impact on CU data volume.

I just checked. The total number of logins hasn't changed much (it even dropped a bit), because bots dominate the total (bot logins are five times the non-bot logins), but in non-bot logins you can see a large jump once everyone was logged out (it is slowly returning to its general trend but hasn't reached it yet). My rough estimate (based on the charts) is that another incident like this would put an extra 100k human logins on the system.

General note: We should rate-limit logins by bots. Logins are CPU-intensive (due to complex encryption), and a bot logging in 600 times a minute is not sustainable (even environmentally).
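
For illustration, one common way to rate-limit logins per account is a sliding-window counter. The sketch below is not how MediaWiki implements throttling; the class name, the limit, and the window length are all hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` logins per account within `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        # account name -> timestamps of logins that were allowed
        self._events = defaultdict(deque)

    def allow(self, account: str, now: float) -> bool:
        events = self._events[account]
        # Drop timestamps that have fallen out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True


# Hypothetical policy: at most 2 logins per account per 60 seconds.
limiter = SlidingWindowLimiter(limit=2, window=60.0)
```

With a policy like this, a bot attempting 600 logins a minute would have all but the first few rejected until the window moves on, while other accounts are unaffected.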

Stryn added a subscriber: Stryn.Jul 6 2020, 6:45 PM
L235 added a subscriber: L235.Jul 6 2020, 8:35 PM
jrbs added a subscriber: jrbs.Jul 6 2020, 9:46 PM

Just flagging here for the record - T&S has been made aware of this task and has no objections.

Huji updated the task description. (Show Details)Jul 6 2020, 10:38 PM
Huji updated the task description. (Show Details)Jul 8 2020, 6:54 PM

Removing DBA tag as there is nothing for us to act on. However, I will stay on the task to provide feedback once this is released and we start monitoring the table growth.

Change 605301 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@master] Adding $wgCheckUserLogSuccessfulBotLogins

https://gerrit.wikimedia.org/r/605301

Huji updated the task description. (Show Details)Aug 24 2020, 12:12 AM
Huji added a comment.Aug 24 2020, 12:16 AM

The next step is to enable this on a few select wikis. As an active fawiki CU, I suggest we enable it there and I can make sure our CUs reflect on its usefulness and provide feedback in short order. Which other wikis should we target at this time?

Note that we will also have to create a task similar to T257223 to monitor the impact of this new feature on DB size. I am happy to make the task once we enable the feature.

I'm happy to test with cswiki, but perhaps we should confirm the final set with @Marostegui and the DBA team?

I am fine with both fawiki and cswiki. Once we are happy with those, we can perhaps move on to some slightly bigger ones.

Change 622631 had a related patch set uploaded (by Huji; owner: Huji):
[mediawiki/extensions/CheckUser@wmf/1.35.0-wmf.5] Adding $wgCheckUserLogSuccessfulBotLogins

https://gerrit.wikimedia.org/r/622631

Does it really need backporting? It’ll be live everywhere by the end of the week anyway on the train...

Huji added a comment.Aug 27 2020, 12:24 AM

Only because without back-porting, I have no way to know if the other patch (which is now scheduled for deployment tomorrow) actually worked.

I am assuming that we must test every patch deployed when we deploy them. No?

Huji added a comment.Aug 27 2020, 12:25 AM

Alternatively, we can push back the deployment of 599492 until next week. But is back-porting really that big of a deal?

You're backporting something to a branch that is only going to exist for probably another hour after... Just seems a bit pointless to me. Sure, if you'd done it say on Monday, it would've lasted for a few days...

Why not just use the window after the train to turn it on?

Change 622693 had a related patch set uploaded (by Huji; owner: Huji):
[mediawiki/extensions/CheckUser@wmf/1.36.0-wmf.5] Adding $wgCheckUserLogSuccessfulBotLogins

https://gerrit.wikimedia.org/r/622693

Change 622631 abandoned by Huji:
[mediawiki/extensions/CheckUser@wmf/1.35.0-wmf.5] Adding $wgCheckUserLogSuccessfulBotLogins

Reason:
Cherry was picked incorrectly.

https://gerrit.wikimedia.org/r/622631

Huji added a comment.Aug 27 2020, 12:44 AM

Why not just use the window after the train to turn it on?

Okay, fair.

Huji updated the task description. (Show Details)Aug 27 2020, 12:48 AM

Change 622693 abandoned by Huji:
[mediawiki/extensions/CheckUser@wmf/1.36.0-wmf.5] Adding $wgCheckUserLogSuccessfulBotLogins

Reason:

https://gerrit.wikimedia.org/r/622693

If the train gets rolled back, the database would get filled with a lot of login attempts. Hopefully that wouldn't backfire or break things in a visible way, but the DBAs sounded a little bit worried to me about that happening. A train rollback happens when there is some serious bug, and I don't think we should add more potential sources of trouble for the train conductor after a rollback. In other words, I'd like to be 100% sure bot logins are excluded, and that can be done only with a backport.

Reedy added a comment.Aug 27 2020, 8:20 AM

That's not true. You could wait to enable the feature. That is another way. It's not like this needs to be urgently turned on.

Sure, waiting for the next train (wmf.7) to account for a rollback would be a way :-).

Huji added a comment.EditedAug 27 2020, 12:05 PM

Okay, so I will push back the deployment by one week?

Asking because I don't see the date/time for wmf.7 listed on https://wikitech.wikimedia.org/wiki/Deployments yet and am assuming it would be exactly one week later than the wmf.6 train departure.

Change 599492 merged by jenkins-bot:
[operations/mediawiki-config@master] Start logging log-ins on select wikis

https://gerrit.wikimedia.org/r/599492

Huji updated the task description. (Show Details)Sep 3 2020, 11:31 PM

Mentioned in SAL (#wikimedia-operations) [2020-09-03T23:31:08Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: 93947391e97be11a9cd7eb4713b274b05d5b371a: Start logging log-ins on select wikis (T253802) (duration: 00m 56s)

Huji updated the task description. (Show Details)Sep 3 2020, 11:45 PM

Thanks @Marostegui!

@Huji Where should we roll this out now? Everything but the big wikis (enwiki, commons, wikidata)? Small+medium would be the obvious choice, but that wouldn't be useful for anyone but stewards (though it could serve as further proof that the risk is low).

@Marostegui Do you have any preference?

Thanks for the ping @Urbanecm - I think I like the idea of small + medium and leave it for 24h or so, to make sure nothing obvious arises and then we can continue with bigger ones.
Is that ok?

That's absolutely fine with me :).

Huji added a comment.Sep 22 2020, 3:26 PM

@Urbanecm do we have any way (perhaps through logstash) to estimate the number of logins per wiki? I know it'll be an inaccurate estimate (because it would include bot logins, which won't be logged in CU) but that can help us decide which large wiki to go to first, or last.

I like the idea of small + medium, but out of an abundance of caution, I would suggest that the step after that not be "enable for all large wikis" but rather a phased expansion to large wikis.

Yes @Huji. Here are some numbers (each list was generated from logs covering 2020-09-20 08:31:38 to 2020-09-21 08:27:19):

  • Logins for all wikis: P12735
  • Logins for small+medium wikis: P12736
  • Logins for large wikis: P12737
Huji added a comment.Sep 22 2020, 8:42 PM

For all but enwiki, the number of logins is within the same order of magnitude as fawiki (i.e. not 10x or more); enwiki is the only exception. How about we roll it out everywhere except the wikis that have as many logins as wikidatawiki or more? A week later, we enable it for all wikis except enwiki. A week after that, we enable it for enwiki too. This way, our most sensitive wikis (commons and wikidata, which are the backbone of every other wiki, and enwiki, which is an outlier) won't go live all at the same time, and we have the chance to notice any issues that, for whatever reason, were not detectable at smaller wikis before they reach the sensitive ones.
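
The three-step proposal above can be sketched as a simple partition of wikis by login count. This is only an illustration: the function name is hypothetical, and the counts in the sample are made-up placeholders, not the figures from the pastes.

```python
def plan_phases(logins: dict[str, int], threshold_wiki: str, last_wiki: str) -> list[list[str]]:
    """Split wikis into three rollout phases.

    Phase 1: wikis with fewer logins than `threshold_wiki`.
    Phase 2: wikis at or above that threshold, except `last_wiki`.
    Phase 3: `last_wiki` on its own.
    """
    threshold = logins[threshold_wiki]
    phase1 = sorted(w for w, n in logins.items() if n < threshold and w != last_wiki)
    phase2 = sorted(w for w, n in logins.items() if n >= threshold and w != last_wiki)
    return [phase1, phase2, [last_wiki]]


# Placeholder counts for illustration only; these are NOT real login numbers.
sample_logins = {
    "cswiki": 800,
    "fawiki": 1000,
    "wikidatawiki": 3000,
    "commonswiki": 4000,
    "enwiki": 20000,
}
phases = plan_phases(sample_logins, threshold_wiki="wikidatawiki", last_wiki="enwiki")
```

With the placeholder counts, cswiki and fawiki land in the first phase, commonswiki and wikidatawiki in the second, and enwiki goes last.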

Change 629227 had a related patch set uploaded (by Urbanecm; owner: Urbanecm):
[operations/mediawiki-config@master] Enable wgCheckUserLogLogins at all wikis but few large wikis

https://gerrit.wikimedia.org/r/629227

Seems good @Huji. See proposed patch.

Meirae added a subscriber: Meirae.Oct 4 2020, 3:17 PM
Huji added a comment.Oct 10 2020, 12:37 AM

@Urbanecm what is the next step here? Will you be scheduling the patch for deployment?

Right, thanks for the reminder. I'll deploy it next week, and then we can see how it goes :).

Urbanecm moved this task from Backlog to To deploy on the User-Urbanecm board.Oct 10 2020, 12:00 PM

Change 629227 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable wgCheckUserLogLogins at all wikis but few large wikis

https://gerrit.wikimedia.org/r/629227

Mentioned in SAL (#wikimedia-operations) [2020-10-12T11:28:05Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: 4966e8a6b8ae4e6d5623dd35e65ed8fcf3338bc1: Enable wgCheckUserLogLogins at all wikis but few large wikis (T253802) (duration: 00m 58s)

This is live at all wikis, excluding:

  • arwiki
  • bnwiki
  • commonswiki
  • dewiki
  • enwiki
  • frwiki
  • wikidatawiki

@Marostegui Please let me know if this causes a DB issue anywhere. Shall I create a follow-up task for monitoring?
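
As a rough model of the state described in this comment, a per-wiki setting can be thought of as a default value plus explicit overrides. The sketch below is Python, not the actual contents of wmf-config/InitialiseSettings.php (which are not quoted in this task); the dictionary layout and the helper name are assumptions, and only the wiki names come from the exclusion list above.

```python
# Hypothetical model of a per-wiki boolean setting: a "default" value plus
# explicit per-wiki overrides. The wiki names are the exclusions listed above.
LOG_LOGINS_OVERRIDES = {
    "default": True,
    "arwiki": False,
    "bnwiki": False,
    "commonswiki": False,
    "dewiki": False,
    "enwiki": False,
    "frwiki": False,
    "wikidatawiki": False,
}


def is_enabled(overrides: dict[str, bool], wiki: str) -> bool:
    """Return the wiki's own value if it has one, else the default."""
    return overrides.get(wiki, overrides["default"])
```

In this model, excluding another wiki later (as was subsequently done for loginwiki) amounts to adding one more override entry.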

alaa added a subscriber: alaa.Oct 12 2020, 1:30 PM

Thanks @Urbanecm - so far nothing has shown up on the graphs.
Let's create a task to monitor the size of the involved tables on eswiki and maybe some other big wikis, for 4 weeks, so we can make sure they don't grow massively.

Thanks for the great work guys. Just want to confirm that this will not be deployed to loginwiki.

There was not a consensus for that when it was discussed on the list, as it would create global CU. Another CU wasn’t sure on this point as it wasn’t documented here, so I wanted to make sure it wasn’t missed.

Huji added a comment.Oct 17 2020, 1:50 PM

Good point @TonyBallioni
In the paste that @Urbanecm shared in T253802#6484459 I did not see loginwiki listed. Granted, that paste only included the most frequent wikis in each category. But I think we should explicitly add loginwiki to the exclusions in a new patch. @Urbanecm would you agree, and do you want to take the lead on this?

I've been following this for a while. How does this interact with SUL? Is the login action only tracked on the home wiki, or on all wikis you visit?

Huji added a comment.Oct 18 2020, 2:11 AM

The login action is only tracked on the home wiki.

Change 635819 had a related patch set uploaded (by Urbanecm; owner: Urbanecm):
[operations/mediawiki-config@master] Do not log logins at loginwiki via CU

https://gerrit.wikimedia.org/r/635819

Change 635819 merged by jenkins-bot:
[operations/mediawiki-config@master] Do not log logins at loginwiki via CU

https://gerrit.wikimedia.org/r/635819

Urbanecm updated the task description. (Show Details)Oct 22 2020, 12:09 PM

Mentioned in SAL (#wikimedia-operations) [2020-10-22T12:10:39Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: 52ad2d4df1164dced684231c12aa64bd028b8ac9: Do not log logins at loginwiki via CU (T253802) (duration: 01m 06s)

Fixed, and set loginwiki to false.

Change 639095 had a related patch set uploaded (by Urbanecm; owner: Urbanecm):
[operations/mediawiki-config@master] Enable wgCheckUserLogLogins at all wikis but loginwiki

https://gerrit.wikimedia.org/r/639095

FYI: I plan to deploy the above patch later today (probably during Morning B&C window). I will create a task for the DBAs to monitor the big wikis then, and email CUs to announce the new feature.

Huji added a comment.Nov 4 2020, 2:36 PM

Fingers crossed.

Change 639095 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable wgCheckUserLogLogins at all wikis but loginwiki

https://gerrit.wikimedia.org/r/639095

Mentioned in SAL (#wikimedia-operations) [2020-11-04T20:08:16Z] <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: fb5c03262c20b5e99b3c2f6e91abb024f12da1f5: Enable wgCheckUserLogLogins at all wikis but loginwiki (T253802) (duration: 01m 08s)

So, this was done at all wikis (excl. loginwiki). Nice work everyone!