Page MenuHomePhabricator

Users occasionally logged in as different users after SessionManager deployment
Closed, ResolvedPublic

Description

Last known occurrence was 2016-02-22. Since then a number of defense mechanisms were added against the assumed causes:

, rMW8b413431d760: Guard against allowing intermediate caching when cookies are present, rMWb84fae0173b2: Use header_register_callback to avoid caching responses with Set-Cookie headers


Three reports so far:

  • Matiia was recognized as Schexnayder on commonswiki on 2016-01-29
  • Matanya as Pas99 on hewiki on 2016-01-30
  • FallingGravity as 023yangbo on enwiki on 2016-02-22

Outreach/report collection task is T126069: Collect information about session pollution during the previous SessionManager rollouts.

Hypotheses so far:

  • Set-Cookie header gets cached (has happened in past)
    • SessionManager emits Set-Cookie in some cases old code did not (when $wgUser is not unstubbed during the request, or when cookies are inconsistent in certain ways)
    • varnish nukes everything with a Set-Cookie, but has an exception for Special:HideBanner
    • all pages should vary on cookie, and they should set Cache-control: private if the session is persisted (and it must be persisted if Set-Cookie is output for session things). RawAction did not do that (now patched), load.php and similar secondary endpoints still don't (but see T127233). HideBanners disables vary headers, but only sets cache-control:public for anons.
    • if Set-Cookie gets cached, shouldn't there be lots of incidents with the same user?
      • only concurrent requests affected via varnish coalescing? a hit-for-pass cannot be coalesced into, though, and it looks like our varnish config never coalesces session cookies. (Also @ArielGlenn says there was not overlap in user activity.)
  • job runner does multiple importScopedSession calls in same request which somehow leak through
    • doesn't seem to be any leak

Other things looked at:

  • no obvious pattern in user ids (such as having the same id on different wikis)
  • session id generation was guarded with an extra check in case there is some sort of random generator failure, that didn't help
  • added logging for same session or user account being used from lots of different IPs simultaneously, didn't find anything interesting

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I just tried merging one of my accounts on various wikimedia sites and it didn't do anything like this, but moving into the security area anyway.

Schexnayder@commonswiki merged by login at 03:26, and then CentralAuth.log shows Matiia logging in shortly before this task was created.

This is the request in which Schexnayder got autocreated (so the session mixup likely happened here), but the logs don't tell me much; the only interesting message is getSessionById: failed to create empty session: Session ID already exists. Which, if I'm reading it correctly, means that there is a valid session cookie, but loading the session is failed by CentralAuth's local/central source check; but then somehow the request ends up with a valid session after all (and goes on to autocreate the user), which is weird.

The request that autocreated Schexnayder was from https://meta.wikimedia.org/wiki/2016_Strategy/Reach to commons.wikimedia.org/w/index.php?title=MediaWiki:Gadget-LanguageSelect.js&action=raw&ctype=text/javascript at 2016-01-30 03:26:23 which matches Matiia's activity well (edit to same page a few hours later).

but then somehow the request ends up with a valid session after all

The session ID it tried to use (from some cookie or something) couldn't be authenticated, so it should have created a new empty session with a different ID. So ending up with a valid session after all isn't that weird.

This is the request in which Schexnayder got autocreated (so the session mixup likely happened here), but the logs don't tell me much;

To get more data I looked into the CheckUser database tables. The auto-creation of Schexnayder on commonswiki matches IP addresses with Schexnayder's recent activity on enwiki, while Matiia's recent activity on both commonswiki and metawiki has a very different IP address. So it looks to me like the session mixup was more likely to have happened when Matiia went to commonswiki.

I can think of one way this could have happened, but the chances seem absurdly low:

  1. Matiia on Commons gets into the state of having a session cookie but being logged out (e.g. logging out or doing something that sets a cookie while logged out). The chances of this happening recently aren't too bad thanks to T125194.
  2. Time passes with no activity on Commons, so Matiia's session expires from redis.
  3. Schexnayder's account gets auto-created on Commons. The cross-wiki load of [[commons:MediaWiki:Gadget-LanguageSelect.js]] is apparently global on Meta, so that part's not hard. Schexnayder happens to get assigned the same session ID that Matiia had earlier, which is a crazy-tiny chance.
  4. Matiia hits Commons again with that same session cookie (possibly by Meta's hitting Gadget-LanguageSelect.js), which picks up Schexnayder's new session.

I don't know if that's actually what happened here, but

should close that loophole.

(edit: new version of patch)

A collision of 128-bit tokens is unrealistic (unless something is wrong with our random number generation). Even with the recent bug of generating a new session on every pageview, about 10^9 sessions could have existed (in practice much less at the same time since most of them got evicted); collisions start to happen around 10^19.

Is it possible that importScopedSession somehow bleeds over when jobs from two different users are selected into the same job runner process? It's the only setup I can think of where tokens from multiple users can coexist in the same PHP request.

Reported at 2016-01-30T19:27Z:

[12:27]  <  matanya>	_joe_: i was logged in now a another user! which is not me
[12:27]  <  matanya>	anyone else had this issue ?
[12:29]  < Nemo_bis>	on the wikis?
[12:29]  <  matanya>	he.wikipedia
bd808 triaged this task as Unbreak Now! priority.Jan 30 2016, 7:34 PM

Uploaded @Anomie's patch to mira as /srv/patches/1.27.0-wmf.11/core/04-T125283.patch, applied and synced to cluster

Additional info from PM with matanya (UTC time 19:37):

[12:37]  <    bd808>	do you know the last time you logged out before finding yourself in another account?
[12:38]  <  matanya>	two weeks ago ?
[12:38]  <  matanya>	and hi :)
[12:38]  <    bd808>	hi :)
[12:39]  <    bd808>	I assume you've been active on hewiki in the last couple of days as well,  correct?
[12:39]  <  matanya>	i have been
[12:39]  <  matanya>	and now i notice some of my edits are not there
[12:39]  <  matanya>	maybe i performed them under some other account
[12:40]  <    bd808>	fuuuuck
[12:41]  <  matanya>	I only noticed because i was forcibly logged out, and right back in again, but in another account
[12:42]  <    bd808>	that's good info
[12:42]  <  matanya>	I was logged into pas99 account
[12:42]  <    bd808>	I don't know if the force logout script we kicked off last week is still running or not
[12:43]  <  matanya>	https://meta.wikimedia.org/w/index.php?title=Special%3ACentralAuth&target=Pas99
[12:43]  <  matanya>	this ^ account
[12:43]  <  matanya>	but it has no edits on he.wiki
[12:43]  <    bd808>	hmmm and very new account
[12:43]  <  matanya>	so god knows who had my account before
[12:45]  <    bd808>	can you think of what some of the missing edits are and see if you can find what account they were made as? Not sure this helps debug at all at this point but more data is generally good
[12:46]  <  matanya>	i'll try to find, but i doubt i'll remeber, i mostly edit from mobile lately, and that was totally broken during this session shit
[12:47]  <    bd808>	:/
[12:47]  <    bd808>	these are crappy bumps on the way to good things for authn
[12:48]  <    bd808>	2factor and external hash storage
[12:48]  <  matanya>	no joy without some shit in the way
[12:48]  <  matanya>	i totally understand and happy we are going in this direction
[12:48]  <    bd808>	glad to hear that :)
[12:50]  <  matanya>	I found a page i edited, but i don't see my edit
[12:50]  <  matanya>	https://he.wikipedia.org/wiki/%D7%AA%D7%95%D7%A4%D7%A2%D7%AA_%D7%9C%D7%95%D7%95%D7%90%D7%99_%28%D7%AA%D7%A8%D7%95%D7%A4%D7%94%29
[12:50]  <  matanya>	the history function in the app becomes useful 
[12:59]  <  matanya>	anyhow, i need to move on, i'll be back later, feel free to leave a message here.
fluorine:/a/mw-log
bd808$ ack Matanya CentralAuth.log
2016-01-30 17:35:24 mw1020 commonswiki CentralAuth INFO: authentication for 'Matanya' succeeded
2016-01-30 17:35:24 mw1020 commonswiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Matanya'
2016-01-30 17:35:24 mw1020 commonswiki CentralAuth INFO: Already fully migrated user 'Matanya'
2016-01-30 18:59:44 mw1055 mediawikiwiki CentralAuth INFO: authentication for 'Matanya' succeeded
2016-01-30 18:59:44 mw1055 mediawikiwiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Matanya'
2016-01-30 18:59:44 mw1055 mediawikiwiki CentralAuth INFO: Already fully migrated user 'Matanya'
2016-01-30 19:28:02 mw1109 hewiki CentralAuth INFO: authentication for 'Matanya' succeeded
2016-01-30 19:28:02 mw1109 hewiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Matanya'
2016-01-30 19:28:02 mw1109 hewiki CentralAuth INFO: Already fully migrated user 'Matanya'
2016-01-30 21:09:43 mw1184 commonswiki CentralAuth INFO: authentication for 'Matanya' succeeded
2016-01-30 21:09:43 mw1184 commonswiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Matanya'
2016-01-30 21:09:43 mw1184 commonswiki CentralAuth INFO: Already fully migrated user 'Matanya'
2016-01-30 21:28:48 mw1188 commonswiki CentralAuth INFO: authentication for 'Matanya' succeeded
2016-01-30 21:28:48 mw1188 commonswiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Matanya'
2016-01-30 21:28:48 mw1188 commonswiki CentralAuth INFO: Already fully migrated user 'Matanya'
fluorine:/a/mw-log
bd808$ ack Pas99 CentralAuth.log
2016-01-30 08:47:37 mw1023 dewiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
2016-01-30 11:16:06 mw1173 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
2016-01-30 11:30:23 mw1042 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
2016-01-30 13:10:57 mw1252 dewiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
2016-01-30 17:51:55 mw1052 ptwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
2016-01-30 18:15:21 mw1240 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
2016-01-30 20:47:14 mw1037 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
fluorine:/a/mw-log/archive
bd808$ zgrep Pas99 CentralAuth.log-201601*
CentralAuth.log-20160109.gz:2016-01-08 09:22:30 mw1077 ruwikibooks CentralAuth INFO: authentication for 'Pas9988' succeeded
CentralAuth.log-20160109.gz:2016-01-08 09:22:31 mw1077 ruwikibooks CentralAuth INFO: Set global password for 'Pas9988'
CentralAuth.log-20160109.gz:2016-01-08 09:22:31 mw1077 ruwikibooks CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas9988'
CentralAuth.log-20160109.gz:2016-01-08 09:22:31 mw1077 ruwikibooks CentralAuth INFO: Already fully migrated user 'Pas9988'
CentralAuth.log-20160109.gz:2016-01-08 09:22:33 mw1077 ruwikibooks CentralAuth INFO: Attaching local user Pas9988@ruwikibooks by 'login'
CentralAuth.log-20160109.gz:2016-01-08 09:22:34 mw1077 ruwikibooks CentralAuth INFO: authentication for 'Pas9988' succeeded
CentralAuth.log-20160109.gz:2016-01-08 09:22:34 mw1077 ruwikibooks CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas9988'
CentralAuth.log-20160109.gz:2016-01-08 09:22:34 mw1077 ruwikibooks CentralAuth INFO: Already fully migrated user 'Pas9988'
CentralAuth.log-20160109.gz:2016-01-08 09:22:36 mw1244 loginwiki CentralAuth INFO: Attaching local user Pas9988@loginwiki by 'login'
CentralAuth.log-20160109.gz:2016-01-08 09:22:39 mw1235 wikidatawiki CentralAuth INFO: Attaching local user Pas9988@wikidatawiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:16:45 mw1109 itwiki CentralAuth INFO: registered global account 'Pas99'
CentralAuth.log-20160118.gz:2016-01-18 00:16:45 mw1109 itwiki CentralAuth INFO: Attaching local user Pas99@itwiki by 'new'
CentralAuth.log-20160118.gz:2016-01-18 00:16:47 mw1109 itwiki CentralAuth INFO: Set global password for 'Pas99'
CentralAuth.log-20160118.gz:2016-01-18 00:16:49 mw1008 metawiki CentralAuth INFO: Attaching local user Pas99@metawiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:16:49 mw1009 mediawikiwiki CentralAuth INFO: Attaching local user Pas99@mediawikiwiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:16:49 mw1003 loginwiki CentralAuth INFO: Attaching local user Pas99@loginwiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:17:22 mw1080 eswiki CentralAuth INFO: Attaching local user Pas99@eswiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:17:30 mw1258 eowiki CentralAuth INFO: Attaching local user Pas99@eowiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:45 mw1242 commonswiki CentralAuth INFO: Attaching local user Pas99@commonswiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:45 mw1250 incubatorwiki CentralAuth INFO: Attaching local user Pas99@incubatorwiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:45 mw1211 enwikiquote CentralAuth INFO: Attaching local user Pas99@enwikiquote by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:46 mw1055 enwikiversity CentralAuth INFO: Attaching local user Pas99@enwikiversity by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:46 mw1184 enwikinews CentralAuth INFO: Attaching local user Pas99@enwikinews by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:46 mw1044 enwikibooks CentralAuth INFO: Attaching local user Pas99@enwikibooks by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:46 mw1084 specieswiki CentralAuth INFO: Attaching local user Pas99@specieswiki by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:46 mw1043 enwikivoyage CentralAuth INFO: Attaching local user Pas99@enwikivoyage by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:46 mw1253 enwikisource CentralAuth INFO: Attaching local user Pas99@enwikisource by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:19:46 mw1209 enwiktionary CentralAuth INFO: Attaching local user Pas99@enwiktionary by 'login'
CentralAuth.log-20160118.gz:2016-01-18 00:48:40 mw1258 mswiki CentralAuth INFO: Attaching local user Pas99@mswiki by 'login'
CentralAuth.log-20160119.gz:2016-01-18 22:10:40 mw1033 itwiki CentralAuth INFO: authentication for 'Pas99' succeeded
CentralAuth.log-20160119.gz:2016-01-18 22:10:40 mw1033 itwiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas99'
CentralAuth.log-20160119.gz:2016-01-18 22:10:40 mw1033 itwiki CentralAuth INFO: Already fully migrated user 'Pas99'
CentralAuth.log-20160125.gz:2016-01-24 14:13:49 mw1177 itwiki CentralAuth INFO: authentication for 'Pas99' succeeded
CentralAuth.log-20160125.gz:2016-01-24 14:13:49 mw1177 itwiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas99'
CentralAuth.log-20160125.gz:2016-01-24 14:13:49 mw1177 itwiki CentralAuth INFO: Already fully migrated user 'Pas99'
CentralAuth.log-20160125.gz:2016-01-24 14:48:20 mw1108 frwiki CentralAuth INFO: Attaching local user Pas99@frwiki by 'login'
CentralAuth.log-20160125.gz:2016-01-24 15:18:08 mw1208 wikidatawiki CentralAuth INFO: Attaching local user Pas99@wikidatawiki by 'login'
CentralAuth.log-20160126.gz:2016-01-25 19:18:37 mw1049 itwiki CentralAuth INFO: authentication for 'Pas99' succeeded
CentralAuth.log-20160126.gz:2016-01-25 19:18:37 mw1049 itwiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas99'
CentralAuth.log-20160126.gz:2016-01-25 19:18:37 mw1049 itwiki CentralAuth INFO: Already fully migrated user 'Pas99'
CentralAuth.log-20160126.gz:2016-01-25 21:41:08 mw1098 simplewiki CentralAuth INFO: Attaching local user Pas99@simplewiki by 'login'
CentralAuth.log-20160126.gz:2016-01-25 21:41:10 mw1023 ptwiki CentralAuth INFO: Attaching local user Pas99@ptwiki by 'login'
CentralAuth.log-20160126.gz:2016-01-25 21:45:24 mw1258 dewiki CentralAuth INFO: Attaching local user Pas99@dewiki by 'login'
CentralAuth.log-20160127.gz:2016-01-26 23:24:34 mw1044 itwiki CentralAuth INFO: authentication for 'Pas99' succeeded
CentralAuth.log-20160127.gz:2016-01-26 23:24:34 mw1044 itwiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas99'
CentralAuth.log-20160127.gz:2016-01-26 23:24:34 mw1044 itwiki CentralAuth INFO: Already fully migrated user 'Pas99'
CentralAuth.log-20160130.gz:2016-01-29 18:13:51 mw1174 itwiki CentralAuth INFO: authentication for 'Pas99' succeeded
CentralAuth.log-20160130.gz:2016-01-29 18:13:51 mw1174 itwiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas99'
CentralAuth.log-20160130.gz:2016-01-29 18:13:51 mw1174 itwiki CentralAuth INFO: Already fully migrated user 'Pas99'
CentralAuth.log-20160130.gz:2016-01-29 19:12:08 mw1065 itwiki CentralAuth INFO: authentication for 'Pas99' succeeded
CentralAuth.log-20160130.gz:2016-01-29 19:12:08 mw1065 itwiki CentralAuth INFO: plugin: attempting wgCentralAuthAutoMigrate for 'Pas99'
CentralAuth.log-20160130.gz:2016-01-29 19:12:08 mw1065 itwiki CentralAuth INFO: Already fully migrated user 'Pas99'
CentralAuth.log-20160130.gz:2016-01-29 22:55:54 mw1209 tewiki CentralAuth INFO: Attaching local user Pas99@tewiki by 'login'
CentralAuth.log-20160130.gz:2016-01-29 23:25:12 mw1221 enwiki CentralAuth INFO: Attaching local user Pas99@enwiki by 'login'
CentralAuth.log-20160130.gz:2016-01-29 23:30:53 mw1097 arwiki CentralAuth INFO: Attaching local user Pas99@arwiki by 'login'
CentralAuth.log-20160130.gz:2016-01-30 00:46:05 mw1090 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
CentralAuth.log-20160130.gz:2016-01-30 02:16:01 mw1048 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
CentralAuth.log-20160130.gz:2016-01-30 02:16:07 mw1101 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
CentralAuth.log-20160130.gz:2016-01-30 02:16:15 mw1030 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
CentralAuth.log-20160130.gz:2016-01-30 05:29:05 mw1079 enwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass
CentralAuth.log-20160130.gz:2016-01-30 05:31:53 mw1102 wuuwiki CentralAuth INFO: authentication for 'Pas99' failed, bad pass

The user Pas99 doesn't exist attached or unattached on hewiki.

(wikiadmin@db1041) [hewiki]> select * from user where user_name ='Pas99';
Empty set (0.02 sec)
[15:35]  <  matanya>	bad bad bad
[15:35]  <  matanya>	i have local OS now on itwiki
[15:36]  <  matanya>	and my session is cancelled
[15:37]  <    bd808>	I don't understand. Your logged out session has rights somewhere?
[15:37]  <  matanya>	had
[15:37]  <  matanya>	logged out and backin again
[15:37]  <  matanya>	and now things are working again
[15:38]  <    bd808>	I'm confused about the hewiki part you reported. You saw the Pas99 username in the site chrome while you were on hewiki?
[15:38]  <  matanya>	check centralauth.log, you are going to enjoy what just happened there
[15:39]  <  matanya>	yes, i saw pas99 there
[15:39]  <  matanya>	what happened goes this way
[15:39]  <  matanya>	1. i was forciblly logged out
[15:40]  <  matanya>	2. automaticlly logged back in, but with user pas99
[15:40]  <  matanya>	3. logged out, hard refresh, clean cookies
[15:40]  <  matanya>	4. login as self
[15:42]  <    bd808>	and then you had other issues?
[15:42]  <  matanya>	yes, a new one just now
[15:43]  <  matanya>	vito asked on -stewards for someone to do an OS action on itwiki
[15:43]  <  matanya>	i changed my rights on meta to add myself OS on itwiki
[15:43]  <  matanya>	but when going to itwiki i got logged out
[15:43]  <  matanya>	still had the rights
[15:44]  <  matanya>	but when i tapped the OS button i got the message
[15:44]  <  matanya>	your action was cancelled to prevent bla bla ...
[15:44]  <    bd808>	"2016-01-30 22:36:31 mw1243 itwiki CentralAuth INFO: authentication for 'Matanya' succeeded"
[15:45]  <  matanya>	ok, good start
[15:45]  <    bd808>	that's the CentralAuth logs have other than some noise about you already having a central account
[15:46]  <    bd808>	we don't log logouts afaik

All wikis were rolled back to 1.27.0-wmf.10 at 2016-01-30T23:20Z because of this bug. This is too nasty to debug on the weekend with only partial staffing.

A collision of 128-bit tokens is unrealistic (unless something is wrong with our random number generation).

Totally agree here. Either we're pulling from a suddenly horrible entropy source (and iirc we're using openssl's generate IV, which is a wrapper around /dev/urand in our setup), or something else is going on.

This sounds a lot like cookies were set, and then something was cached, and matanya got a cached "set-cookie" header, which gave him someone else's session. The last time that happened, it was a central notice banner returned when the user was autocreated on meta for the first time. The timing of someone being autocreated, then having someone impersonate them makes me suspect the same pattern.

One possible culprit is the job runner which processes jobs running in the name of different users in the same PHP request, and sometimes uses RequestContext::importScopedSession to access the sessions of those users. (I first thought that LocalPageMoveJob is called cross-wiki and the user ID ends up pointing to a different user, but the way it is used seems safe, and the other importScopedSession jobs too.) Maybe users can bleed through into another job there, by being cached by another object or something like that.

This sounds a lot like cookies were set, and then something was cached, and matanya got a cached "set-cookie" header, which gave him someone else's session. The last time that happened, it was a central notice banner returned when the user was autocreated on meta for the first time. The timing of someone being autocreated, then having someone impersonate them makes me suspect the same pattern.

That would make sense. Setup.php calls session_cache_limiter( 'private, must-revalidate' ) if the session is persistent, but that's not the case for autocreation. So maybe Schexnayder visited meta, which pulled in a JS file from Commons, Schexnayder got autocreated as a result, the response was cached, and then Matiia loaded the same file and boom.

So maybe Schexnayder visited meta, which pulled in a JS file from Commons, Schexnayder got autocreated as a result, the response was cached, and then Matiia loaded the same file and boom.

Tried to reproduce that in curl but I couldn't trigger an autocreation (or set-cookie headers of any sort) at all when fetching a gadget URL.

Tried to reproduce that in curl but I couldn't trigger an autocreation (or set-cookie headers of any sort) at all when fetching a gadget URL.

Well obviously not, because it has been rolled back. @bd808 can we deploy wmf.11 on mw1017 again?

Tried to reproduce that in curl but I couldn't trigger an autocreation (or set-cookie headers of any sort) at all when fetching a gadget URL.

Well obviously not, because it has been rolled back. @bd808 can we deploy wmf.11 on mw1017 again?

Sure. Just hack the wikiversions.php locally. It would probably be a good idea to leave testwiki on .10 for now so that you only get .11 when using the debug header.

Tried again properly, but at least for the gadget request, cache headers seem to be set properly: P2548

That would make sense. Setup.php calls session_cache_limiter( 'private, must-revalidate' ) if the session is persistent, but that's not the case for autocreation.

The more I look at it, the more it seems session_cache_limiter( 'private, must-revalidate' ) seems equivalent to session_cache_limiter( '' ) or any other unrecognized value in that it disables PHP's session handling from sending caching headers entirely.

vagrant@mediawiki-vagrant:~$ cat /var/www/w/foo.php 
<?php
session_cache_limiter( 'private, must-revalidate' );
session_start();

echo PHP_VERSION . "\n";
vagrant@mediawiki-vagrant:~$ curl -v 'http://127.0.0.1:8080/w/foo.php'
* Hostname was NOT found in DNS cache
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET /w/foo.php HTTP/1.1
> User-Agent: curl/7.35.0
> Host: 127.0.0.1:8080
> Accept: */*
> 
< HTTP/1.1 200 OK
< Date: Sun, 31 Jan 2016 17:22:39 GMT
* Server Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.14 is not blacklisted
< Server: Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.14
< X-Powered-By: HHVM/3.6.5
< Set-Cookie: PHPSESSID=afab60df6b331a4dc9761cf47a0ae1ea; path=/
< Vary: Accept-Encoding
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=utf-8
< 
5.6.99-hhvm
* Connection #0 to host 127.0.0.1 left intact

The Cache-Control headers we see seem to come from OutputPage or the like.

Hmm, this is strange:

GET /w/index.php?title=MediaWiki:Gadget-LanguageSelect.js&action=raw&ctype=text/javascript HTTP/1.1
Host: commons.wikimedia.org
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0 Iceweasel/44.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Cookie: GeoIP=REDACTED; CP=H2; WMF-Last-Access=31-Jan-2016; commonswikiSession=9aREDACTED; centralauth_User=Anomie+test+12; centralauth_Session=ccREDACTED; forceHTTPS=1
X-Wikimedia-Debug: 1
Connection: keep-alive

HTTP/1.1 200 OK
Server: nginx/1.9.4
Date: Sun, 31 Jan 2016 19:37:22 GMT
Content-Type: text/javascript; charset=UTF-8
Content-Length: 3751
x-powered-by: HHVM/3.6.5
x-content-type-options: nosniff
p3p: CP="This is not a P3P policy! See https://commons.wikimedia.org/wiki/Special:CentralAutoLogin/P3P for more info."
Content-Encoding: gzip
Vary: Accept-Encoding
Set-Cookie: commonswikiSession=qfREDACTED; path=/; secure; httponly
Set-Cookie: forceHTTPS=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=commons.wikimedia.org; httponly
Set-Cookie: forceHTTPS=true; path=/; httponly
Set-Cookie: centralauth_User=Anomie+test+12; expires=Tue, 01-Mar-2016 19:37:22 GMT; Max-Age=2592000; path=/; domain=commons.wikimedia.org; secure; httponly
Set-Cookie: centralauth_Session=8bREDACTED; path=/; domain=commons.wikimedia.org; secure; httponly
Set-Cookie: commonswikiSession=p9REDACTED; path=/; secure; httponly
Set-Cookie: centralauth_User=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=commons.wikimedia.org; secure; httponly
Set-Cookie: centralauth_Session=deleted; expires=Thu, 01-Jan-1970 00:00:01 GMT; Max-Age=0; path=/; domain=commons.wikimedia.org; secure; httponly
Last-Modified: Tue, 26 Jan 2016 02:56:41 GMT
x-varnish: 3734641595, 973222538
via: 1.1 varnish, 1.1 varnish
Accept-Ranges: bytes
Age: 0
x-cache: cp1052 pass+chfp(0), cp1066 frontend pass+chfp(0)
strict-transport-security: max-age=31536000
x-analytics: WMF-Last-Access=31-Jan-2016;https=1
x-client-ip: REDACTED
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
X-Firefox-Spdy: 3.1
  1. The Vary header is wrong. It looks like RawAction.php doesn't bother to set the proper header. That could be a problem if the response gets cached, since it won't vary on the session cookies.
  2. There's two "commonswikiSession" Set-Cookie headers. After doing some debugging, it looks like the "failed to create empty session: Session ID already exists" message is because CA is failing the check from T124409:
    1. User doesn't exist locally so CentralAuthSessionProvider returns it to be auto-created.
    2. So it auto-creation adds it to the database.
    3. Something from the 'AuthPluginAutoCreate' hook tries to access the Session, before CA's $wgAuth->initUser() gets a chance to attach it. Since it exists locally but isn't attached, the session fails to load, which then sets up a new empty session on the request.
    4. In this case something is also persisting that empty session, but it's possible that in some cases it won't do so.
  3. Thanks to the new empty session from #2, RawAction.php doesn't see a logged-in user. This should be making it return Cache-Control: public, s-maxage=300, maxage=2678400 to varnish (templates/varnish/text-frontend.inc.vcl.erb rewrites it to the private version we see here). The lack of a correct Vary from #1 might then get varnish to decide to serve that cached response with set-cookies to someone else.

Note to reproduce this particular query it needs an edge login with the pixel images under wmf.10 to set the CA cookies on Commons, then hit the URL under wmf.11 to auto-create.

I don't know if there might be any other way to exploit the RawAction thing in wmf.10, so

for #1. Since SessionManager is currently not deployed again, https://gerrit.wikimedia.org/r/267610 for #2.

That would make sense. Setup.php calls session_cache_limiter( 'private, must-revalidate' ) if the session is persistent, but that's not the case for autocreation.

The more I look at it, the more it seems session_cache_limiter( 'private, must-revalidate' ) seems equivalent to session_cache_limiter( '' ) or any other unrecognized value in that it disables PHP's session handling from sending caching headers entirely.

For those following along that weren't involved in other patches, I put the session_cache_limiter( 'private, must-revalidate' ) back in rMW7491b52f700e: Call session_cache_limiter() before starting a session to eliminate the Pragma: no-cache header that that was causing T124510: Page preview response not being cached for history navigation in 1.27.0-wmf11. @Anomie pointed out during review that 'private, must-revalidate' was not a listed value at php.net, but since this was the configuration used in 1.27.0-wmf.10 and prior we decided to keep using it.

Based on the php.net documentation it looks like we might want to use session_cache_limiter( 'private' ) to get Expires, Cache-Control, and Last-Modified headers. We certainly don't want session_cache_limiter( 'nocache' ) as that introduces the Pragma: no-cache header that messed with the back button in Firefox.

If we do decide to go with a value that sets real headers, we should also make at least CookieSessionProvider (if not SessionBackend itself) set the same headers itself when persisting in case $wgPHPSessionHandling is off.

OTOH, MediaWiki already sets the same headers itself with more logic behind it and has been for some time now (I don't know whether private, must-revalidate ever worked in some old version of PHP, but a quick search through PHP's git doesn't look like it). So we might want to just leave well enough alone there.

In that case we should use session_cache_limiter( '' ) which (apparently) does the same thing but is more explicit and documented on php.org.

I don't know if there might be any other way to exploit the RawAction thing in wmf.10, so

for #1. Since SessionManager is currently not deployed again, https://gerrit.wikimedia.org/r/267610 for #2.

Marked as blocker of T124940: MediaWiki 1.26.3 security release for getting

into the next security release.

Did we find any further users affected by this?
Is anybody investigating this further, or is this de-facto stalled?
Wondering how to proceed.

Did we find any further users affected by this?

No.

Is anybody investigating this further, or is this de-facto stalled?
Wondering how to proceed.

No one knows what might have caused it, and it hasn't recurred (or if it has no one has reported it). Maybe I44096c69 fixed it, or maybe something else entirely. The stuff in T125283#1985224 probably wasn't a cause, since our varnish config refuses to cache anything with Set-Cookie headers.

We fished for further reports in T126069/T126074 and that did not uncover anything.

Just to make sure everything is deployed here:

  • It looks like has been merged publicly, so it's no longer in the security patches on tin
  • is not currently deployed on the cluster-- should that be deployed?
  • is not currently deployed on the cluster-- should that be deployed?

Yes, it probably should be.

New version of the RawAction patch:

New version of the RawAction patch:

This is now in /srv/patches on tin as 04-T125283.patch and applied to the .13 and .14 branches.

Tgr renamed this task from Logged in as Schexnayder on commonswiki to Users occasionally logged in as different users after SessionManager deployment.Feb 24 2016, 6:22 AM
Tgr updated the task description. (Show Details)
Tgr updated the task description. (Show Details)
Tgr added a subscriber: ArielGlenn.

from grepping logs provided by Giuseppe for anomie:

2016-02-23T02:43:42 user 023yangbo mucking about on zh wiki, last entry for awhile
2016-02-23T04:12:46 FallingGravity starts work from laptop
2016-02-23T04:19:30 FallingGravity starts work from desktop
2016-02-23T04:24:04 FallingGravity's last access from laptop before closing it
2016-02-23T04:52:04 023yangbo first entry of a new series of accesses on zh wiki (same ip as earlier)
2016-02-23T04:55:18 FallingGravity ends work on desktop
2016-02-23T05:00:39 023yangbo last entry in their second series of accesses
2016-02-23T06:03:59 FallingGravity opens laptop and has the 023yangbo user
2016-02-23T06:17:04 FallingGravity's next 'good' access from desktop

When looking at common urls between 023yangbo and Falling Gravity on the laptop, the only thing I saw was

http://meta.wikimedia.org/w/index.php?title=Special:BannerLoader&campaign=2016+Steward+Elections&banner=stewvote&uselang=zh-cn&debug=false
at 2016-02-23T04:52:16 and 2016-02-23T04:59:35 by 023yangbo, and

http://meta.wikimedia.org/w/index.php?title=Special:BannerLoader&campaign=2016+Steward+Elections&banner=stewvote&uselang=en&debug=false
at 2016-02-23T01:53:46, also later on the desktop at 2016-02-23T04:19:30

but note uselang variance.

Anything else people want me to scry, let me know; anomie did some in depth looking earlier and may have more to say.

Does anybody plan to investigate more, or are we just stuck with all potential reasons already covered?
Asking as this is still open with Unbreak now priority and without an assignee...

We had talked about doing this: https://gist.github.com/MaxSem/9b2db79237b90f550a6f with logging, I think @bd808 had mentioned turning this into a real patch but I don't see that in gerrit anywhere.

Does anybody plan to investigate more, or are we just stuck with all potential reasons already covered?

The question is "investigate how?". If there were any way to reproduce it, or any indication it's happening with any regularity, or even any way to detect server-side when it happens (T125455: Track ip addresses associated with a session and log when anomalous was tried, but wasn't very helpful), I'd investigate it. But as things stand, I have no idea how I might proceed.

We had talked about doing this: https://gist.github.com/MaxSem/9b2db79237b90f550a6f with logging, I think @bd808 had mentioned turning this into a real patch but I don't see that in gerrit anywhere.

rMW8b413431d760: Guard against allowing intermediate caching when cookies are present, and also rMWb84fae0173b2: Use header_register_callback to avoid caching responses with Set-Cookie headers.

Tgr lowered the priority of this task from Unbreak Now! to Medium.Mar 9 2016, 7:35 PM

Given that this hasn't happened for several weeks, and multiple plausible defense mechanisms were added, I think we can lower the priority.

Tgr changed the task status from Open to Stalled.Mar 9 2016, 7:35 PM

Gotcha. Thanks for the updates everybody! Appreciated.

csteipp claimed this task.

It's been many weeks, and we haven't seen this come up again. I think we can call this resolved? We'll plan to release the patch with the next security release, unless there's a reason to keep this open longer.

@Anomie, I'm working on backporting

to REL1_26, REL1_25, REL1_23. Should $wgUseKeyHeader also be backported to includes/DefaultSettings.php or should the conditional in RawAction.php be modified not to check for "UseKeyHeader"? If the config param should be backported, where should the changelog note go?

rMW7e20cb51fe19: Replace XVO with support for the Key HTTP header effectively renamed $wgUseXVO to $wgUseKeyHeader as part of changing from our old custom "X-Vary-Options" header to the very similar "Key" header that has been proposed as a new standard. The simplest change to the patch would be to just use $wgUseXVO for older MediaWiki, and change the call to ->getKeyHeader() on the next line to ->getXVO() to match.

@Anomie, thanks. That was vary helpful.

Patches:




demon changed the visibility from "Custom Policy" to "Public (No Login Required)".May 20 2016, 5:26 PM
demon changed the edit policy from "Custom Policy" to "All Users".
demon changed Security from Software security bug to None.

T125283-REL1_23.patch broke MediaWiki 1.23's raw action because it uses $wgUseXVO in the onView() function where it is not yet declared. This may result in a PHP Notice which breaks things served via Raw action (e.g. the JavaScript we store and serve from MediaWiki).

I suggest changing
global $wgSquidMaxage, $wgForcedRawSMaxage;
to
global $wgSquidMaxage, $wgForcedRawSMaxage, $wgUseXVO;
at the start of the onView() function.