User authentication security issue (Oct 1, 2020)
Closed, Resolved, Public

Assigned To

None

Authored By

	Tgr
	Oct 1 2020, 10:00 PM

Description

This is a public placeholder task for the Oct 1 security issue related to user authentication. The investigation is ongoing; details will be shared later. So far we have seen no indication that the issue was intentional or widespread. Out of an abundance of caution, we have logged out all users (at 21:48 UTC).

Related Objects

		Status	Subtype	Assigned	Task
		Resolved		None	T264370 User authentication security issue (Oct 1, 2020)
					Restricted Task

Event Timeline

Restricted Application added a subscriber: Aklapper.Oct 1 2020, 10:00 PM

Reedy added a project: WMF-General-or-Unknown.Oct 1 2020, 10:02 PM

DannyS712 subscribed.Oct 1 2020, 10:02 PM

Xaosflux subscribed.Oct 1 2020, 10:14 PM

Urbanecm added a parent task: T263177: 1.36.0-wmf.11 deployment blockers.Oct 1 2020, 10:18 PM

Urbanecm subscribed.Oct 1 2020, 10:29 PM

JJMC89 subscribed.Oct 1 2020, 10:30 PM

LuchoCR subscribed.Oct 1 2020, 10:31 PM

Davey2010 subscribed.Oct 1 2020, 10:48 PM

Daimona subscribed.Oct 1 2020, 10:56 PM

Thibaut120094 subscribed.Oct 1 2020, 10:59 PM

Were people really logged out at 7:43 UTC? Judging from comments on en-wiki the log out was from 21:43 UTC

In T264370#6510885, @DuncanHill wrote:

Were people really logged out at 7:43 UTC? Judging from comments on en-wiki the log out was from 21:43 UTC

Updated the time.

Meirae subscribed.Oct 2 2020, 12:13 AM

Apap04 subscribed.Oct 2 2020, 12:19 AM

https://lists.wikimedia.org/pipermail/wikitech-l/2020-October/093922.html

Mholloway subscribed.Oct 2 2020, 3:59 AM

Amitie_10g subscribed.Oct 2 2020, 4:22 AM

I use 2FA for Wikimedia sites and logged in again successfully. Is it recommended to change the password anyway?

taavi subscribed.Oct 2 2020, 5:02 AM

RhinosF1 subscribed.Oct 2 2020, 5:46 AM

Kaartic subscribed.Oct 2 2020, 5:53 AM

Marostegui subscribed.Oct 2 2020, 7:51 AM

Nintendofan885 subscribed.Oct 2 2020, 8:51 AM

Peachey88 subscribed.Oct 2 2020, 9:16 AM

In T264370#6511175, @Amitie_10g wrote:

I use 2FA for Wikimedia sites and logged in again successfully. Is it recommended to change the password anyway?

See https://lists.wikimedia.org/pipermail/wikitech-l/2020-October/093922.html which does not say that people should change passwords - thanks.

kostajh subscribed.Oct 2 2020, 10:03 AM

LSobanski subscribed.Oct 2 2020, 2:00 PM

Was the release mentioned rolled back before the logout was done?

I mean, if the problem happens when people log in, then to force *everyone* to re-login is to call for more instances of the problem to happen…

I see T264363, but it is too technical for me to follow.

RodneyAraujo subscribed.Oct 2 2020, 3:27 PM

The logout occurred on all Wikimedia projects.

Mholloway unsubscribed.Oct 2 2020, 3:33 PM

In T264370#6512836, @Base wrote:

Was the release mentioned rolled back before the logout was done?

Yes.

I mean, if the problem happens when people log in, then to force *everyone* to re-login is to call for more instances of the problem to happen…

I see T264363, but it is too technical for me to follow.

We are not sure if T264363 is the issue - it may be unrelated.

Urbanecm added a comment.Oct 2 2020, 3:39 PM

This comment was removed by Urbanecm.

Isn't this the third or fourth time everyone has been forcibly logged out in the past year? How is this acceptable? Why does this keep happening and who's taking responsibility to ensure that it stops happening?

Avicennasis subscribed.Oct 2 2020, 5:46 PM

PiRSquared17 subscribed.Oct 2 2020, 6:20 PM

SuperiorWalrus subscribed.Oct 2 2020, 6:20 PM

Omar_sansi subscribed.Oct 2 2020, 6:26 PM

pierreneter subscribed.Oct 2 2020, 7:41 PM

The undeniable truth of software engineering and computer security is that bugs happen. While techniques like automated testing and code review are used to try to prevent bugs from entering production, and deployment stages are used to minimize the impact of new bugs, those techniques are not capable of preventing every bug (and thus, every security issue) from hitting production servers. This is because all code is written by humans and humans are imperfect.

Trying to eliminate all security problems is a noble goal, but it is also a futile one. Instead, it is much more important to ensure that there is an effective response to those security problems once they are discovered. The forced logouts are an example of that plan in action. The only ways to prevent them would be to expect perfection from human developers, to stop developing and improving MediaWiki, or to stop adequately responding to security incidents. Those are all worse options.

We don't yet know what caused this specific incident. Only once that stage of the investigation is complete can we begin to discuss how similar issues might be prevented in the future. All indications at the moment, however, are that it is unrelated to the previous session caching issues in June.

This task is the public-facing task for the user authentication issue. It is blocking the train and is thus an unbreak now priority. I am marking it stalled pending resolution of the issue, which is tracked internally in the private task T264369.

Ciencia_Al_Poder subscribed.Oct 5 2020, 9:42 AM

Michael subscribed.Oct 5 2020, 12:54 PM

DanInCPT awarded a token.Oct 5 2020, 6:02 PM

hashar mentioned this in T263178: 1.36.0-wmf.12 deployment blockers.Oct 5 2020, 9:01 PM

In T264370#6513972, @AntiCompositeNumber wrote:

The undeniable truth of software engineering and computer security is that bugs happen. While techniques like automated testing and code review are used to try to prevent bugs from entering production, and deployment stages are used to minimize the impact of new bugs, those techniques are not capable of preventing every bug (and thus, every security issue) from hitting production servers. This is because all code is written by humans and humans are imperfect.

Trying to eliminate all security problems is a noble goal, but it is also a futile one. Instead, it is much more important to ensure that there is an effective response to those security problems once they are discovered. The forced logouts are an example of that plan in action. The only ways to prevent them would be to expect perfection from human developers, to stop developing and improving MediaWiki, or to stop adequately responding to security incidents. Those are all worse options.

These are very nice words, but none of them answer the questions I asked. When the "emergency door" is being used regularly, with widespread user-facing impact, it warrants a serious and thorough investigation into why we're repeatedly having issues that require such drastic preventative action.

I woke up on Friday morning to a text from a Wikipedia administrator who couldn't remember his password. Because he was on a school IP address that had been blocked, he was also offered no option to reset his password via e-mail. Wikimedia wikis already face significant challenges attracting and retaining contributors; forcibly logging all of them out is disruptive and a big deal.

We don't yet know what caused this specific incident. Only once that stage of the investigation is complete can we begin to discuss how similar issues might be prevented in the future. All indications at the moment, however, are that it is unrelated to the previous session caching issues in June.

Who specifically is leading this investigation you reference?

ema mentioned this in T264378: ATS-BE Lua mitigations for cacheable responses w/ Set-Cookie seemingly not working.Oct 6 2020, 8:16 AM

matmarex subscribed.Oct 6 2020, 8:18 PM

In T264370#6519622, @MZMcBride wrote:

Who specifically is leading this investigation you reference?

Hi, the issue is related to user authentication, which is a highly sensitive matter and unfortunately cannot be discussed publicly. Rest assured it is being investigated by multiple people, since it involves multiple layers of the stack.

Your other questions are more general and would be better asked in another venue (a mailing list or another task). Thanks!

Universal_Omega subscribed.Oct 7 2020, 12:19 AM

Agusbou2015 subscribed.Oct 7 2020, 12:32 PM

Amorymeltzer subscribed.Oct 9 2020, 2:22 PM

Lowering priority as new logging is in place and this is no longer a blocker of 1.36.0-wmf.11; the latter has been re-deployed for all wikis.

dduvall removed a parent task: T263177: 1.36.0-wmf.11 deployment blockers.Oct 13 2020, 7:57 PM

BEANS-X2 subscribed.Nov 2 2020, 8:01 AM

ema closed subtask Restricted Task as Invalid.Nov 24 2020, 9:33 AM

Tgr changed the status of subtask Restricted Task from Invalid to Resolved.Nov 24 2020, 10:00 AM

Resolving for now per T264369#6644444.

sbassett removed sbassett as the assignee of this task.Dec 16 2020, 3:59 PM

Aklapper renamed this task from User authentication security issue (Oct 1) to User authentication security issue (Oct 1, 2020).Dec 16 2020, 4:02 PM

The task description states "Investigation is ongoing, details will be shared later."

I see nothing in https://wikitech.wikimedia.org/wiki/Incident_documentation yet. If this issue has been sufficiently mitigated so that the relevant tasks can be marked resolved, now would be the time to begin publicly sharing information about the issue.

In T264370#6696925, @AntiCompositeNumber wrote:

The task description states "Investigation is ongoing, details will be shared later."

I see nothing in https://wikitech.wikimedia.org/wiki/Incident_documentation yet. If this issue has been sufficiently mitigated so that the relevant tasks can be marked resolved, now would be the time to begin publicly sharing information about the issue.

Yes, indeed, that is usually the process we follow for security issues. The public task (this one, T264370) is usually just a placeholder, while the real work and investigation are captured in a separate private task (since it potentially holds personal information, might pose a threat to the infrastructure, or might leak countermeasures to the attacker(s)).

The same applies to the incident documentation. Alongside the private Phabricator task we also open a private document, which is nicer to format; such a document has been created and polished. Unfortunately, for the same reasons as the private task, we cannot share it.

What I can say is that the actual root cause has not been identified. We went through a lot of history and could not find any other occurrence. As a result of the investigation we have enabled massive logging on the server-side infrastructure which, if the issue occurs again, would give us ample details and should let us pinpoint the actual root cause.

In short: we haven't found the cause, it was a one-time occurrence, and we have added massive logging on the infrastructure to help analysis if it occurs again. There is unfortunately not much more to say.

Just to (belatedly) close the loop here, the likely cause was found and fixed two months later. Details are in T274514 (restricted task).

Tgr mentioned this in T292812: Every UserLogin visit generates "Persisting session for unknown reason" entry in Logstash.Oct 9 2021, 2:06 AM

In T264370#7414132, @Tgr wrote:

Just to (belatedly) close the loop here, the likely cause was found and fixed two months later. Details are in T274514 (restricted task).

Could information now be shared?

Private tasks aren't useful for most of us.

Zabe subscribed.Oct 9 2021, 10:09 AM

In T264370#7414321, @RhinosF1 wrote:

In T264370#7414132, @Tgr wrote:

Just to (belatedly) close the loop here, the likely cause was found and fixed two months later. Details are in T274514 (restricted task).

Could information now be shared?

Private tasks aren't useful for most of us.

Ping @Dsharpe / WMF-Legal for advice on this. T274514 likely cannot ever be made public since it's currently PermanentlyPrivate, and for good reason, but we may be able to add NDA'd or other trusted Phabricator users to the task.

What about just sharing a technical description of the issue here, without private or personal information?

sbassett added a project: Security-Team.Oct 13 2021, 4:04 PM

I don't know what information from T274514 can be shared, but the two public patches associated with that task will probably answer most of your questions.

Tgr mentioned this in T348206: Improve logging, monitoring and test coverage for MediaWiki Platform team authentication extensions.Oct 9 2023, 4:00 AM
