Decouple chronology protector from authentication
Closed, Resolved · Public

Description

Services like RESTBase currently assume that reads from the API are not significantly affected by slave lag. We can't easily use the chronology protector to avoid incorrect results, as this would mix authentication with chronology protection. This could lead to protected or otherwise personalized content being returned on a public wiki, which in turn could cause information leaks in RESTBase and caches.

One option for addressing this issue might be to decouple the chronology protector from authentication, perhaps by using separate cookies. This way, services like RESTBase could forward the chronology protector cookie without also enabling authentication in backends.

Example use cases

  • A user hits 'edit' using VE right after saving an article, and before the MySQL slave has caught up. When replication lag is elevated, this causes Parsoid parse failures, as the MW API returns 'revision not found'.

Event Timeline

GWicke updated the task description.
GWicke raised the priority of this task to Needs Triage.
GWicke added subscribers: GWicke, aaron.
Restricted Application added a subscriber: Aklapper. Sep 2 2015, 10:18 PM
GWicke set Security to None. Sep 2 2015, 10:19 PM
GWicke added a project: Services.
GWicke removed a subscriber: Aklapper.
GWicke added a subscriber: ori.
GWicke added a subscriber: csteipp.
GWicke updated the task description. Sep 28 2015, 10:53 PM
GWicke updated the task description.

To avoid side-channel leakage of whether a user is active, any header used to enable ChronologyProtector for a user needs to use proper tokens.

It seems like adding a special HMAC token to ResourceLoaderUserTokensModule could do the trick. When the JS runs to fetch the new HTML+RDFa after the first edit, it can request a CP token and use it as a header in the request to the Parsoid endpoint. Parsoid could relay this to MediaWiki, activating CP for whatever master position the user is at.

GWicke added a comment. Edited Sep 28 2015, 11:15 PM

https://github.com/firebase/php-jwt seems to be a PHP implementation that has seen some amount of scrutiny.

Edit: @Legoktm mentioned that @csteipp already reviewed php-jwt, and we're using it for OAuth and ContentTranslation.

Edit 2: T106762: Security review for firebase/php-jwt

aaron claimed this task.
aaron renamed this task from [multi-dc] Decouple chronology protector from authentication to Decouple chronology protector from authentication. Nov 2 2015, 8:01 PM
Krinkle triaged this task as Normal priority. Nov 5 2015, 12:23 AM

Change 247178 had a related patch set uploaded (by Aaron Schulz):
Decouple ChronologyProtector from user sessions

https://gerrit.wikimedia.org/r/247178

aaron moved this task from Doing to Blocked on the Performance-Team board. Nov 12 2015, 8:04 PM

Change 247178 merged by jenkins-bot:
Decouple ChronologyProtector from user sessions

https://gerrit.wikimedia.org/r/247178

aaron added a comment. Edited Dec 1 2015, 8:23 PM

As long as restbase forwards the proper XFF headers (adding the CDN one to the list and sending it to api.php), and $wgSquidServersNoPurge covers parsoid IPs, then slave waits will automatically happen.

One can opt out of CP via the header ChronologyProtection: false (or by not sending XFF).

As long as restbase forwards the proper XFF headers

It forwards the x-client-ip header provided by Varnish. For good measure (and until the MW API switches to using that header), we set the XFF header to the same value when contacting the MW API.
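The forwarding described above could look roughly like this. A hypothetical sketch, assuming the service builds its outgoing api.php headers from the headers Varnish provided; the function name and options are illustrative:

```javascript
// Build the headers for an api.php request so MediaWiki's
// ChronologyProtector can key on the original client.
function buildApiHeaders(incoming, { chronology = true } = {}) {
  const headers = {
    // Varnish provides the real client IP in x-client-ip; mirror it
    // into XFF until the MW API switches to using x-client-ip itself.
    'x-forwarded-for': incoming['x-client-ip'],
    'user-agent': incoming['user-agent']
  };
  if (!chronology) {
    // Explicit opt-out, per the ChronologyProtection header above.
    headers['chronologyprotection'] = 'false';
  }
  return headers;
}
```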

@aaron, how does this work for requests from the job queue? We do have the timestamp of the revision in those job requests; could we use that for lag detection instead?

We'll also need to coordinate with the Parsoid folks on this, as they'll have to forward whatever RB forwards as well.

aaron added a comment. Dec 3 2015, 1:11 AM

@aaron, how does this work for requests from the job queue? We do have the timestamp of the revision in those job requests; could we use that for lag detection instead?

We'll also need to coordinate with the Parsoid folks on this, as they'll have to forward whatever RB forwards as well.

HTTP requests in jobs will have jobrunner IPs by default, so CP won't do much there (normally writes happen in Job::run() directly, where CP is disabled in rpc/RunJobs.php). If jobs do writes back to api.php, then the reads in job-based HTTP api requests will just slow down a bit, but CP would not be useful in that case either.

I think with jobs, you have to be clear about what changes you expect to see, given how the job was enqueued. One trick might be to enqueue jobs with a current master position and have some way of sending it to APIs via HTTP headers. This would only require sanity checks and maybe simple HMAC signing to avoid waiting on positions from the future or similarly bogus values. Replays or side-channel issues wouldn't apply in that case AFAIK, and user sessions would not be related to it. I can add such header support to MW if it's clear what's needed.
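The sanity check described above might look like this. A hypothetical sketch; the `ts`/`gtid` field names are assumptions for illustration, not MediaWiki's actual position format:

```javascript
// Decide whether a job-carried master position is safe to wait on.
// Rejecting positions with a timestamp in the future ensures a bogus
// header cannot make the API wait indefinitely on a position that
// replicas will never reach.
function acceptMasterPos(pos, nowSeconds = Date.now() / 1000) {
  if (typeof pos.ts !== 'number' || typeof pos.gtid !== 'string') {
    return false; // malformed position
  }
  return pos.ts <= nowSeconds;
}
```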

As for live requests on behalf of a user, we should confirm the XFF headers are being received and trusted by MW (maybe some simple automated test can do this).

aaron removed aaron as the assignee of this task. Jan 15 2016, 10:47 PM
GWicke renamed this task from Decouple chronology protector from authentication to Decouple chronology protector from authentication. Oct 26 2016, 4:30 PM
GWicke edited projects, added RESTBase, Parsoid, Services (watching); removed Services.
ssastry moved this task from Backlog to Non-Parsoid Tasks on the Parsoid board. Dec 5 2016, 5:41 PM
Krinkle added a subscriber: Krinkle. Jul 1 2017, 2:18 AM

@aaron @GWicke Is there something left on this task?

The job runner use case is presumably separate (it didn't have ChronologyProtector previously either, right?).

The use case of Parsoid and other services that make API requests to MediaWiki on behalf of a user sounds important. Did that work previously? Presumably it would if the user session cookies were forwarded (which Parsoid does). In the new system, we'd need the IP and User-Agent from the end-user.

If so, that leaves two tasks:

  • Parsoid to forward user-agent
  • MediaWiki to verify and honour the forwarded IP and User-Agent. MediaWiki already honours the IP from XFF, given we already do the same with Varnish, so presumably that part is easy. The User-Agent forwarding might require a non-standard header, given that forwarding it is a relatively uncommon thing to do, right?
Krinkle moved this task from Radar to Inbox on the Performance-Team board. Jul 1 2017, 2:18 AM
aaron moved this task from Inbox to Radar on the Performance-Team board. Jul 6 2017, 8:03 PM

Another option is using the existing 'ChronologyClientId' header that MediaWiki supports for acting on behalf of a client (e.g. no need to forward the agent/IP).

Krinkle closed this task as Resolved. Feb 9 2019, 2:14 AM
Krinkle assigned this task to aaron.

Closing as the decoupling has taken place. There is now no longer any tie between session management and ChronologyProtector.

For Parsoid, presumably it has either found a different way to make this work, doesn't need CP, or the issue is minor enough that we've not noticed it in the three years since. If and when issues of this kind arise, Aaron's comments show how the new system already accommodates this need, ready to be picked up as desired.