Page MenuHomePhabricator

Enable use of session storage service in MediaWiki
Open, NormalPublic2 Story Points

Description

Implement storage of MediaWiki sessions in the session storage service.

See: RESTBagOStuff
See also: T210717: Find an alternative to HHVM curl connection pooling for PHP 7

Event Timeline

Eevans triaged this task as Normal priority.Feb 7 2019, 4:58 PM
Eevans created this task.
Restricted Application removed a project: Patch-For-Review. · View Herald TranscriptFeb 7 2019, 4:58 PM
Clarakosi moved this task from Backlog to Next on the User-Clarakosi board.Feb 8 2019, 5:43 PM

Should this task go to @BPirkle now?

Eevans reassigned this task from Eevans to BPirkle.Mar 15 2019, 3:54 PM

Should this task go to @BPirkle now?

Yeah.

Relevant documentation: https://www.mediawiki.org/wiki/Requests_for_comment/SessionStorageAPI
Development instance is at deployment-sessionstore01.deployment-prep.eqiad.wmflabs

I have, on my local development wiki, a working KaskBagOStuff (naming suggestions welcome) that communicates with the development instance at deployment-sessionstore01.deployment-prep.eqiad.wmflabs (via some tricky port forwarding) and successfully allows basic session management for happy login and logout.

I reviewed the following pages, which have some wide-ranging discussion:

To make sure I understand the conclusions, I have a few questions:

  1. ERRORS: there is mention of using the more descriptive error information from the API. Within the context of SessionManager/BagOStuff/MultiHttpClient, does that mean adding that to the existing logging output? Or is there somewhere else that detailed error information should appear?
  2. FORMAT: does it matter what format I store the structured session data in? I saw mention of JSON, and that's what my exploratory code uses right now. I don't think the API/service really cares (that's the beauty of opaque key/value storage). But with the extensive discussion of future key/map stores, including the possibility of someday setting fields within sessions, etc. I'd like to confirm JSON is acceptable/desirable for the implementation I'm writing now.
  3. CHECK-AND-SET: BagOStuff/RESTBagOStuff has CAS support, but there's a FIXME tag related to the $casToken in RESTBagOStuff::doGet(). Is this something I need to dig into, or is the existing behavior sufficient for everything that will use the new service?
  4. COLLATERAL DAMAGE: all I've tested is login and logout, and that's just locally on my MediaWiki vagrant. I'm concerned there are major things I'm missing. I see discussion about CentralAuth and OAuth, but I'm frankly clueless about how/if these are affected by what I'm doing. What might I be breaking that I need to be concerned with?
  5. SET-IF-NOT-EXISTS: the API uses POST (rather than RESTBagOStuff's PUT) for sets, so my KaskBagOStuff uses POST as well. Per https://www.mediawiki.org/wiki/Requests_for_comment/SessionStorageAPI#Notes, I will do nothing with PUT at this time, and we will not initially have set-if-not-exists support. This isn't really a question, I just wanted to explicitly state it.
  6. TTL: similarly, there is significant discussion of per-write TTLs, probably via a Cache-Control header, but the notes say this is for the future and we will initially not have per-write TTLs.

I think most of these are probably best addressed by someone familiar with session management in MediaWiki, but...

I have, on my local development wiki, a working KaskBagOStuff (naming suggestions welcome) that communicates with the development instance at deployment-sessionstore01.deployment-prep.eqiad.wmflabs (via some tricky port forwarding) and successfully allows basic session management for happy login and logout.

Out of curiosity, what is it that prevents you from using RESTBagOStuff?

I reviewed the following pages, which have some wide-ranging discussion:

To make sure I understand the conclusions, I have a few questions:

  1. ERRORS: there is mention of using the more descriptive error information from the API. Within the context of SessionManager/BagOStuff/MultiHttpClient, does that mean adding that to the existing logging output? Or is there somewhere else that detailed error information should appear?
  2. FORMAT: does it matter what format I store the structured session data in? I saw mention of JSON, and that's what my exploratory code uses right now. I don't think the API/service really cares (that's the beauty of opaque key/value storage). But with the extensive discussion of future key/map stores, including the possibility of someday setting fields within sessions, etc. I'd like to confirm JSON is acceptable/desirable for the implementation I'm writing now.
  3. CHECK-AND-SET: BagOStuff/RESTBagOStuff has CAS support, but there's a FIXME tag related to the $casToken in RESTBagOStuff::doGet(). Is this something I need to dig into, or is the existing behavior sufficient for everything that will use the new service?
  4. COLLATERAL DAMAGE: all I've tested is login and logout, and that's just locally on my MediaWiki vagrant. I'm concerned there are major things I'm missing. I see discussion about CentralAuth and OAuth, but I'm frankly clueless about how/if these are affected by what I'm doing. What might I be breaking that I need to be concerned with?
  5. SET-IF-NOT-EXISTS: the API uses POST (rather than RESTBagOStuff's PUT) for sets, so my KaskBagOStuff uses POST as well. Per https://www.mediawiki.org/wiki/Requests_for_comment/SessionStorageAPI#Notes, I will do nothing with PUT at this time, and we will not initially have set-if-not-exists support. This isn't really a question, I just wanted to explicitly state it.

I think I mentioned this before, but I'll leave it here for posterity sake. Cassandra does have proper CAS support (via paxos), and so implementing this in Kask for session storage is doable. That said, it comes at a cost (extra round trips which limit throughput and increase latency) and so I'd like to avoid implementing this in Kask unless we need it. During the RFC process there seemed to be consensus that the read-modify-write that BagOStuff does in this context was safe, if that's the case, I think it'd be preferable to continue doing that. It might be beneficial though to double-check that assumption.

  1. TTL: similarly, there is significant discussion of per-write TTLs, probably via a Cache-Control header, but the notes say this is for the future and we will initially not have per-write TTLs.

During the RFC process, I took away that we needed exactly two TTL periods, one for sessions, and one for central auth...state (if that's the write word), where the latter needed to be somewhat less than the former to ensure correctness. If this is correct (and again, it would be good to double-check this assumption), then we could support this by deploying a separate instance of Kask for central auth (I think there is an argument for this being a Good Idea™ anyway).

Again, I'd only push back on implementing TTL support in the event we don't really need it, but this is a far smaller deal than with CAS above. If it would be at all helpful, let me know and we can add it straightaway. And, for what it's worth, it has been suggested that if we do support TTLs, that we require them to be less-than-or-equal-to the configured default.

I have, on my local development wiki, a working KaskBagOStuff (naming suggestions welcome) that communicates with the development instance at deployment-sessionstore01.deployment-prep.eqiad.wmflabs (via some tricky port forwarding) and successfully allows basic session management for happy login and logout.

Out of curiosity, what is it that prevents you from using RESTBagOStuff?

RESTBagOStuff is hard-coded to use PUT, while the API uses POST. Also, RESTBagOStuff is hard-coded to use PHP serialization for the value data, which I'm guessing nobody is in favor of. (Well, Joe wanted "json storing base64-encoded serialized php"... <grin>)

I rearranged things a bit so that RESTBagOStuff and KaskBagOStuff share almost all their code, except overrides for the essential differences, so there's not really going to be a lot of new code for this, mostly just some shuffling.

TTL: similarly, there is significant discussion of per-write TTLs, probably via a Cache-Control header, but the notes say this is for the future and we will initially not have per-write TTLs.

During the RFC process, I took away that we needed exactly two TTL periods, one for sessions, and one for central auth...state (if that's the write word), where the latter needed to be somewhat less than the former to ensure correctness. If this is correct (and again, it would be good to double-check this assumption), then we could support this by deploying a separate instance of Kask for central auth (I think there is an argument for this being a Good Idea™ anyway).
Again, I'd only push back on implementing TTL support in the event we don't really need it, but this is a far smaller deal than with CAS above. If it would be at all helpful, let me know and we can add it straightaway. And, for what it's worth, it has been suggested that if we do support TTLs, that we require them to be less-than-or-equal-to the configured default.

Do the existing configuration variables allow us to point session storage at one Kask instance and Central Auth at a different instance, or is there a bit more work for me to do there? (Not sure who this question is pointed at, but someone is bound to know...)

I have, on my local development wiki, a working KaskBagOStuff (naming suggestions welcome) that communicates with the development instance at deployment-sessionstore01.deployment-prep.eqiad.wmflabs (via some tricky port forwarding) and successfully allows basic session management for happy login and logout.

Out of curiosity, what is it that prevents you from using RESTBagOStuff?

RESTBagOStuff is hard-coded to use PUT, while the API uses POST. Also, RESTBagOStuff is hard-coded to use PHP serialization for the value data, which I'm guessing nobody is in favor of. (Well, Joe wanted "json storing base64-encoded serialized php"... <grin>)
I rearranged things a bit so that RESTBagOStuff and KaskBagOStuff share almost all their code, except overrides for the essential differences, so there's not really going to be a lot of new code for this, mostly just some shuffling.

AFAIK, RESTBagOStuff was written for this purpose (though quite some time ago, long before the RFC established what the API would be). I wonder if you can just...change it?

TTL: similarly, there is significant discussion of per-write TTLs, probably via a Cache-Control header, but the notes say this is for the future and we will initially not have per-write TTLs.

During the RFC process, I took away that we needed exactly two TTL periods, one for sessions, and one for central auth...state (if that's the write word), where the latter needed to be somewhat less than the former to ensure correctness. If this is correct (and again, it would be good to double-check this assumption), then we could support this by deploying a separate instance of Kask for central auth (I think there is an argument for this being a Good Idea™ anyway).
Again, I'd only push back on implementing TTL support in the event we don't really need it, but this is a far smaller deal than with CAS above. If it would be at all helpful, let me know and we can add it straightaway. And, for what it's worth, it has been suggested that if we do support TTLs, that we require them to be less-than-or-equal-to the configured default.

Do the existing configuration variables allow us to point session storage at one Kask instance and Central Auth at a different instance, or is there a bit more work for me to do there? (Not sure who this question is pointed at, but someone is bound to know...)

I'm curious about this as well.

WDoranWMF set the point value for this task to 2.Apr 15 2019, 3:23 PM
  1. ERRORS: there is mention of using the more descriptive error information from the API. Within the context of SessionManager/BagOStuff/MultiHttpClient, does that mean adding that to the existing logging output? Or is there somewhere else that detailed error information should appear?

My thoughts: for errors that seem to be retryable during a single Web page load (let's say... 100ms? Something like that), we should retry. Otherwise, logging the detailed errors and giving a generic end user error seems like the right step.

  1. FORMAT: does it matter what format I store the structured session data in? I saw mention of JSON, and that's what my exploratory code uses right now. I don't think the API/service really cares (that's the beauty of opaque key/value storage). But with the extensive discussion of future key/map stores, including the possibility of someday setting fields within sessions, etc. I'd like to confirm JSON is acceptable/desirable for the implementation I'm writing now.

I think JSON is the right format. It's well supported by lots of other languages and tools. If for some reason we need to unpack things with another process, it'd be good not to have to deal with PHP-only formats. I guess the big question is performance, maybe...?

  1. CHECK-AND-SET: BagOStuff/RESTBagOStuff has CAS support, but there's a FIXME tag related to the $casToken in RESTBagOStuff::doGet(). Is this something I need to dig into, or is the existing behavior sufficient for everything that will use the new service?

I need to look at how this is used for sessions already in the MW code. It seems to me that collisions are quite possible, but I don't know how bad they could be. Is this covered in T219526 ?

  1. COLLATERAL DAMAGE: all I've tested is login and logout, and that's just locally on my MediaWiki vagrant. I'm concerned there are major things I'm missing.

I talked through this with Will; we think there's some level of human QA necessary to get some ideas of what's going on. I count ~60 files in the MW codebase that call getSession() or use a $session variable. Ideally we'd have some automated tests for this, but I think we'll need a manual test script, too.

I see discussion about CentralAuth and OAuth, but I'm frankly clueless about how/if these are affected by what I'm doing. What might I be breaking that I need to be concerned with?

I don't have an answer on these.

  1. SET-IF-NOT-EXISTS: the API uses POST (rather than RESTBagOStuff's PUT) for sets, so my KaskBagOStuff uses POST as well. Per https://www.mediawiki.org/wiki/Requests_for_comment/SessionStorageAPI#Notes, I will do nothing with PUT at this time, and we will not initially have set-if-not-exists support. This isn't really a question, I just wanted to explicitly state it.

It doesn't make sense to use methods that aren't supported by the service.

  1. TTL: similarly, there is significant discussion of per-write TTLs, probably via a Cache-Control header, but the notes say this is for the future and we will initially not have per-write TTLs.

Is this captured in T219525?

Change 507207 had a related patch set uploaded (by BPirkle; owner: BPirkle):
[mediawiki/core@master] Add additional configuation parameters to RESTBagOStuff

https://gerrit.wikimedia.org/r/507207

First cut at a patch set posted. Per feedback, I made RESTBagOStuff configurable instead of creating a new KaskBagOStuff class. If we stay with this approach, we'll want to update a bunch of tickets that @EvanProdromou created (sorry).

One thing I want to make sure of is that KaskBagOStuff is compatible with the TLS setup defined in T209109, as well as supporting username/password HTTP Basic authentication.

Eevans added a comment.EditedMay 9 2019, 6:32 PM

One thing I want to make sure of is that KaskBagOStuff is compatible with the TLS setup defined in T209109, as well as supporting username/password HTTP Basic authentication.

To be totally clear:

Kask supports (optional) TLS, which at a minimum could be used to encrypt the connection, but it could also be used for certificate verification. The former obscures sensitive information over the wire, the latter could serve as a form of authentication. The former is simple from an operational perspective, the latter less so because it means dealing with an internal CA.

Kask does NOT currently support HTTP Basic authentication. It could, rather easily, it just doesn't at this time. The selling point in using it IMO, would be as a way of providing some minimal authentication, while being simpler to implement than signed certificates.

So the options as they stand right now are:

  • No TLS encryption, no authentication
  • TLS encryption, no authentication
  • TLS encryption, with verified signed certificates

If we implemented Basic Auth, we could add the following options to the list:

  • No TLS encryption, Basic Auth (exposes credentials over the wire)
  • TLS encryption, Basic Auth
  • TLS encryption with verified signed certificates and Basic Auth (I don't know why you'd want to, but you could...)

Personally, I had thought "TLS encryption + signed certs" was Best™, but that "TLS encryption + Basic Auth" was a reasonable compromise if SRE didn't have, or preferred not to use an internal CA. When I raised this with others (various meetings with either Security or SRE), no one seemed overly concerned one way or another, so I stopped short of implementing Basic Auth until I knew it was needed.

TL;DR At a minimum, the relevant BagOStuff should support whatever configuration we plan to use in production, and unfortunately, I don't think we know what that is.

@BPirkle perhaps you could chime in WRT to precedent (if there is any), or the degree of difficulty relative to each approach?

RESTBagOStuff internally uses MultiHttpClient for communications, which supports TLS + signed certs. I haven't actually tried this, but the code looks solid.

A configuration would look something like this:

$wgObjectCaches['sessions'] = [
	'class' => 'RESTBagOStuff',
	'url' => 'http://127.0.0.1:8080/sessions/v1/',
	'setMethod' => 'POST',
	'setHeaders' => [ 'content-type' => 'application/octet-stream' ],
	'extendedErrorBodyFields' => [ 'type', 'title', 'detail', 'instance' ]
	'caBundlePath' => '/path/to/cert
];
$wgSessionCacheType = 'sessions';

See https://curl.haxx.se/libcurl/c/CURLOPT_CAINFO.html for more info.
Once I finally finish T202352, this will become relevant as well: http://docs.guzzlephp.org/en/stable/request-options.html#verify
There shouldn't be any change in behavior when we merge T202352, as Guzzle uses curl internally, but I mention both for completeness.

MultiHttpClient (and therefore RESTBagOStuff) do not currently support Basic Auth. It wouldn't be hard to add that:

https://curl.haxx.se/libcurl/c/CURLOPT_USERPWD.html
http://docs.guzzlephp.org/en/stable/request-options.html#verify

However, if there's no compelling need to do Basic Auth, I'd vote to support the option that it sounds like both RESTBagOStuff and Kask already support.

[ ... ]
However, if there's no compelling need to do Basic Auth, I'd vote to support the option that it sounds like both RESTBagOStuff and Kask already support.

No argument from me, but again, we should see if SRE are OK with this. I'm not sure if we have a suitable Cert Authority for this, or the tools and process to gracefully handle renewal, (or the will to obtain and use either :)).

Change 507207 merged by jenkins-bot:
[mediawiki/core@master] Add additional configuation parameters to RESTBagOStuff

https://gerrit.wikimedia.org/r/507207