
Cookbook for centralised logouts and session status queries
Closed, Resolved, Public

Description

There are situations where we need to log out a user centrally and promptly, e.g. because a laptop was lost. With the adoption of Apereo CAS as our identity provider, access to most web services can be centrally revoked via Single Logout (SLO). There are however also use cases beyond CAS-enabled web services that we want to support in the future, e.g. terminating a user's SSH session or logging them off from ttys on the serial console. As such, we'll need a mechanism which is flexible and extensible.

The eventual design/setup will look like this:

  • We create a central service logout directory, e.g. /etc/wikimedia/logout.d, fleet-wide. This directory contains executable scripts (which can be written in any language; in the most common case, Python). They are managed with Puppet (or we could also not use recurse/purge and allow scripts shipped via debs as well). In 99% of all cases order won't matter, but we can establish that they are executed in alphabetical order and use a naming scheme like 50-foo, 50-bar, 99-baz.
  • Logout scripts expect the following "API":
# {} mutually exclusive
# [] optional
$0 {logout,query} --user <uid> --cn <cn> [all]

(Along with possible aliases like -u or -c)

Since some services operate on the UID and some on the CN/Wikimedia Developer Name, the cookbook will simply pass both and the script can use what it needs. logout returns 0 and no output if a user has been successfully logged out or if the user wasn't logged in to begin with. In case of an error, it returns a non-zero exit code and an arbitrary error message. query returns 1 if the user is logged in and 0 if not.

  • One central/initial logout script will be /etc/wikimedia/logout.d/cas, which will be present on the primary IDP host. Once executed by the cookbook, it'll detect the user's current CAS TGT and send an HTTP DELETE to https://idp-test.wikimedia.org/api/ssoSessions/$TGT, which logs out the user from all active CAS sessions. The CAS SLO request is "fire and forget", so if there are transient issues (like a 5xx or a brief network blip) on the hosts which are meant to be logged out, the request won't be retried. That's a conceptual issue and will happen rarely, but we can address it by running "query" again after the initial "logout" run and offering to re-run failed scripts.
  • Other services where the CAS logout doesn't work ATM (currently only LibreNMS) can deploy a /etc/wikimedia/logout.d/librenms which e.g. restarts Apache (which should terminate the session as well). In some cases we will need to terminate the user's session in mod_cas via the SLO call, but also need to void some internal state in the backend to really log off the user.
  • The cookbook would simply traverse /etc/wikimedia/logout.d/* with the "logout" action, passing the CN and UID. Initially it can simply run fleet-wide, but we can also make it smarter by preparing some targets depending on the level of access a user has (SREs with global access or e.g. researchers). In addition there would be a cookbook (or a flag to the logout one) which only runs "query", which detects where/if a user is currently logged in. This way we also decouple the cookbook from the service logouts (since these might change more often and we don't want to update the cookbook all the time).
  • Long term we can also use the logout script framework to provide logouts for individual services or fleet-wide, so that with something like https://idp-logout.wikimedia.org/grafana/$USER a user can call the SLO cookbook or trigger the service-specific logout (something for later with a more detailed design once the general logout logic is established).
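
For illustration, here is a minimal sketch of what a logout.d script following the "API" above could look like in Python. It is not one of the scripts from the patches in this task. The in-memory SESSIONS set and the is_logged_in / terminate_sessions helpers are hypothetical stand-ins for a real service backend, printing the error message to stderr and using exit code 2 for errors are assumptions, and the optional [all] argument is left out.

```python
#!/usr/bin/env python3
"""Hypothetical logout.d script skeleton; the session backend is a stand-in."""
import argparse
import sys

# Stand-in for a real service's session state (assumption for illustration).
SESSIONS = {"jdoe"}


def is_logged_in(uid):
    """Return True if the stand-in backend has a session for uid."""
    return uid in SESSIONS


def terminate_sessions(uid):
    """Drop the session for uid; a no-op if the user was not logged in."""
    SESSIONS.discard(uid)


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Example logout.d script")
    parser.add_argument("action", choices=("logout", "query"))
    parser.add_argument("-u", "--user", required=True, help="UID of the user")
    parser.add_argument("-c", "--cn", required=True, help="CN / developer name")
    return parser.parse_args(argv)


def run(argv):
    """Return the exit code described in the task for the given arguments."""
    args = parse_args(argv)
    if args.action == "query":
        # query: 1 if the user is logged in, 0 if not.
        return 1 if is_logged_in(args.user) else 0
    try:
        terminate_sessions(args.user)
    except Exception as error:  # report any backend failure as non-zero
        print("logout failed: {}".format(error), file=sys.stderr)
        return 2
    # logout: 0 and no output, also when the user was not logged in.
    return 0

# A deployed script would end with: sys.exit(run(sys.argv[1:]))
```

Such a script would be called as e.g. "50-example logout --user jdoe --cn Jdoe"; a service keyed on the CN would simply look at args.cn instead of args.user.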

Event Timeline

I think it would also be nice to have the ability to show information for all logged-in users. Logging out all users from a specific service seems less useful, but we could probably also add it easily. I wonder if the following might be better:

# {} mutually exclusive
# [] optional
$0 {logout,query} --user <uid> --cn <cn> [all]

would of course also have -u -c short options

I think it would also be nice to have the ability to show information for all logged-in users. Logging out all users from a specific service seems less useful, but we could probably also add it easily. I wonder if the following might be better:

# {} mutually exclusive
# [] optional
$0 {logout,query} --user <uid> --cn <cn> [all]

Oh yes, these are meant to be mutually exclusive; I'm copying that into the task description to make it more explicit.

would of course also have -u -c short options

Sure, works for me.

Change 695203 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add logout script for sretest

https://gerrit.wikimedia.org/r/695203

The cookbook would simply traverse /etc/wikimedia/logout.d/*

I'm wondering if it would be quicker/simpler at this point to also ship a simple script that does the traversal, so that the cookbook will simply run that one. At that point it would also be easier to implement logging out of only service X, by passing it (optionally) to the traversal script.

I'm wondering if it would be quicker/simpler at this point to also ship a simple script that does the traversal, so that the cookbook will simply run that one

I had a think on this and I'm not sure we gain much by adding an additional level of indirection. Further, I can see the simple traversal script being not so simple, as it will need to handle errors and output for all the intermediate scripts as well as select which script to execute. The helper script would also have a complex output, as it would need to signal to the calling cookbook which scripts failed, since the cookbook may need this information to perform other tasks. Right now it seems like it makes more sense to handle all this logic in one place; maybe I'm missing something though?

The additional complexity that I foresee in the cookbook is this:

# host: scripts
host1: 10foo, 20bar, 30baz
host2: 20bar
host3: 10foo, 30baz

You have to run the logic through each host, one at a time, because each one might have a different list of scripts, which makes it harder to parallelize.
Whereas with a wrapper that is the same on every host, you could just run it blindly.
The wrapper could print JSON that would be easily parsable by the cookbook, but maybe that's overkill and YMMV.
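
To make the wrapper idea concrete, a rough sketch in Python follows. It is not the wrapper from the patches in this task. The function names run_scripts and report and the JSON layout (script name mapped to return code and output) are assumptions for illustration; the only things taken from the task are the directory traversal in alphabetical order and the {logout,query} --user/--cn arguments.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: run every executable in a logout.d directory."""
import json
import os
import subprocess


def run_scripts(directory, action, uid, cn):
    """Run each executable in directory in alphabetical order, collect results."""
    results = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip anything that is not an executable regular file.
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        proc = subprocess.run(
            [path, action, "--user", uid, "--cn", cn],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold error messages into the output
            universal_newlines=True,
        )
        results[name] = {
            "returncode": proc.returncode,
            "output": proc.stdout.strip(),
        }
    return results


def report(directory, action, uid, cn):
    """Return the per-script results as a JSON string for the cookbook."""
    return json.dumps(run_scripts(directory, action, uid, cn), sort_keys=True)
```

With something like this on every host, the cookbook could run the same command everywhere and decide per host which scripts failed by looking at the return codes in the JSON, regardless of which scripts are deployed there.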

Change 695341 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/software/pywmflib@master] logoutd: create logoutd base class

https://gerrit.wikimedia.org/r/695341

Change 695365 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] P:base: add logoutd profile to base

https://gerrit.wikimedia.org/r/695365

Change 695365 merged by Jbond:

[operations/puppet@production] P:base: add logoutd profile to base

https://gerrit.wikimedia.org/r/695365

Change 695203 merged by Muehlenhoff:

[operations/puppet@production] Add logout script for sretest

https://gerrit.wikimedia.org/r/695203

Change 700389 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add helper tool for returning a user's current TGT (WIP)

https://gerrit.wikimedia.org/r/700389

Change 700922 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] P:logoutd: create wrapper script for calling logout.d scripts

https://gerrit.wikimedia.org/r/700922

Change 700389 merged by Muehlenhoff:

[operations/puppet@production] Add helper tool for returning a user's current TGT

https://gerrit.wikimedia.org/r/700389

Change 695341 merged by Jbond:

[operations/software/pywmflib@master] IDM: create new idm library with logoutd base class

https://gerrit.wikimedia.org/r/695341

Change 701350 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add logout.d script for the IDP (WIP)

https://gerrit.wikimedia.org/r/701350

Change 701350 merged by Muehlenhoff:

[operations/puppet@production] Add logout.d script for the IDP

https://gerrit.wikimedia.org/r/701350

Change 701442 had a related patch set uploaded (by Volans; author: Volans):

[operations/puppet@production] logoutd: add support for Python 3.5

https://gerrit.wikimedia.org/r/701442

Change 700922 merged by Jbond:

[operations/puppet@production] P:logoutd: create wrapper script for calling logout.d scripts

https://gerrit.wikimedia.org/r/700922

Change 703571 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Deploy systemd-login logout.d script fleet-wide

https://gerrit.wikimedia.org/r/703571

Change 701442 merged by Volans:

[operations/puppet@production] logoutd: add support for Python 3.5

https://gerrit.wikimedia.org/r/701442

Change 704325 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/cookbooks@master] Don't expect all hosts to be reachable for the logout cookbook

https://gerrit.wikimedia.org/r/704325

Change 704325 merged by Muehlenhoff:

[operations/cookbooks@master] Don't expect all hosts to be reachable for the logout cookbook

https://gerrit.wikimedia.org/r/704325

Change 703571 merged by Muehlenhoff:

[operations/puppet@production] Deploy systemd-login logout.d script fleet-wide

https://gerrit.wikimedia.org/r/703571

Change 704584 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] systemdlogind-logout.py: Check login state prior to logout attempt

https://gerrit.wikimedia.org/r/704584

Change 704584 merged by Muehlenhoff:

[operations/puppet@production] systemdlogind-logout.py: Check login state prior to logout attempt

https://gerrit.wikimedia.org/r/704584

Change 704761 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/cookbooks@master] logout: Catch RemoteExecutionError exception

https://gerrit.wikimedia.org/r/704761

Change 704761 merged by Muehlenhoff:

[operations/cookbooks@master] logout: Catch RemoteExecutionError exception

https://gerrit.wikimedia.org/r/704761

Should services like Gerrit, Mailman, etc. be added to this?

Should services like Gerrit, Mailman, etc. be added to this?

Yeah, I created a few subtasks.

jbond claimed this task.

This has now been implemented.