Page MenuHomePhabricator

Store client hint mapping rows for logout events
Open, Needs TriagePublic

Description

We currently send client hint header data when a POST request is made from Special:UserLogout. We need to modify CheckUser to record these headers as client hint mapping rows.

We also need to handle logouts that are done via a one-click POST to the logout API from Vector/Minerva.

Note that while working on this, we might decide that it's easier to use the JavaScript API for client hints.

Acceptance criteria:

  • When a user logs out via a UI element on Vector/Minerva, or via Special:UserLogout, CheckUser creates rows in cu_useragent_clienthints_map

Event Timeline

A couple of notes:

  • We'll need a way to hook into "logout just happened". We can use mw.hook( LOGOUT_EVENT ).add() for that, though I am not sure if that is comprehensive.
  • We need to have a more strict time limit for posting a logout event for a user
  • We need to make sure that we don't leak any information about a user's logged-in or logged-out status via the API, e.g. avoid the scenario where User A posts a logout event for User B, and gets an error saying "User B is logged-in" etc. One way to do this is to return no information regardless of whether the POST is valid or invalid.
  • To validate events, we'll need to check that the user referenced in {id} is logged out.
    • We need to limit spam/abuse of this endpoint, e.g. don't allow unlimited POSTs to the endpoint just because the user ID in question is logged-out

Some addendum thoughts:

  • The PHP hook implementation of UserLogoutComplete by CheckUser already stores that a logout event has occurred. The API could check for a row that indicates a logout in cu_private_event. Using this method is dependent on T324907.
  • When using Special:Logout it is possible to inject HTML using the above hook, however, when using the API this is not possible.

A couple of notes:

  • We'll need a way to hook into "logout just happened". We can use mw.hook( LOGOUT_EVENT ).add() for that, though I am not sure if that is comprehensive.
  • We need to have a more strict time limit for posting a logout event for a user
  • We need to make sure that we don't leak any information about a user's logged-in or logged-out status via the API, e.g. avoid the scenario where User A posts a logout event for User B, and gets an error saying "User B is logged-in" etc. One way to do this is to return no information regardless of whether the POST is valid or invalid.
  • To validate events, we'll need to check that the user referenced in {id} is logged out.
    • We need to limit spam/abuse of this endpoint, e.g. don't allow unlimited POSTs to the endpoint just because the user ID in question is logged-out

Coming back to this, I think we may want to use both js and headers for this data:

  1. User is on Special:UserLogout. We can set the Accept-CH header here and capture the client hint data similarly to what we implemented in T345818: Store client hint mapping rows for login events, except we'll use onUserLogoutComplete.
  2. User clicks logout in the user tools menu, and does the logout via JS. We use mw.hook( LOGOUT_EVENT ) and post to the client hints endpoint.

I'm wondering if it's possible, though, to just use the headers for logout events. We'd have to somehow issue a redirect after the user clicks "Logout" in order to set the request headers for Accept-CH.

I'm wondering if it's possible, though, to just use the headers for logout events. We'd have to somehow issue a redirect after the user clicks "Logout" in order to set the request headers for Accept-CH.

This seems like more trouble than it is worth, having looked at it a little bit just now.

I'm wondering if it's possible, though, to just use the headers for logout events. We'd have to somehow issue a redirect after the user clicks "Logout" in order to set the request headers for Accept-CH.

This seems like more trouble than it is worth, having looked at it a little bit just now.

For the JS API collection approach, CheckUser's onUserLogoutComplete can encode the user's ID and the insert ID in &$injected_html. Then the client hints JS code can use this to issue a POST request to the client hints REST endpoint.

Change #1091914 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/CheckUser@master] (WIP) client hints: Collect on user logout

https://gerrit.wikimedia.org/r/1091914

Change #1092187 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/CheckUser@master] (WIP) client hints: Collect on Special:UserLogout

https://gerrit.wikimedia.org/r/1092187

Change #1092229 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/core@master] (WIP) UserLogoutCompleteHook: Allow for returning data in ApiLogout

https://gerrit.wikimedia.org/r/1092229

Change #1092231 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/CentralAuth@master] (WIP) UserLogoutHookHandler: Don't modify injected_html for API responses

https://gerrit.wikimedia.org/r/1092231

Change #1092187 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] client hints: Collect on Special:UserLogout

https://gerrit.wikimedia.org/r/1092187

I'm wondering if it's possible, though, to just use the headers for logout events. We'd have to somehow issue a redirect after the user clicks "Logout" in order to set the request headers for Accept-CH.

This seems like more trouble than it is worth, having looked at it a little bit just now.

For the JS API collection approach, CheckUser's onUserLogoutComplete can encode the user's ID and the insert ID in &$injected_html. Then the client hints JS code can use this to issue a POST request to the client hints REST endpoint.

Spoke with @Dreamy_Jazz about how to approach the API logouts. What we'll plan to do:

  • load ext.clientHints on Special:Userlogout, which is where users are redirected to after API logouts
  • create a new REST endpoint, something like /checkuser/v0/useragent-clienthints/api-logout
  • Rate limit requests to the endpoint by IP address, at a fairly low rate TBD
  • for requests to the endpoint, we'll look for rows in cu_private_event that match user-logout for the IP address making the query. Then we'll validate that the timestamp is within a reasonable period of time (TBD). If that validation passes, we'll save the client hints data supplied in the request with the cupe_id in the matching cu_private_event row

No changes to CentralAuth, or core, would be required.

Change #1092229 abandoned by Kosta Harlan:

[mediawiki/core@master] (WIP) UserLogoutCompleteHook: Allow for returning data in ApiLogout

https://gerrit.wikimedia.org/r/1092229

Change #1092231 abandoned by Kosta Harlan:

[mediawiki/extensions/CentralAuth@master] (WIP) UserLogoutHookHandler: Don't modify injected_html for API responses

Reason:

Going a different route, T345819#10343777

https://gerrit.wikimedia.org/r/1092231

That seems a bit messy in case of IPs used by many people (such as certain cgNATs).

Change #1094389 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/core@master] logout: Modify flow to bring user to Special:Userlogout

https://gerrit.wikimedia.org/r/1094389

Change #1091914 abandoned by Kosta Harlan:

[mediawiki/extensions/CheckUser@master] (WIP) client hints: Collect on API user logout

Reason:

Proposing to go with I3b5e606e0671368a7e58d159d3c1461cb94212e3 instead

https://gerrit.wikimedia.org/r/1091914

This patch does the following:

  • bring user the Special:UserLogout when they click or tap on "Log out" or "Exit session" in the tools menu on Vector and Minerva
  • on Special:UserLogout, execute the JS to log the user out via the API
  • Reload Special:UserLogout so the user sees it in a logged-out state

This approachs solves the problem in this task: needing to set client hint headers on Special:UserLogout before the logout request happens. The downside is that there's an extra request involved in the logout flow.

Here's what that looks like on desktop:

And here's what it looks like on slow 4G:

This feels super hacky. Why not just use the JS API?

This feels super hacky. Why not just use the JS API?

Because the JS API approach is would likely be more hacky. (Happy to be corrected if you have ideas on how to address this.) I think if we don't do the redirect to UserLogout approach, we'd probably just decline this task.

Why the JS API doesn't really work here:

  • We need to link the client hints data with the CheckUser private event ID, which is generated in UserLogoutComplete. So it means we need a way for ext.clientHints to interject before the API request to ApiLogout is made, gather the client hints, and pass them along with the ApiLogout request. I don't see a good path to implementing that in mediawiki.page.ready/ready.js.
  • ext.clientHints does support posting client hints via the JS API if it finds a mw.config value containing wgCheckUserClientHintsPrivateEventId. But, there's no easy way to create a generic mechanism for passing back the generated CheckUser private event ID to the client. And even if we did, there'd be another issue in that as soon as the API request is completed, we redirect the user to Special:UserLogout, so it's unclear if the POST to the client hints ingestion endpoint would complete successfully. (I sketched this approach out in other patches linked to this task.)
  • We could collect the client hints data via the JS API after the user logs out, and then try to match it with the most likely record for CheckUser private event data. That's also not great: you'd need some kind of holding area, while you wait to make sure that a valid record exists in the DB, and you'd also have to create an API that would allow anyone (well, maybe you could limit by IP address) to link arbitrary client hint data with an arbitrary CheckUser private event record. That is probably solvable, but seems like way more effort and risk than is justifiable for this feature.

So, taking all of that into account, I find a slightly worse UX in the logout flow (which is infrequent enough for most users) is not such a bad tradeoff.

You could define a client hint data structure that can be attached to any API request (see e.g. centralauthtoken), have an API hook that writes it into some service it can be read from by the CheckUser logic, then add a hook mechanism to logoutViaPost that lets subscribers add information to the API request.

That last part seems a little annoying but seems useful enough that it's worth building some sort of core mechanism for it. E.g.

// in the mw.Api module
mw.Api.prepareExtensibleApiRequest = function ( hookName ) {
    let data = { params = {}, promise: $.Deferred().promise() };
    mw.hook( 'hookName' ).fire( data );
    return data.promise.then( data.params );
};

// in ready.js
api.prepareExtensibleApiRequest( 'extendLogout' ).then( ( params ) => {
    api.postWithToken( 'csrf', $.extend( {}, params, { action: 'logout' } ).then...
} );

// in CheckUser
mw.hook( 'extendLogout' ).add ( ( data ) => {
    data.promise = data.promise.then( () => {
        return navigator.userAgentData.getHighEntropyValues(...).then( ( values ) => {
            data.params.checkUser = { clientHints: values };
        } );
    } );
} );

You could define a client hint data structure that can be attached to any API request (see e.g. centralauthtoken), have an API hook that writes it into some service it can be read from by the CheckUser logic, then add a hook mechanism to logoutViaPost that lets subscribers add information to the API request.

That last part seems a little annoying but seems useful enough that it's worth building some sort of core mechanism for it. E.g.

// in the mw.Api module
mw.Api.prepareExtensibleApiRequest = function ( hookName ) {
    let data = { params = {}, promise: $.Deferred().promise() };
    mw.hook( 'hookName' ).fire( data );
    return data.promise.then( data.params );
};

// in ready.js
api.prepareExtensibleApiRequest( 'extendLogout' ).then( ( params ) => {
    api.postWithToken( 'csrf', $.extend( {}, params, { action: 'logout' } ).then...
} );

// in CheckUser
mw.hook( 'extendLogout' ).add ( ( data ) => {
    data.promise = data.promise.then( () => {
        return navigator.userAgentData.getHighEntropyValues(...).then( ( values ) => {
            data.params.checkUser = { clientHints: values };
        } );
    } );
} );

Thanks, I'll have another look at this.

Change #1099153 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/core@master] mediawiki.api: Introduce prepareExtensibleApiRequest and use in logout flow

https://gerrit.wikimedia.org/r/1099153

Change #1094389 abandoned by Kosta Harlan:

[mediawiki/core@master] logout: Modify flow to bring user to Special:Userlogout

Reason:

Will go with something based on Ia491491355151b5c1f5f8f987b4fe910194927fd

https://gerrit.wikimedia.org/r/1094389

Change #1099155 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/CheckUser@master] (WIP) clienthints: Collect data on JS logout events

https://gerrit.wikimedia.org/r/1099155

Change #1099153 merged by jenkins-bot:

[mediawiki/core@master] mediawiki.api: Introduce prepareExtensibleApiRequest and use in logout flow

https://gerrit.wikimedia.org/r/1099153

Change #1099223 had a related patch set uploaded (by Dreamy Jazz; author: Dreamy Jazz):

[mediawiki/extensions/CheckUser@master] Follow-up: clienthints: Collect data on JS logout events

https://gerrit.wikimedia.org/r/1099223

Change #1099155 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] clienthints: Collect data on JS logout events

https://gerrit.wikimedia.org/r/1099155