Setting the EditPage::POST_EDIT_COOKIE_KEY_PREFIX cookie on every edit causes the Cookie header to be truncated for bots and browsers.
Closed, Invalid (Public)

Description

Upgraded from MW 1.29 to MW 1.31.

AutoWikiBrowser, PyWikiBot, and mwclient are all affected by this. Browsers are affected too, but only under rapid-fire editing.

The EditPage::POST_EDIT_COOKIE_KEY_PREFIX cookie is set on every edit, causing the Cookie header to grow without bound. This affects bots the most since the bots do not follow the redirect to the view page to get the cookie cleared. Eventually the Cookie header gets truncated, which logs the bot out. One of our editors can also reproduce this behavior in Firefox and Chrome by making rapid edits without always loading the view page afterwards.

The only reason this cookie appears to have been added was to selectively add the mediawiki.action.view.postEdit resource module to the page. Previously in MW 1.29 it was being added to every article view.

This happens in both the web and API entry points.

Reference: See EditPage::setPostEditCookie() where it creates the cookie key: $postEditKey = self::POST_EDIT_COOKIE_KEY_PREFIX . $revisionId;
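
To illustrate why the header keeps growing, here is a minimal Python sketch of a client that stores every post-edit cookie and never has one cleared. It is a simulation, not MediaWiki code: the prefix value "PostEditRevision" is assumed from the cookie name mentioned later in this thread, "saved" is a placeholder value, and cookie_header_after_edits is a hypothetical helper. A real wiki may also prepend its own cookie prefix to the name.

```python
# Assumption: prefix taken from the "PostEditRevision" cookie name mentioned
# later in this thread; the value "saved" is only a placeholder.
POST_EDIT_COOKIE_KEY_PREFIX = "PostEditRevision"


def cookie_header_after_edits(revision_ids, base_cookies=None):
    """Build the Cookie header a client would send if it stored every
    post-edit cookie and none of them was ever cleared."""
    jar = dict(base_cookies or {})
    for rev_id in revision_ids:
        # One new cookie *name* per revision, so nothing gets overwritten.
        jar[f"{POST_EDIT_COOKIE_KEY_PREFIX}{rev_id}"] = "saved"
    return "; ".join(f"{name}={value}" for name, value in jar.items())


if __name__ == "__main__":
    for n in (1, 100, 500):
        header = cookie_header_after_edits(
            range(1000, 1000 + n), {"session": "abc"}
        )
        print(n, "edits ->", len(header), "bytes of Cookie header")
```

Because the revision ID is part of the cookie name, each edit adds a new cookie rather than replacing the previous one, so the header length scales with the number of edits made within the cookie lifetime.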

Event Timeline

If the same cookie is set many times, it should only be stored (and sent to the server) once. If the clients use a naive approach where they append several values for the same cookie name, that is a client bug.

This is MediaWiki making many cookies. See EditPage::setPostEditCookie() where it creates the cookie key: $postEditKey = self::POST_EDIT_COOKIE_KEY_PREFIX . $revisionId;

Change 477858 had a related patch set uploaded (by Alexia; owner: Alexia):
[mediawiki/core@master] Do not set the post edit cookie for API made edits.

https://gerrit.wikimedia.org/r/477858

The cookie expires after 20 minutes, and should be removed immediately on successful edit by the JS anyway. Is this not happening in the client?

See the description: "This affects bots the most since the bots do not follow the redirect to the view page to get the cookie cleared."

Sounds like this is a known bug in the bots, then?

Why would a robot follow a redirect to a view page? Robots don't have eyes. They use APIs. They also do not run Javascript.

HTTP clients are expected to follow the HTTP spec, including following redirects.

I imagine some bots don't execute JS, that's true (though certainly e.g. AutoWikiBrowser does); they can wait the 20 minutes for cookie expiry.

Reducing the expiry time means that people with very slow/spotty connections will have challenges instead.

You mean that https://www.mediawiki.org/wiki/API:Edit should be updated to state that the client must either run JavaScript (which most bots cannot do) or speculatively try to time the editing frequency in order not to accumulate too many cookies (which sounds quite crazy)?

In practice, the easiest solution for a bot author then seems to be to ignore/delete the PostEditRevision cookies (like this), but that's a messy workaround.
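
One way such a workaround might look in Python, sketched under assumptions: the bot keeps its cookies in a standard http.cookiejar.CookieJar (or a subclass of it), and the unwanted cookies can be recognised by "PostEditRevision" appearing in their name. drop_post_edit_cookies and make_cookie are hypothetical helpers written for this illustration, not part of any bot framework.

```python
from http.cookiejar import Cookie, CookieJar

# Assumption: the marker matches the cookie name used in this thread.
POST_EDIT_MARKER = "PostEditRevision"


def drop_post_edit_cookies(jar, marker=POST_EDIT_MARKER):
    """Remove every cookie whose name contains the marker.

    Returns the number of cookies removed."""
    # Collect first, so the jar is not mutated while it is being iterated.
    doomed = [c for c in jar if marker in c.name]
    for c in doomed:
        jar.clear(domain=c.domain, path=c.path, name=c.name)
    return len(doomed)


def make_cookie(name, value, domain="wiki.example.org"):
    """Hypothetical helper that only populates a jar for the demonstration."""
    return Cookie(
        version=0, name=name, value=value, port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True, secure=False, expires=None,
        discard=True, comment=None, comment_url=None, rest={},
    )


if __name__ == "__main__":
    jar = CookieJar()
    jar.set_cookie(make_cookie("session", "abc"))
    for rev_id in range(5):
        jar.set_cookie(make_cookie(f"{POST_EDIT_MARKER}{rev_id}", "saved"))
    removed = drop_post_edit_cookies(jar)
    print("removed", removed, "cookies; remaining:", [c.name for c in jar])
```

Calling the helper after each edit request would keep the jar from accumulating post-edit cookies while leaving session cookies untouched. Matching on a substring rather than a prefix is a deliberate choice here, in case the wiki prepends its own cookie prefix.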

I fail to see why the cookie is set for API edits, and why the patch above cannot be merged.

Could you add some symptoms to this task so that people can find it? It surfaced for me as an HTTP 400 when posting hundreds of edits to my wiki. It was particularly hard to debug because the cookies, not my edits, were causing the BAD REQUEST. I'd definitely add "HTTP 400" and "MediaWiki API", and would recommend adding Apache, cURL and PHP, so that not everyone automating edits through the API has to debug the cryptic BAD REQUEST error message.

I can't reproduce the behavior where post-edit cookies are set on API requests. I'd consider that a bug if this was the case, but as far as I can tell it isn't.

For non-API edits, I think that if your tool respects the 'Set-Cookie' headers, then it should also respect 'Location' headers. If you do, then the cookies will be cleared on the subsequent request.

The only remaining issue is that the cookies last for 20 minutes, which does seem a little long. They definitely need to be a few minutes at least, because the expiration time is a timestamp (not a duration), and the clocks between client and server can easily be out of sync by a few minutes. I think the duration is arbitrary, we could arbitrarily drop it to let's say 5 minutes, but that probably wouldn't actually resolve the issue of cookies accumulating for automated edits anyway…
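
The clock-skew point can be sketched in a few lines of Python. This is an illustrative model only: it assumes the server sends an absolute expiry time and the client compares it against its own clock, and client_sees_cookie_alive is a hypothetical function, not anything from MediaWiki or a browser.

```python
from datetime import datetime, timedelta, timezone


def client_sees_cookie_alive(server_now, lifetime, client_clock_skew):
    """Return True if a client whose clock is offset by client_clock_skew
    still considers a cookie with an absolute expiry time valid on arrival."""
    expires_at = server_now + lifetime           # absolute timestamp set by the server
    client_now = server_now + client_clock_skew  # what the client believes "now" is
    return client_now < expires_at


if __name__ == "__main__":
    now = datetime(2019, 1, 1, 12, 0, tzinfo=timezone.utc)
    skew = timedelta(minutes=3)  # client clock runs three minutes fast
    for minutes in (1, 5, 20):
        alive = client_sees_cookie_alive(now, timedelta(minutes=minutes), skew)
        print(f"lifetime {minutes} min, skew 3 min -> usable: {alive}")
```

Under this model, a lifetime shorter than the skew makes the cookie look expired the moment it arrives, which is why the lifetime cannot simply be cut to a few seconds.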

Change 477858 abandoned by Bartosz Dziewoński:

[mediawiki/core@master] Do not set the post edit cookie for API made edits.

Reason:

I can't reproduce the bug this is supposed to fix. If it still occurs, please reopen with reproduction steps.

https://gerrit.wikimedia.org/r/477858

Per my previous comment.