Page MenuHomePhabricator

Require POST in ?action=purge
Closed, ResolvedPublic

Assigned To
Authored By
aaron
May 12 2016, 7:55 PM
Referenced Files
F4025051: pasted_file
May 17 2016, 9:37 PM
Tokens
"100" token, awarded by Liuxinyu970226."Dislike" token, awarded by Pppery."Dislike" token, awarded by MGChecker."Heartbreak" token, awarded by Luke081515."Dislike" token, awarded by Xaosflux."Heartbreak" token, awarded by Kghbln.

Description

Instead of just suppressing log warnings to fix T92357, this action should require POST. It causes writes to page_touched in the database and possibly does expensive purges. This is not really appropriate for a GET request.

Any gadgets/JS for users with 'purge' rights can be updated to use POST instead of GET.

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I oppose this change, for users with purge right they should be able to purge without confirmation regardless POST or GET.

Users are able to purge without confirmation using POST.

I oppose this change, for users with purge right they should be able to purge without confirmation regardless POST or GET.

Users are able to purge without confirmation using POST.

Now they can purge with GET as well.

Users are able to purge without confirmation using POST.

In light of the proposed PR this statement is incorrect, logged-in users (we are not talking about anonymous users) are forced to confirm a purge as outlined in one of the above comments or to cite GWicke:

"...The only difference is that all users will be shown a form with a submit button, while previously this was only done for anons / those without the specific rights....".

I oppose this change, for users with purge right they should be able to purge without confirmation regardless POST or GET.

Users are able to purge without confirmation using POST.

Now they can purge with GET as well.

I think the problem here is that we're just talking across each other. So I'll try to break this pattern by trying to really listen to what you're saying, or at least what I think you're trying to say.

What I hear you as saying is this: this has been in place for years, people have come to expect it, and if we change it, things that have been quietly working for a very long time will break. Having things that used to work perfectly well break all of a sudden is very frustrating. And then they either stay broken, or they have to be fixed, which can take a lot of time and effort. And to just say "well, they can fix it" is not respectful, because it just sort of assumes people have nothing better to do with their time than to fix things that worked perfectly fine until some developer got the bright idea to change them.

If this is what you're saying, then let me just say: I understand where you're coming from. This makes a lot of sense. But I want to try and convince you that there is a lot of good to users that would come from this seemingly small change, and that it's so much good that it's worth the trouble. But before I go on, let me stop here and ask: is this what you are saying?

Also, how about mathpurge?

See also related T74537: support regular action=purge

As @Physikerwelt said in the task:

purge does not work anymore for math anyhow since cache is now handled by restbase

The purge action is a misfeature. It's a hack that should not be necessary and we've somewhat recognized this by not exposing it in the user interface (T135170#2302814 generously describes ?action=purge as a "hidden feature").

Instead of debating whether to make page purges use GET or POST, we should be focusing on ways to deprecate and remove the action altogether (T56902).

I oppose this change, for users with purge right they should be able to purge without confirmation regardless POST or GET.

I would say that I totally support this patch, GET should be deprecated

I placed this in the "Needs details or plan" column of MediaWiki-Action-API since the impact of this proposed change on existing API clients should be determined, and if there are a significant number the usual deprecation process should be followed unless there's a compelling reason to break those clients without warning.

<<+channel:DBPerformance +url:"*api.php*" +message:*writes*>> shows nothing in 7 days (not even filtering for patrol action). Removing the "message" condition just shows random unrelated DB connection errors and the like. The only warnings I see are for the GUI action=markpatrolled endpoint. Apparently, clients are using the HTTP verb correctly.

There seems to be something wrong with your methodology. POSTs are more common, but GETs are not zero:

anomie@fluorine:/a/mw-log$ fgrep action=purge api.log | tee ~/1 | awk '{print $10}' | sort | uniq -c
   2417 GET
  20528 POST

There are five users with over 100 GETs each in that dataset, and another five with 10–99. There are 15 with over 100 POSTs, and 99 with 10–99.

For a grep over the archived logs from 2016-07-09 06:28:43 to 2016-07-26 21:03:24 that I had done a while back, I get

anomie@fluorine:~$ awk '{print $10}' < ~/T135170.txt | sort | uniq -c
 100802 GET
 479943 POST

There are 12 users with over 1000 GETs each during that time period, and another 6 with 100–999. There are 20 users with over 1000 POSTs, and 147 with 100–999.

Maybe expectations are somehow not set for the api...that sounds like a bug, since those logs should agree.

Change 304157 had a related patch set uploaded (by Aaron Schulz):
Require POST for action=purge in PurgeAction

https://gerrit.wikimedia.org/r/304157

Change 304157 merged by jenkins-bot:
Require POST for action=purge in PurgeAction

https://gerrit.wikimedia.org/r/304157

https://gerrit.wikimedia.org/r/304157

Given that https://gerrit.wikimedia.org/r/304157 as been merged and the comment section only notes "However these links will now point to a confirmation form. To preserve the immediate-purge-redirect effect, these scripts should be updated to use the API instead."

So, I'm seeking an understanding of what exactly does this entail? Do you have an example as to what are those "... updated to use the API instead ..." or is it expected to endeavour on a trial and error approach?

https://gerrit.wikimedia.org/r/304157

So, I'm seeking an understanding of what exactly does this entail? Do you have an example as to what are those "... updated to use the API instead ..." or is it expected to endeavour on a trial and error approach?

see: Snippets/Purge action.

Z.

see: Snippets/Purge action.

Maybe I should have been clearer about it since we are talking about PHP functionality that changed its behaviour (as of Purge.php), so I'm looking for a PHP example.

Maybe I should have been clearer about it since we are talking about PHP functionality that changed its behaviour (as of Purge.php), so I'm looking for a PHP example.

I assume by "PHP example", you mean HTML. Per the main purpose of this task, it is undesirable to perform write actions on regular link navigation from a browser (aka "GET" request). Write actions being: purging caches, establishing a master DB connection and changing the page_touched field, etc. This must only happen on POST requests so that traffic can distinguish between routes for primary and secondary clusters. It is also more semantically correct so that browsers will not silently perform actions multiple times when navigating backwards, or cause database changes to be associated with your account without your knowledge when merely clicking on a link to Wikipedia from elsewhere.

There will not be any url you can make a link to in HTML or PHP that will result in an immediate purge. However you can output an HTML link in PHP and augment that link with client-side javascript, handle click events, and perform an API request in the background.

This is similar to what we do with watch/unwatch. The plain links to action=watch result in a confirmation form when navigated to directly, but for the majority of uses the link is enhanced by client-side scripting to submit the action directly to the API in the background with a POST request.

An extension can do the same thing. Produce a plain action=purge link (the same as before), but also write a small JS module that enhances it as described above.

This is similar to what we do with watch/unwatch. The plain links to action=watch result in a confirmation form when navigated to directly, but for the majority of uses the link is enhanced by client-side scripting to submit the action directly to the API in the background with a POST request.

Well, and I hoped for a more specific answer because when looking at [0] nothing indicates how or which module is loaded and while browsing [1] I somehow missing a mediawiki.action.watch.js that I would expect when a reference is made to action=watch and JS module. I'd like to iterate that I don't have time to chase around MW code just to restore previous behaviour therefore specific pointers as to where and how would be surely appreciated.

[0] https://github.com/wikimedia/mediawiki/blob/master/includes/actions/WatchAction.php
[1] https://github.com/wikimedia/mediawiki/tree/master/resources/src/mediawiki.action

Template Purge is used on 200 different wiki. Someone can update template in https://www.mediawiki.org/wiki/Template:Purge So with an example I can start to update template in wiki.

Maybe expectations are somehow not set for the api...that sounds like a bug, since those logs should agree.

Fixed in 1f44e05fb75fc07d3f32154129f47d806a146330.

Template Purge is used on 200 different wiki. Someone can update template in https://www.mediawiki.org/wiki/Template:Purge So with an example I can start to update template in wiki.

It's possible find a solution or for now don't exist workaruound for this?

A request to revert this change is now open: T143531

Hmmm, why is this task still marked as open/needs triage? Is there anything left to do here?

Hmmm, why is this task still marked as open/needs triage? Is there anything left to do here?

The API I suppose.

Hmmm, why is this task still marked as open/needs triage? Is there anything left to do here?

The API I suppose.

Ah, right: https://gerrit.wikimedia.org/r/288462 has not yet been merged.

Hmmm, why is this task still marked as open/needs triage? Is there anything left to do here?

The API I suppose.

I notice that if the API is called with options

{
	action: 'purge',
	titles: 'Main Page'
}

the API returns the full HTML of Main_Page, but if the options are

{
	format: 'json',
	action: 'purge',
	titles: 'Main Page'
}

the API returns

{
	"batchcomplete": "",
	"purge": [
		{
			"ns": 0,
			"title": "Main Page",
			"purged": ""
		}
	]
}

Given that action=purge requires a POST all the time now, is there any reason to return the page HTML? Shouldn't the return be either the XML or JSON form of the current JSON return? Is this behavior more work for this task, a separate task, or expected?

@Unready are you sure it returns the page HTML ? The default for the format parameter is jsonfm, which is an annotated HTML version of json. So that would be HTML, but not the page's html...

Also, probably formatversion: 2 should be added to api call as well, to get the latest version of the api output.

Oops. I saw a bunch of HTML and thought it was the page. Yeah, it's HTML-formatted JSON. My bad.

<!DOCTYPE html>
<html class="client-nojs" lang="en" dir="ltr">
<head>
<meta charset="UTF-8"/>
<title>MediaWiki API result - Wikipedia, the free encyclopedia</title>
<script>document.documentElement.className = document.documentElement.className.replace( /(^|\s)client-nojs(\s|$)/, "$1client-js$2" );</script>
...
</head>
<body class="mediawiki ltr sitedir-ltr mw-hide-empty-elt ns--1 ns-special mw-special-ApiHelp page-Special_ApiHelp rootpage-Special_ApiHelp skin-apioutput action-view feature-page-action-bar-v2">
		<div class="mw-body" role="main">
			<h1 class="firstHeading">MediaWiki API result</h1>
			<div class="mw-body-content">
				<div id="mw-content-text"><div class="api-pretty-header"><p>This is the HTML representation of the JSON format. HTML is good for debugging, but is unsuitable for application use.
...

Also, probably formatversion: 2 should be added to api call as well, to get the latest version of the api output.

Even better would probably be format=none, since the return is going to be discarded in most scenarios, anyway.

Summary:

  • ?action=purge now requires POST (this very T135170; task currently still open for the API part in T135170#2574028).
  • This required updating of gadgets (e.g. need to purge via api.post instead of mw.util.getUrl in T143263).
  • Post-deploy issue fixed in T135170#2569682.
  • Other post-deploy issue brought up in T143435 (but that task needs clarification as of now).
  • A future patch will allow to automatically submit the form on action=purge with JS (to not have the confirm dialog) in T143531#2578113.
  • T143531#2578508 states that "solutions that use JavaScript to bypass the form are *welcome*".
  • T143531#2578508 asks users who face situations which require regular purging to please create dedicated tasks so the underlying bugs could get fixed.

So I guess we can remove the "purge" right, since it has no effect now.

aaron claimed this task.

API went a different route in c84ba4d86420d7af918e572e2cd4613d7be185b3

IMO c84ba4d8 was a poor solution, making the whole thing more complex and fragile instead of taking the time to actually deprecate and remove API action=purge via GET.

It should still be deprecated. I'll add notices.

Change 288462 abandoned by Aaron Schulz:
[DNM] Require POST for API action=purge

https://gerrit.wikimedia.org/r/288462