'Examine' should show details of the edit at the time the edit was made
Open, Lowest, Public

Description

When clicking the Examine button next to an edit in an abuse filter log, some of the information shown reflects the current state of the editor's account rather than its state at the time of the edit, which can make testing and debugging filters difficult. The fields I've found to change are user_editcount, user_age, and user_blocked: each is computed at the moment you click Examine instead of being shown as it was at the time of the edit.

Event Timeline

Samwalton9-WMF raised the priority of this task from to Low.
Samwalton9-WMF updated the task description. (Show Details)
Samwalton9-WMF added a project: AbuseFilter.
Samwalton9-WMF subscribed.
Daimona subscribed.

Also page_age and page_recent_contributors.

Change 462289 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] [WIP] Use AsOf to compute variables for /examine

https://gerrit.wikimedia.org/r/462289

I was too optimistic about this: it won't be easy. To compute these variables we mostly call specific methods from other classes (the relevant ones are Title::getRestrictions and User::getEditCount|isBlocked|getRights|getEffectiveGroups). However, these methods do not provide a way to specify an "as of" timestamp. This means that either:

  1. There are methods I don't know of which allow passing an asof parameter
  2. We add such methods to core (or change existing ones when possible)
  3. We copy the content of those methods and manually add an asof (Argh! This would also require splitting user methods, which right now are all handled within user-simple-accessor)
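
To make option 2 concrete, here is a minimal sketch of an accessor that accepts an optional "as of" timestamp. It is written in Python purely for illustration; `EditHistory`, `edit_count`, and the `as_of` parameter are hypothetical names and do not correspond to any existing MediaWiki method.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EditHistory:
    """Hypothetical stand-in for a user's edit timestamps (sorted, Unix seconds)."""

    timestamps: List[int] = field(default_factory=list)

    def edit_count(self, as_of: Optional[int] = None) -> int:
        # Without as_of, behave like a plain "current value" accessor.
        if as_of is None:
            return len(self.timestamps)
        # With as_of, count only the edits made at or before that moment.
        return bisect_right(self.timestamps, as_of)
```

The point of the optional parameter is that existing callers keep their behaviour, while /examine could pass the timestamp of the edit being examined.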
Daimona changed the task status from Open to Stalled.Oct 18 2018, 4:47 PM
Daimona moved this task from Next to Blocked on the User-Daimona board.

Option 3 is not the right way, option 1 still has no answer, and option 2 needs time.

This doesn't seem to be only a problem with editor variables, but also with page_id when a new page is created (it's 0 when the filter is triggered, but contains the actual value when the filter is reviewed at a later time). Reported originally here: Topic:V8i2gjsvwmglpiss.

(Not working on this, because there doesn't seem to be a way to proceed)

Aklapper changed the task status from Stalled to Open.Nov 22 2020, 4:42 PM

The previous comments don't explain who or what (task?) exactly this task is stalled on ("If a report is waiting for further input (e.g. from its reporter or a third party) and can currently not be acted on"). Hence resetting task status, as tasks should not be stalled for unclear reasons.

Right, let me sum up. Implementing this feature in AbuseFilter is currently impossible because MW core doesn't provide a way to retrieve the values of certain variables at a given moment in the past. For instance, there is no way to retrieve which groups a user belonged to, whether they were blocked, whether a page was protected, etc. Some of these might be retrieved by manually querying the database or through other hacks, but this should NOT be done.

What's more, some of this data cannot be retrieved with 100% accuracy at all. Take user groups: the only source of past group membership is log entries, which are a weak source and not guaranteed to be stable.
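
As a rough illustration of why log entries are a weak source, the sketch below reconstructs group membership at a past timestamp from a list of rights-change entries. It is Python for illustration only; `RightsLogEntry` and `groups_as_of` are hypothetical, and the assumption that each entry records both the old and the new groups is a simplification.

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional


class RightsLogEntry(NamedTuple):
    """Hypothetical rights-change record: groups before and after the change."""

    timestamp: int
    old_groups: FrozenSet[str]
    new_groups: FrozenSet[str]


def groups_as_of(log: Iterable[RightsLogEntry], as_of: int) -> Optional[FrozenSet[str]]:
    entries = sorted(log, key=lambda e: e.timestamp)
    if not entries:
        # Nothing was logged, so the past state cannot be recovered at all.
        return None
    before = [e for e in entries if e.timestamp <= as_of]
    if before:
        # The latest change at or before as_of determines the groups.
        return before[-1].new_groups
    # No change yet at as_of: the first later entry tells us what came before it.
    return entries[0].old_groups
```

Any missing, deleted, or malformed entry silently produces a wrong answer here, which is exactly the accuracy problem described above.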

Long story short, I'm fairly confident that it will never be possible to implement this feature for all variables. At most, we might determine which variables can be computed for a past timestamp and limit the scope to those.

Eh, thanks a lot for elaborating! (My previous action wasn't a reply to previous actions; I was more after looking at some older tickets stalled for years).
Meh, sounds like somewhere between lowest priority and declined status then, but let's see. :-/

Daimona lowered the priority of this task from Low to Lowest.Nov 22 2020, 7:58 PM

> Eh, thanks a lot for elaborating! (My previous action wasn't a reply to previous actions; I was more after looking at some older tickets stalled for years).

Sure, I just wanted to explain the current status to people coming across this task :-)

> Meh, sounds like somewhere between lowest priority and declined status then, but let's see. :-/

Lowest for sure... Keeping it open with some optimism about limiting the scope.

Change 775842 had a related patch set uploaded (by Matěj Suchánek; author: Matěj Suchánek):

[mediawiki/extensions/AbuseFilter@master] Compute user and page age relative to recent change timestamp

https://gerrit.wikimedia.org/r/775842

Change 775842 merged by jenkins-bot:

[mediawiki/extensions/AbuseFilter@master] Compute user and page age relative to recent change timestamp

https://gerrit.wikimedia.org/r/775842
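
The idea named in the patch title can be sketched as follows. This is not the merged code, just a Python illustration; `age_at` is a hypothetical helper, and clamping negative values to zero is an assumption made here.

```python
def age_at(created_ts: int, reference_ts: int) -> int:
    """Age in seconds of an account or page, measured at reference_ts."""
    # Clamp to zero in case the reference point precedes creation.
    return max(0, reference_ts - created_ts)


registration_ts = 1_000   # when the account was registered
change_ts = 1_600         # timestamp of the recent change being examined
now_ts = 5_000            # moment someone clicks Examine

age_when_examined = age_at(registration_ts, now_ts)    # drifts as time passes
age_when_edited = age_at(registration_ts, change_ts)   # stable for a given edit
```

Using the change timestamp as the reference point means the computed age no longer depends on when the edit is examined.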

Naive question - could we store the value of these variables in the AbuseFilter database at the time of the edit?

We already store some variables when a filter is triggered and an entry is made in the abuse log, but only those that were needed and actually computed during execution (see VariablesBlobStore.php). I think it is possible to change this and always store all variables. But remember that some variables might be expensive to compute (database access, etc.) or very large (page HTML), so the increase in CPU time or storage could be problematic.
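
The trade-off can be sketched like this. It is a simplified Python illustration, not how VariablesBlobStore.php works; `LazyVariables`, `get`, `snapshot`, and `store_all` are hypothetical names.

```python
from typing import Any, Callable, Dict


class LazyVariables:
    """Hypothetical holder that computes each variable only on first use."""

    def __init__(self, loaders: Dict[str, Callable[[], Any]]) -> None:
        self._loaders = loaders
        self._computed: Dict[str, Any] = {}

    def get(self, name: str) -> Any:
        # Run the (possibly expensive) loader at most once per variable.
        if name not in self._computed:
            self._computed[name] = self._loaders[name]()
        return self._computed[name]

    def snapshot(self, store_all: bool = False) -> Dict[str, Any]:
        # By default, persist only what the filters actually needed.
        if store_all:
            # Forcing every loader is where the extra CPU and storage cost arises.
            for name in self._loaders:
                self.get(name)
        return dict(self._computed)
```

With `store_all=False` the snapshot contains only the variables a filter touched; with `store_all=True` every loader runs, including the expensive or very large ones.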

However, that does not cover the case where you want to test against arbitrary changes made in the past. That would require doing the above for every change made to the wiki, since you may want to examine potentially any change retrospectively, and this is probably not feasible.