Page MenuHomePhabricator

AbuseFilter tripping more times than it should, and for the wrong edit, and not tagging
Open, HighPublic

Description

See discussion at https://en.wikipedia.org/wiki/Special:Permalink/796903845#Can_someone_more_competent_than_me_take_a_look

User:SorinaPopescu tripped filter 279 a dozen times, but they were for the same edit, and it wasn't their edit. The edit was actually made by User:Dilloncoulter whose filter log is empty. If you look at the details of one of the log entries, things are way off. The "Changes made in edit" are completely different than the linked diff, and the timestamp doesn't match any edit made by the aforementioned users (see their contribs: 1, 2), or any edit made by anyone to the relevant page. Something is very wrong!

Filter 279, by the way, is tripped when a user makes a series of edits that failed to save to the same page (checking if their username is not in article_recent_contributors), but here both SorinaPopescu and Dilloncoulter's edits did save. Not only that, the filter did not tag the edits as it normally would -- and good that it didn't, because it was a false positive :)

To recount, we have a number of issues:

  • Filter 279 is being tripped more times than there were actual actions
  • The edit the logs point to was made by a different user, different timestamp, and diff doesn't match the actual edit
  • The filter was tripped when no previous edits by the user were disallowed, which is not how the filter is supposed to work
  • The edit wasn't tagged as the filter would normally do (probably because of one of the above points)

The first thing that comes to mind is T168736 which I filed back in June. Whether or not the filter was malfunctioning there I'm not sure, but similarly the logs reported wrong values. I have a strong suspicion these issues are related.

I dug deeper into filter 279, I can't find any entries that are as out place as the above. However Special:AbuseLog/19127087 shouldn't have tripped the filter, since that IP had not made previous edit that were disallowed via AbuseFilter. This may or may not be related, but anyway it also appears to be a bug.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 23 2017, 7:50 PM
MusikAnimal updated the task description. (Show Details)Aug 23 2017, 7:54 PM

MusikAnimal, as filter 279 is a private filter, I'm not able to view the same. Is it possible to check the view-edit-filter rights for me?

Restricted Application added a subscriber: jeblad. · View Herald TranscriptAug 25 2017, 3:26 AM

@Lourdes as a non-admin you would presently require the "edit filter manager" right (see https://en.wikipedia.org/wiki/Wikipedia:Edit_filter#User_right ), for which you need to make a request at the edit filter noticeboard.

There is currently an RFC happening at the noticeboard about the creation of a read-only edit filter helper right, which would probably be a good fit for this situation, so you may wish to express your opinion about it.

@Lourdes as a non-admin you would presently require the "edit filter manager" right (see https://en.wikipedia.org/wiki/Wikipedia:Edit_filter#User_right ), for which you need to make a request at the edit filter noticeboard.
There is currently an RFC happening at the noticeboard about the creation of a read-only edit filter helper right, which would probably be a good fit for this situation, so you may wish to express your opinion about it.

Thanks. Noted.

TBolliger moved this task from Untriaged to Cards ready to be discussed on the Anti-Harassment board.
TBolliger added a subscriber: TBolliger.
dbarratt claimed this task.Sep 29 2017, 1:38 PM
dbarratt moved this task from Ready to In progress on the Anti-Harassment (AHT Sprint 6) board.

If I go to the list and I click on details I get a revision that is different from the diff.

Perhaps this edit was by User:SorinaPopescu but the edit was unsuccessful (as it should have been), but that the reporting/log is incorrect (i.e. recording a diff when there wasn't one?) or am I missing something?

If I go to the list and I click on details I get a revision that is different from the diff.
Perhaps this edit was by User:SorinaPopescu but the edit was unsuccessful (as it should have been), but that the reporting/log is incorrect (i.e. recording a diff when there wasn't one?) or am I missing something?

As far as I can tell the edits by Dilloncoulter and SorinaPopescu all went through. Filter 279 does not disallow, it tags, and as you can see it didn't do that either. However that filter works on the fact that previous edits by a user were disallowed. If there were disallowed edits, they're not showing up in either user's filter log.

Yeah, it looks like those diffs are not the same. The 'diff' link shows an edit by an entirely different user.

The filter says it was trigged on August 23 by SorinaP, but according to the page history they only edited the page on August 10. It doesn't look like anybody made an edit on that article at 15:45, 23 August 2017, and SorinaP did not edit any page on August 23.

I believe I've figured out what is causing this.

I noticed that these edits also use API:Edit to make the changes with the dashboard.wikiedu.org app.

AbuseFilter uses the EditFilterMergedContent hook. However, other extensions, like ConfirmEdit use the same hook.

I believe the hook executions are in (alphabetical?) order. Because of this, if AbuseFilter does not prevent the edit, it gets passed to ConfirmEdit (or something else). If ConfirmEdit (or something else) prevents the edit, then the edit will fail, but AbuseFilter will still have a log of the change.

If you view the "details" of the change in the log, each change is actually another save attempt.

My theory is that dashboard.wikiedu.org is not setup to handle a failed edit state and so the user kept trying to submit. Alternatively, it could be handling it correctly, but the user kept getting something wrong (i.e. kept getting the CAPTCHA wrong).

I have no idea why AbuseFilter is logging an unrelated change as the "diff" but that seems unrelated, but I created T177112 to figure that out.

Here are the possible solutions:

  1. Do nothing. It technically works as intended. However, this does expose a different issue, which is that the rate limit will be incorrect (i.e. you might have more actions in the log then are real actions)
  2. Figure out a way to make AbuseFilter execute last (thought I'm not sure the API supports this, so it could be a core change).
  3. Change AbuseFilter so it executes filters that could prevent saving in this hook and then executes filters that can not prevent saving after saving is complete (which might fix T177112). However, if someone enables edit prevention and also enables throttling, the throttling I think could be incorrect (i.e. passes the filter, user gets the captcha, then tries again, could hit the filter since now it's the second time the edit has been seen).
  4. Implement some way to "track" edit attempts (if this doesn't already exist). If it does exist, then we should use the same attempt (i.e. override the log entry if the attempt is the same). If it doesn't exist then it could be really helpful in determining other reasons why the user may have failed saving (i.e. failed Captcha) which could be used in AbuseFilter. :)

What do you all think?

The hooks are executed in the order they're registered, and ConfirmEdit indeed goes before AbuseFilter. Note how ConfirmEdit aborts hook execution when disallowing an edit, while AbuseFilter doesn't.

The hooks are executed in the order they're registered, and ConfirmEdit indeed goes before AbuseFilter. Note how ConfirmEdit aborts hook execution when disallowing an edit, while AbuseFilter doesn't.

Hmm... In my testing I had it like this:

wfLoadExtension( 'AbuseFilter' );
wfLoadExtension( 'ConfirmEdit' );

Is it possible something else (that comes after AbuseFilter) is preventing the edit? errr... is there a list of extensions that use that hook?

maxsem@tin:~$ mwscript eval.php enwiki
> var_dump(Hooks::getHandlers('EditFilterMergedContent'));
array(7) {
  [0]=>
  string(39) "SpamBlacklistHooks::filterMergedContent"
  [1]=>
  string(38) "GadgetHooks::onEditFilterMergedContent"
  [2]=>
  string(35) "ConfirmEditHooks::confirmEditMerged"
  [3]=>
  string(43) "AbuseFilterHooks::onEditFilterMergedContent"
  [4]=>
  string(30) "ScribuntoHooks::validateScript"
  [5]=>
  string(42) "JsonSchemaHooks::onEditFilterMergedContent"
  [6]=>
  string(49) "JsonConfig\JCSingleton::onEditFilterMergedContent"
}

Well... soo that might not be the case, here are other errors that could fail saving:

As well as anything that implements PageContentSave. @MaxSem could you provide a list of implementers of PageContentSave?

Although, I suppose it doesn't really matter, the point is is that it's certainly possible (if not likely) for AbuseFilter to have edits in the Abuse Log that were not prevented by Abuse Filter, but were prevented by something else.

Perhaps it would be better if AbuseFilter implemented PageContentSave instead of EditFilterMergedContent? That way all validation that is in core is complete before getting to AbuseFilter.

> var_dump(Hooks::getHandlers('PageContentSave'));
array(1) {
  [0]=>
  string(36) "TemplateDataHooks::onPageContentSave"
}

What I think we should do... is break up the filters.

If a filter can block saving, it should run in PageContentSave and if it can't prevent saving, it should run after the content has been saved.

What do you think?

MaxSem added a comment.Oct 2 2017, 8:45 PM

EditFilterMergedContent can abort too.

dbarratt added a comment.EditedOct 2 2017, 8:50 PM

EditFilterMergedContent can abort too.

Oh right. If we split them up, it doesn't matter if it aborts early or not, but I would propose moving to PageContentSave anyways since it's possible for a filter that can prevent saving, but also has a throttle, to misreport if something else prevented saving.

Based on what I have found in T177112: AbuseFilter sometimes lists an unrelated diff when save fails, the edits in the log, must have been null edits, otherwise they would not have an unrelated "diff" link.

How could these edits be null edits?

TBolliger changed the point value for this task from 5 to 0.

Removing size, as understanding the reproduction steps may change the complexity.

TBolliger removed the point value for this task.
dbarratt removed dbarratt as the assignee of this task.Oct 18 2017, 5:48 PM
dbarratt added a subscriber: dbarratt.

A Timeline....

  1. (22 August 2017 22:20) Dilloncoulter successfully edits the page (diff) using dashboard.wikiedu.org without tripping AbuseFilter
  2. (23 August 2017 15:42) SorinaPopescu makes a dozen attempts within three minutes to edit the page (log) using dashboard.wikiedu.org while tripping AbuseFilter on every edit, but 0 edits were successful and AbuseFilter did not prevent the edit (according to the log). All log entries (at some point) have their afl_rev_id updated to Dilloncoulter's edit.
  3. (23 August 2017 16:54) Mx45 successfully edits the page (diff) using dashboard.wikiedu.org without tripping AbuseFilter
Daimona moved this task from Backlog to Internal bugs on the AbuseFilter board.Apr 7 2018, 4:34 PM
Restricted Application added a subscriber: Daimona. · View Herald TranscriptJun 13 2018, 10:03 AM