
Abuse filter hit fails the "Examine" interface for the text of the filter
Closed, Resolved, Public

Description

On English Wikipedia, Abuse Filter number 58 hasn't been updated since April 22nd. However, when I use the "examine" interface on hits from April 29-30 and load the filter content through "Load filter ID" for number 58, I get the message "The filter did not match this change".

Event Timeline

OdMishehu raised the priority of this task from to Needs Triage.
OdMishehu updated the task description.
OdMishehu added a project: AbuseFilter.
OdMishehu added a subscriber: OdMishehu.
matej_suchanek changed the task status from Open to Stalled. Sep 5 2017, 4:20 PM
matej_suchanek added a subscriber: matej_suchanek.

Since the filter is private, it's very difficult for us to debug.

Daimona changed the task status from Stalled to Open. May 9 2018, 7:45 PM
Daimona added a subscriber: Daimona.

I took a quick look. Some useful links:

Now, some considerations. Although it's not easy to test, the problem most likely comes from one of three places: "stringy", the first variable used, or the function at line 13 (sorry for being cryptic). However, I don't know which of these is the real cause. Also, /examine seems to show the right variables.

It may be useful to see the actual var_dump, again with fetchText.php. The old_ids of three randomly chosen affected entries are 665112684, 665041142 and 665166487.

Daimona claimed this task.

Thanks to T193903, I could finally test directly on enwiki, and I found some abuselog entries with the described problem, for instance this one. You can see that the added text (i.e. added_lines) almost matches a piece of stringy (it's on the second row, and it takes a little time to find). I say "almost" because there is one difference, which I'll state explicitly since it's not a big deal: the character "5" in stringy is actually an "S" in the added text, and applying normalization to stringy would also transform that "5" into an "S". Probably some old version of equivset changed every "S" to "5", which caused the filter to match at the time, while now that no longer happens and there's no match.
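
To make the suspected mechanism concrete, here is a minimal Python sketch of how a change in the direction of one equivset mapping could turn an old hit into a non-match. Everything in it is an assumption for illustration only: the two mapping tables, the `normalize` and `filter_matches` helpers, and the sample strings are hypothetical and are not the real equivset data, the real AbuseFilter normalization code, or the content of filter 58.

```python
# Hypothetical, heavily simplified stand-ins for two equivset versions.
# The real equivset tables are much larger; only the S/5 pair is modelled.
OLD_EQUIVSET = {"S": "5", "s": "5"}  # assumed old behaviour: S becomes 5
NEW_EQUIVSET = {"5": "S", "s": "S"}  # assumed new behaviour: 5 becomes S


def normalize(text, equivset):
    """Replace each character by its equivset representative, if it has one."""
    return "".join(equivset.get(ch, ch) for ch in text)


def filter_matches(added_text, pattern, equivset):
    """Toy filter: the added text is normalized, the pattern is used literally."""
    return pattern in normalize(added_text, equivset)


# Made-up sample data: the pattern contains "5", the edit contains "S".
pattern = "BU5TED"
added_text = "totally BUSTED"

# Under the assumed old table the edit is normalized to "BU5TED" and matches.
old_hit = filter_matches(added_text, pattern, OLD_EQUIVSET)

# Under the assumed new table the edit keeps its "S", so the literal "5" fails.
new_hit = filter_matches(added_text, pattern, NEW_EQUIVSET)

# Normalizing the pattern with the new table as well restores the match.
fixed_hit = filter_matches(
    added_text, normalize(pattern, NEW_EQUIVSET), NEW_EQUIVSET
)

print(old_hit, new_hit, fixed_hit)
```

If the real cause is the same, a logged hit would still be listed in the abuse log while re-running the unchanged filter text in /examine reports no match, which is the behaviour described in this task.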