non-reproducible False positive AbuseFilter bug
Closed, ResolvedPublic

Description

Author: KaewWiki

Description:
On Thai Wikipedia, the abuse filter #22 reads

!('bot' in user_groups) & added_links rlike "wikipedia.org"

The action for this filter is to label the edit.

The edit http://th.wikipedia.org/w/index.php?title=%E0%B9%81%E0%B8%A1%E0%B9%88%E0%B9%81%E0%B8%9A%E0%B8%9A%3AWebCite&diff=4663859&oldid=4663855 clearly doesn't fall into the filter condition, but the edit was nevertheless labelled by AbuseFilter.

We (two programmers on Thai Wikipedia and I) tried to repeat the edit, but the filter did not act on it. We therefore conclude that the problem is not reproducible. However, the original hit is recorded on Thai Wikipedia and you can investigate it from the available logs.

Details

Reference
bz45301

Event Timeline

bzimport raised the priority of this task from to Lowest. Nov 22 2014, 1:18 AM
bzimport added a project: AbuseFilter.
bzimport set Reference to bz45301.
bzimport added a subscriber: Unknown Object (MLST).
bzimport created this task. Feb 23 2013, 6:16 AM

(In reply to comment #0)

On Thai Wikipedia, the abuse filter #22 reads
!('bot' in user_groups) & added_links rlike "wikipedia.org"

For future reference, here is a link to the version of the filter which matched the edit:
https://th.wikipedia.org/wiki/Special:AbuseFilter/history/22/item/174
the log:
https://th.wikipedia.org/wiki/Special:AbuseLog?wpSearchUser=Taweetham&wpSearchFilter=22&wpSearchTitle=แม่แบบ%3AWebCite
the "examine" link points to
https://th.wikipedia.org/wiki/Special:AbuseFilter/examine/log/21989
and the "details" to:
https://th.wikipedia.org/wiki/Special:AbuseLog/21989

The variable "added_links" shows:

//en.wikipedia.org/wiki/Pagediff?withJS=MediaWiki:Common.js%2Fpagediff.js&oldpage=%E0%B9%81%E0%B8%A1%E0%B9%88%E0%B9%81%E0%B8%9A%E0%B8%9A:WebCite&newpage=%E0%B9%81%E0%B8%A1%E0%B9%88%E0%B9%81%E0%B8%9A%E0%B8%9A:WebCite%2Fsandbox

and this indeed matches the regex "wikipedia.org":
http://rubular.com/r/OAjGtNTRth
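For reference, here is a minimal Python sketch of why the logged value satisfies the rule. It only approximates the filter: the function name and the example user groups are hypothetical, Python's `re.search` stands in for `rlike`, and added_links is assumed to be compared as a single string. The link is just the prefix of the logged value.

```python
import re


def filter_22_matches(user_groups, added_links):
    """Approximate: !('bot' in user_groups) & added_links rlike "wikipedia.org"."""
    not_bot = "bot" not in user_groups
    # The pattern is unanchored, and the unescaped "." matches any character,
    # so any link containing e.g. "en.wikipedia.org" satisfies it.
    links_match = re.search(r"wikipedia.org", added_links) is not None
    return not_bot and links_match


# Prefix of the added_links value shown in the log (protocol-relative link).
logged_link = "//en.wikipedia.org/wiki/Pagediff"

hit = filter_22_matches(["*", "user"], logged_link)   # non-bot user, link present
no_link = filter_22_matches(["*", "user"], "")        # non-bot user, no links added
bot_edit = filter_22_matches(["bot"], logged_link)    # bot user, link present
print(hit, no_link, bot_edit)
```

So the match itself is consistent with the logged variables; the open question is why added_links contained that link for this edit in the first place.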

KaewWiki wrote:

I guess the added_links variable is derived from variables

new_html and
old_html

There might be some one-off change that affected new_html.

Restricted Application added a subscriber: TerraCodes. Jul 14 2017, 5:23 PM
matej_suchanek removed a subscriber: wikibugs-l-list.
Daimona raised the priority of this task from Lowest to Medium. Edited Apr 17 2018, 11:33 AM
Daimona moved this task from Filtering features to Internal bugs on the AbuseFilter board.
Daimona added a subscriber: Daimona.

Some news for this which may be bad or good. Opening /details for that entry returns the following error message:

[WtXZ7wpAMFQAAKHDwzUAAABC] 2018-04-17 11:26:39: Fatal exception of type "BadMethodCallException"

This may be bad, since it hides the culprit log, and without a stacktrace it's practically impossible to debug. However, it might have a precise meaning: for that specific edit, AF mishandled something, which caused the filter to match. The error was silent until some time ago and is now clearly visible as a fatal, maybe due to some changes in the code.

This is definitely worth some more investigation, since it may be related to T175933 and related tasks; maybe they don't share the same cause, but it could be something similar. First off, we need the stacktrace.

Also: may be related to T187153, although I think this log was normally available after that task was filed.

@Daimona: Trace for WtXZ7wpAMFQAAKHDwzUAAABC:

exception.file	       	/srv/mediawiki/php-1.31.0-wmf.29/includes/Revision.php:938
exception.message	       	Call to a member function getContent() on a non-object (null)
exception.trace
#0 /srv/mediawiki/php-1.31.0-wmf.29/includes/page/WikiPage.php(721): Revision->getContent(integer, NULL)
#1 /srv/mediawiki/php-1.31.0-wmf.29/includes/page/WikiPage.php(2134): WikiPage->getContent(integer)
#2 /srv/mediawiki/php-1.31.0-wmf.29/extensions/AbuseFilter/includes/AFComputedVariable.php(195): WikiPage->prepareContentForEdit(WikitextContent)
#3 /srv/mediawiki/php-1.31.0-wmf.29/extensions/AbuseFilter/includes/AbuseFilterVariableHolder.php(46): AFComputedVariable->compute(AbuseFilterVariableHolder)
#4 /srv/mediawiki/php-1.31.0-wmf.29/extensions/AbuseFilter/includes/AbuseFilterVariableHolder.php(185): AbuseFilterVariableHolder->getVar(string)
#5 /srv/mediawiki/php-1.31.0-wmf.29/extensions/AbuseFilter/includes/special/SpecialAbuseLog.php(483): AbuseFilterVariableHolder->dumpAllVars(boolean)
#6 /srv/mediawiki/php-1.31.0-wmf.29/extensions/AbuseFilter/includes/special/SpecialAbuseLog.php(97): SpecialAbuseLog->showDetails(string)
#7 /srv/mediawiki/php-1.31.0-wmf.29/includes/specialpage/SpecialPage.php(522): SpecialAbuseLog->execute(string)
#8 /srv/mediawiki/php-1.31.0-wmf.29/includes/specialpage/SpecialPageFactory.php(568): SpecialPage->run(string)
#9 /srv/mediawiki/php-1.31.0-wmf.29/includes/MediaWiki.php(288): SpecialPageFactory::executePath(Title, RequestContext)
#10 /srv/mediawiki/php-1.31.0-wmf.29/includes/MediaWiki.php(861): MediaWiki->performRequest()
#11 /srv/mediawiki/php-1.31.0-wmf.29/includes/MediaWiki.php(524): MediaWiki->main()
#12 /srv/mediawiki/php-1.31.0-wmf.29/index.php(42): MediaWiki->run()
#13 /srv/mediawiki/w/index.php(3): include(string)
#14 {main}
exception_url	       	/wiki/%E0%B8%9E%E0%B8%B4%E0%B9%80%E0%B8%A8%E0%B8%A9:%E0%B8%9B%E0%B8%B9%E0%B8%A1%E0%B8%81%E0%B8%B2%E0%B8%A3%E0%B8%A5%E0%B8%B0%E0%B9%80%E0%B8%A1%E0%B8%B4%E0%B8%94/21989?uselang=en

Alright, so it's the same as T187153. I still wonder whether the error is related to the false positive.

In order to debug this, we may try to start with some more info. With a db query I can see that the var_dump for that edit is stored-text:4626692. This means that we need the data stored in text table under old_id = 4626692, which I see is doable using fetchText.php. Could someone with shell access please run the script and paste here the var_dump? A full paste would be really good, but if there's some private data in there, it should be enough to have "user_groups" and "added_links" from there. Thanks!

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:12 PM

In case it's still helpful, P9487 :-).

Daimona closed this task as Resolved. Oct 29 2019, 5:49 PM
Daimona claimed this task.

Thanks, much appreciated :)

From the dump, I can see that all the interesting variables (user_groups and added_links) are stored as AFComputedVariable, so their values are recomputed every time you view the page. Currently, and unlike what @He7d3r reported at T47301#490416, added_links is empty, so whatever bug caused it to be non-empty has since been fixed. Unfortunately, we can't do anything further. The serialized values will be cleaned up as part of T213006.
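
To illustrate why the log can no longer show what the filter saw, here is a small Python sketch of a lazily computed variable. It is not AbuseFilter's code: `LazyVariable`, `page_state` and `current_links` are hypothetical names, and the only point is that storing a recipe rather than a value means each view reflects the current state, not the state at log time.

```python
class LazyVariable:
    """Hypothetical stand-in for a variable stored as a recipe, not a value."""

    def __init__(self, compute):
        self._compute = compute

    def value(self):
        # Recomputed on every access; nothing is cached or frozen.
        return self._compute()


# Hypothetical state that the computation reads from.
page_state = {"external_links": ["//en.wikipedia.org/wiki/Pagediff"]}


def current_links():
    return list(page_state["external_links"])


added_links = LazyVariable(current_links)

at_log_time = added_links.value()   # what the filter evaluated
page_state["external_links"] = []   # underlying state changes later
at_view_time = added_links.value()  # what the details page shows now

print(at_log_time, at_view_time)
```

Under that assumption, the value that triggered the match is simply not recoverable from the stored dump.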