Page MenuHomePhabricator

AbuseFilter test tools does not work for Flow edits
Closed, ResolvedPublic

Description

Try checking on a Flow edit here. https://www.mediawiki.org/wiki/Special:AbuseFilter/test

Actual: Doesn't appear.
Expected: It should work the same way it does for wikitext edits.

Event Timeline

Glaisher raised the priority of this task from to Needs Triage.
Glaisher updated the task description. (Show Details)
Glaisher subscribed.
Restricted Application added subscribers: Luke081515, Aklapper. · View Herald Transcript
Glaisher renamed this task from AbuseFilter examine/test tools does not work for Flow edits to AbuseFilter test tools does not work for Flow edits.Oct 27 2015, 3:47 PM
Glaisher set Security to None.

Sorry, actually examine does work, it's test interface which is not working properly. See T116744#1757682.

In rEABF32e2a7c27e9b, I tried to generalize how recentchanges are used in both interfaces for testing. Introducing some hook could do the trick (not sure if it would be enough, though).

Change 445323 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] [WIP] Make Flow edits testable

https://gerrit.wikimedia.org/r/445323

Restricted Application added a subscriber: Daimona. · View Herald Transcript

The patch above only needs Flow to properly format our entry in /test (and maybe some adjustments for /examine). However, not only I don't know Flow code at all, but I'm also having big problems in making Flow work on my wiki.
I determined that Flow entries aren't shown because the onOldChangesListRecentChangesLine hook returns false. This happens for two reasons: first, here the change is excluded from the view. Second, it doesn't exists in this cache. I'm not really sure about what we should do: maybe change the RC entry on AbuseFilter? Or maybe add specific methods to Flow to handle entries in Special:AbuseFilter/test? I'd appreciate it if someone from Flow would give it a look!

This issue is preventing my abusefilter at MediaWiki (https://www.mediawiki.org/wiki/Special:AbuseFilter/59) from working properly. It'll be very nice if this issue could be fixed as soon as possible.

@Leaderboard Let's wait for someone who knows well Flow's code. Since I don't, I want to avoid to destroy everything :-) Also, we currently have several other bugs which require higher attention.

Leaderboard raised the priority of this task from High to Unbreak Now!.Jan 16 2019, 10:40 AM

It won't be wrong to say that this bug is the major reason for all of the spam on MediaWiki, and hence this should be urgently fixed.
All of the spam now comes from flow talk pages.

Daimona lowered the priority of this task from Unbreak Now! to High.Jan 16 2019, 10:58 AM

@Leaderboard IMHO this isn't UBN! per https://www.mediawiki.org/wiki/Phabricator/Project_management#Priority_levels. Again, the patch above (which I'm going to rebase) is WIP because something in Flow codebase prevents it from working, and I don't want to mess with it. I also had several troubles in getting Flow to work properly on my wiki, and thus testing the patch. Developers/Maintainers says stewards are Growth-Team, and this task is already on their workboard. We should get someone from there to give a look to the patch.
This is also not UBN because filters still work -even though they cannot be tested-, and being myself a sysop on MediaWiki I can try to help tuning spam-related filters, I just need some time and examples.

@Daimona I'm sorry, but the filters don't work, at least filter 59 (flow variant of the hugely useful filter 12) doesn't.

@Leaderboard May I please have some spam examples which should have triggered filter 59?

@Leaderboard Thanks, I'm working on it (i.e. on understanding why filter 59 didn't trigger), while for testing flow edits we still need someone who knows Flow's codebase.

Same problem on frwiki: edits on Flow talkpages do not show in Abusefilter Testing Tool.

And I can't make a filter work on Flow talkpages. So, without a working testing tool, I have no way to know if Abusefilter does not detect Flow pages *at all*, or if I only made a mistake in the filter :(

Daimona raised the priority of this task from High to Needs Triage.Sep 16 2020, 12:44 PM

Accepted, but I need to re-evaluate the priority after having played a little bit with Flow locally.

I took another stab. First of all, things haven't changed much since the last time, but here are the results of my investigation:

  • On the AF side, making flow edits appear in the test page is quite easy. There are 2 reasons why they don't appear:
    • Flow edits are stored differently in the recent_changes table: we need to conditionally add rc_source='flow' to get them. This is easy to fix, as can be seen in my old patch.
    • These lines are then skipped because the revision cache is empty. This is because Flow populates this cache via the ChangesListInitRows hook. This can also easily be fixed by calling $changesList->initChangesListRows( $res ); in AF.
  • At that point, Flow lines appear, but there's no "examine" link. This is a direct consequence of T273387. While the proper solution should happen in core (or in Flow), we can hack-fix this by overriding recentChangesLine and re-adding the "examine" link for Flow rows.
  • The next issue is that the "diff" link doesn't work, it's just plain text. I still don't really know how Flow works internally, but it seems like diff links with Flow are rare. Just look at Special:RecentChanges on mw.org filtered by "Topic" namespace, or the history of a random Flow talk, and you'll see that no diff links appear there either. So apparently this is a Flow feature. Nothing we can do on the AF side.
  • And then, the final issue, is that examining the entry itself doesn't work (variables are empty). This is also kind of expected, because again, Flow has all its magic to save page revisions, and the current code (which works for "usual" revision) doesn't work for Flow. This would probably require special-casing RCVariableGenerator, running a new hook and having Flow handle it. The main point here is grabbing the right revision from the Database (flow edits don't have rc_cur_id, rc_this_oldid, nor rc_last_oldid). At that point, everything else should work almost out-of-the-box, given that Flow already has the necessary code to convert a Board content into text for AbuseFilter.

All in all, I'm more and more surprised by how complicated Flow is.

Change 660635 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Add a hook to allow computing variables from different types of RC rows

https://gerrit.wikimedia.org/r/660635

Change 445323 abandoned by Daimona Eaytoy:
[mediawiki/extensions/AbuseFilter@master] [WIP] Make Flow edits testable

Reason:
Rewriting from scratch

https://gerrit.wikimedia.org/r/445323

Change 660636 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] [WIP] Allow testing Flow edits

https://gerrit.wikimedia.org/r/660636

Change 660646 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/Flow@master] Generate AbuseFilter variables for edits in recent_changes

https://gerrit.wikimedia.org/r/660646

So, that works. However, there are a few things to keep in mind:

  • The code on the AF side is not 100% clean, in particular it's unclear whether flow edits should appear together with normal edits, and if not, how the behaviour should be controlled (without hardcoding knowledge about Flow in AbuseFilter). T273387 is also to blame here.
  • I don't know if the code on the Flow side can be called clean, given that I don't know the internals of the extension
  • Diff links don't work, as per above, but this seems to be a feature of Flow
  • The computed values of some variables might be confusing (at least for those who, like me, know little about Flow). For instance, old_wikitext is always empty for new replies (which I can understand, given that every reply is somewhat standalone); new_pst and dependent vars are empty, which happens because AbuseFilter still doesn't support parsing non-wikitext content models. (Some progress with T264104 would help here); (old|new)_content_model variables are unset. However, TTBOMK (based on some live testing), this is all consistent with how variables are generated while the flow edit is ongoing (i.e. the values that you see in /test are the same that would be computed during the edit).

All in all, I think this is acceptable: the AF hacks were highlighted in the code and have an associated task, and improving variables can also happen in a separate task.

Change 660635 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Add a hook to allow computing variables from different types of RC rows

https://gerrit.wikimedia.org/r/660635

Change 660646 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Generate AbuseFilter variables for edits in recent_changes

https://gerrit.wikimedia.org/r/660646

Change 660636 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Allow testing Flow edits

https://gerrit.wikimedia.org/r/660636