Page MenuHomePhabricator

Added_lines and removed_lines are computed wrongly for MassMessage pages
Open, LowPublic

Description

Enwiki private filter 789 has had false positives on Wikipedia:Administrators' newsletter/Subscribe. An example filter log can be seen here.

Without discussing details of the private filter, the cause of this is that a condition searches the added_lines variable. Although the diff looks normal when viewed in the usual way (here), when you look at the filter log details, removed_lines contains all the old version with spacing, and added_lines contains all the new version without spacing. This applies to the delivery list and the header.

This effectively makes added_lines equivalent to new_wikitext.

Event Timeline

Huji renamed this task from AbuseFilter false positives on page with 'MassMessageListContent' content model due to inconsistent added_lines and removed_lines to AbuseFilter variables added_lines and removed_lines are incosistent with the diff view.Feb 1 2017, 6:08 PM

@BethNaught This problem is strictly related to the page type. Actually, what kind of page is that? I'd like to make some tests, but I can't test filters on en.wiki (not enough rights), nor I know how to set up a page like that on my local wiki.

ADD: Oh, I see it's a delivery list for MM. I'll install it locally tomorrow in order to reproduce and possibly fix the problem. Probably we'll need to implement MM's own diff system, and see if we should also do the same for other extensions which use their own diffs.

Daimona triaged this task as Low priority.EditedMar 25 2018, 11:54 AM

My hunch was right. Since the page is generated by MassMessage in a peculiar way and uses its own diff system, it's quite obvious that using the vanilla one returns weird stuff. What we would need to do is determine if the page we're diffing is a MM one and, if so, implement and correctly use their own classes to generate the diff. Unfortunately I can't do that since I never dealt with MassMessage. However, given that really few pages use MM, this isn't terribly urgent and also not related to other problems affecting added_lines and arrays.

ADD: Of course, an alternative could be to add a wgAbuseFilterValidGroups for MassMessage, like we do for Flow, which would be cleaner.

Daimona renamed this task from AbuseFilter variables added_lines and removed_lines are incosistent with the diff view to Added_lines and removed_lines are computed wrongly for MassMessage pages.Mar 25 2018, 11:56 AM

MassMessage implements it's own diff engine that can be used. It's called MassMessageListDiffEngine.

MassMessage implements it's own diff engine that can be used. It's called MassMessageListDiffEngine.

Yeah, but it generates rich HTML for display. We only need plain text. Maybe the AbuseFilter-contentToString hook needs to be used.