Page MenuHomePhabricator

Expose content model to abuse filters
Closed, InvalidPublic

Description

I came across https://en.wikipedia.org/wiki/Special:AbuseFilter/960, which tries to log edits to others' user scripts. But, it checks these based on the page name. Thus, if a script is moved to a page that doesn't end in .js (https://en.wikipedia.org/w/index.php?title=Special:Log&logid=101101733), the filter can no longer catch it. Additionally, the filter's logic can lead to false positives (https://en.wikipedia.org/wiki/Special:AbuseLog/24565419) when a .js page isn't actually a script.

The solution is to add a new variable, content_model, that filters can use. This could also allow communities to more finely control the effects of (if actioned) T226926: Investigate separate user right for creating pages with custom content models

Details

Related Gerrit Patches:
mediawiki/extensions/AbuseFilter : masterAdd `content_model` variable

Event Timeline

DannyS712 created this task.Aug 8 2019, 4:48 AM
Restricted Application added a project: User-DannyS712. · View Herald TranscriptAug 8 2019, 4:48 AM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
DannyS712 claimed this task.Aug 8 2019, 4:56 AM
DannyS712 added a project: Patch-For-Review.

Change 528981 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/extensions/AbuseFilter@master] Add content_model variable

https://gerrit.wikimedia.org/r/528981

Also, there is currently already new_content_model and old_content_model, which I assume are for log entries that change content models, but when examining a change content model log entry (like https://en.wikipedia.org/wiki/Special:AbuseFilter/examine/1172331730), neither variable is shown

Daimona added a subscriber: Daimona.Aug 8 2019, 6:35 AM

I think there are some todos to actually set those variables for all edits, as having them for changeContentModel entries only wouldn't make sense in AF.
Next week I'll check and think about what we should opt for.

OK, so: new_ and old_content_model are for all edit actions, and the're correctly set within the hook handler. In fact, you can check that they're available when examinating an AbuseLog entry, random example. In the code there's the following comment: // TODO: set old_content and new_content vars, use them, but that's outdated and it just wasn't removed in rEABF1e550ddda9c6956bae412dc4c69b1cc7cf22b055 (which is where those variables were added).
Given that we already have these two variables, I find it unnecessary to add another one which is identical to new_content_model.
Instead, there are two separate bugs

  1. The variables aren't set for uploads, and there's a todo for that.
  2. The variables aren't set when examinating edits.

I believe that 1. is not urgent, and the todo comment is enough. As for 2., we need a way to retrieve the content model from a recentchanges row. This is definitely possible, but at a quick glance I'm seeing a long path from rc_this_oldid/rc_last_oldid, then maybe Revision::newFromId but that could fail for deleted stuff, and we can't use Revision::getContentModel because of MCR. Or maybe, from rc_this_oldid go to revision.rev_id, then we cannot use rev_content_model so we'd need to join with the slots table, then with content.content_id and then get content.content_model, which is too long to be true.
For the moment I can fix the comments.

Change 529605 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Remove outdated comment, add a new one

https://gerrit.wikimedia.org/r/529605

Daimona closed this task as Invalid.Aug 11 2019, 2:35 PM

Moved the actual bug description to T230295, instead of rewriting the whole task.

Change 528981 abandoned by DannyS712:
Add content_model variable

Reason:
Per phab

https://gerrit.wikimedia.org/r/528981