Add a filter for References check in Special:Log/spamblacklist
Closed, Resolved (Public)

Description

NOTE: You can expect this change to arrive on wikis the week of 14 October 2024.

This task involves the work of creating a new log entry each time the Link and/or Reference Reliability Check is activated within an edit session.

Story

As an experienced volunteer focused on content quality, I want to view all instances of users being blocked from linking to or referencing a domain, so I can reduce false positives.

Requirements

Log entry copy

$date #editcheck-[link|reference-reliability] warned $user about adding a [link to|citation of] $URL on $page
| Context | Disallow list domain is present within | Log location | Log entry |
| --- | --- | --- | --- |
| Link inspector | MediaWiki:Spam-blacklist | Special:Log/spamblacklist | See Log entry copy above |
| Link inspector | meta:Spam_blacklist | Special:Log/spamblacklist | See Log entry copy above |
| Link inspector | MediaWiki:BlockedExternalDomains.json | Special:Log/abusefilterblockeddomainhit | See Log entry copy above |
| Citoid | MediaWiki:Spam-blacklist | Special:Log/spamblacklist | See Log entry copy above |
| Citoid | meta:Spam_blacklist | Special:Log/spamblacklist | See Log entry copy above |
| Citoid | MediaWiki:BlockedExternalDomains.json | Special:Log/abusefilterblockeddomainhit | See Log entry copy above |
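
The routing in the table above depends only on which disallow list matched, not on whether the hit came from the link inspector or Citoid. A minimal sketch of that lookup follows. It is illustrative Python, not MediaWiki code, and the names `LOG_PAGE_BY_LIST` and `log_page_for` are hypothetical.

```python
# Hypothetical lookup that mirrors the requirements table; not MediaWiki code.
LOG_PAGE_BY_LIST = {
    "MediaWiki:Spam-blacklist": "Special:Log/spamblacklist",
    "meta:Spam_blacklist": "Special:Log/spamblacklist",
    "MediaWiki:BlockedExternalDomains.json": "Special:Log/abusefilterblockeddomainhit",
}


def log_page_for(disallow_list: str) -> str:
    """Return the log page that should receive a hit on the given disallow list."""
    try:
        return LOG_PAGE_BY_LIST[disallow_list]
    except KeyError:
        # Lists outside the table are out of scope for this sketch.
        raise ValueError(f"No log location defined for {disallow_list!r}")


if __name__ == "__main__":
    for name in LOG_PAGE_BY_LIST:
        print(name, "->", log_page_for(name))
```

Both spam block lists share one log page, while blocked-domain hits go to a separate one, which is why the same attempt is only ever logged in one place.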

Open questions

  • 1. To what extent – if any – should the Special:Log/spamblacklist hits triggered within Edit Check be distinguished from those triggered by people tapping/clicking the Publish button in other editing interfaces?
  • 2. When should we log the fact that the Link or Reference Reliability Check activated a spam disallow list? E.g. the globally-defined meta:Spam_blacklist and the locally defined, en:MediaWiki:BlockedExternalDomains.json and MediaWiki:Spam-blacklist
  • 3. Hits to what disallow lists (see "2.") should cause an entry to be logged on Special:Log/spamblacklist?

Background

From this user feedback.
The idea would be along the lines of:

$date $user caused a spam block list hit on $page by attempting to add $URL #linkcheck

Using tags, users can filter down edits depending on how the rejection list was triggered.

Event Timeline

ppelberg added a subscriber: Quiddity.

Language for Tech/News

What do you think should be logged when someone attempts to link to and/or reference a blocked domain ''within'' an edit session? Please review [[phab:T368438|T368438]] and share what you think.

cc @Quiddity

Taking one step back: should we ask for feedback? We already tag these edits in Recent Changes or in Watchlist, and Special:Log already lists all tags. Adding these tags to Special:Log should just be the continuation of this work. #editcheck-link and #editcheck-reference-reliability should be added to RCs and Watch lists too.

If we continue with the current task scope: we are changing Tech News' writing style, to attract interested users and provide more context. Here is an alternative that fits the new guidelines:

Editors who are interested in spam block lists are requested to share feedback on which tags should be used when a user links to and/or references a blocked domain ''within'' an edit session.

We already tag these edits in Recent Changes or in Watchlist, and Special:Log already lists all tags. Adding these tags to Special:Log should just be the continuation of this work. #editcheck-link and #editcheck-reference-reliability should be added to RCs and Watch lists too.

@Trizek-WMF: can you please share what you're seeing that's leading you to think #editcheck-link and #editcheck-reference-reliability already exist?

Reason I ask: I do not recall us implementing the tags (or others like them) that you referenced [i][ii], nor do I see them included in the following places...

Special:RecentChanges: Screenshot 2024-09-03 at 11.06.41 PM.png (928×1 px, 250 KB)
Special:Log: Screenshot 2024-09-03 at 11.07.17 PM.png (338×1 px, 59 KB)

i. T350622: Introduce a new tag to identify edits the Reference Reliability check is shown within
ii. https://www.mediawiki.org/wiki/Edit_check/Tags

@Trizek-WMF: can you please share what you're seeing that's leading you to think #editcheck-link and #editcheck-reference-reliability already exist?

They don't exist. You created #editcheck-link and #editcheck-reference-reliability as examples, following the initial idea. But in the initial idea I posted, #linkcheck was a placeholder! We were both confused by our respective non-existing tags.

SCP-2000's idea is:

Add a log entry to record the action of attempting to link to a blocked domain. Currently, if the edit triggers the spam-blacklist, this would be recorded in the log (e.g. en:Special:Log/spamblacklist). Edit Check should also have the same feature for debugging.

In short: at Special:Log/spamblacklist, highlight logged events that were triggered by Edit Check. The idea is to ease patrolling.

The suggested "how" was to add a tag to the end of the logged event:

$date $user caused a spam block list hit on $page by attempting to add $URL #linkcheck

which should have been better phrased as:

$date $user caused a spam block list hit on $page by attempting to add $URL $tags

We already have tags: at Special:Log (and Special:Log/spamblacklist), it is possible to filter edits by tags. These tags are the same as on Recent Changes.

Capture d’écran_2024-09-09_15-09-08.png (711×900 px, 68 KB)

At the moment, none of these tags return edits at Special:Log/spamblacklist. It is probably due to the fact that we don't have a tag to filter edits where Reference reliability and Link check were triggered.

Looking at the code for SpamBlacklist, it seems that the current behaviour of not logging when an API call is checking a link was intentional at the time:

preventLog - Whether to prevent logging of hits. Set to true when the action is testing the links rather than attempting to save them (e.g. the API spamblacklist action)
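
To make the effect of that flag concrete, here is a minimal sketch of a check that can run in "test only" mode. It is illustrative Python rather than the SpamBlacklist extension's PHP. The function names, the `prevent_log` parameter spelling, the log record shape, and the domain-matching rule are all assumptions made for the example.

```python
from urllib.parse import urlparse


def find_hits(urls, blocked_domains):
    """Return the URLs whose host is a blocked domain or a subdomain of one."""
    hits = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if any(host == d or host.endswith("." + d) for d in blocked_domains):
            hits.append(url)
    return hits


def check_links(urls, blocked_domains, log, user, page, prevent_log=False):
    """Check URLs against the block list; record a log entry unless prevent_log is set.

    prevent_log=True models a caller that is only testing links (as an API
    check would), so a hit is reported back to the caller but never logged.
    """
    hits = find_hits(urls, blocked_domains)
    if hits and not prevent_log:
        log.append({"user": user, "page": page, "urls": hits})
    return hits


if __name__ == "__main__":
    log = []
    blocked = {"spam.example"}
    # Test-only call: the hit is detected, nothing is logged.
    check_links(["https://spam.example/a"], blocked, log, "Alice", "Sandbox", prevent_log=True)
    # Save attempt: the hit is detected and logged.
    check_links(["https://spam.example/a"], blocked, log, "Alice", "Sandbox")
    print(len(log))
```

Under this model, Edit Check's in-session lookups behave like the first call, which would explain why they leave no trace in the log today.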

...by attempting to add...

Note also this is a little vague, and depends on the UI. In the case of Citoid, the user has to click a button before we go and check the URL (but they won't have had a chance to press "insert" yet). In the link inspector the URL is live checked as the user types. Theoretically the user might be in the middle of typing an allowed URL, but that seems very unlikely.

Either way we should amend the wording to something that doesn't blame the user directly:

"#editcheck-[link|reference-reliability] warned $user about adding a [link to|citation of] $URL on $page"

Change #1075011 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/VisualEditor@master] [WIP] Log spam blacklist checks against the current page

https://gerrit.wikimedia.org/r/1075011

Change #1075159 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/Citoid@master] Pass the page name when calling editcheckreferenceurl

https://gerrit.wikimedia.org/r/1075159

Change #1075010 had a related patch set uploaded (by Esanders; author: Esanders):

[mediawiki/extensions/AbuseFilter@master] BlockedDomainFilter: Make title optional

https://gerrit.wikimedia.org/r/1075010

Next steps
@ppelberg to decide whether it's important that the log entry text that Edit Check activations trigger be custom.

Note: at present, the patches above create the same log message that is currently shown when someone tries to publish a blocked URL.

Yes, it's important that the log entry text that Edit Check activations trigger be custom.

Let's use the language @Esanders proposed in T368438#10161553:

$date #editcheck-[link|reference-reliability] warned $user about adding a [link to|citation of] $URL on $page

Thinking: per a point @DLynch shared offline, let's treat someone trying to link to a source while editing as different enough from someone attempting to publish an edit that contains a link to a source that wikis have deemed problematic.

NOTE: I've updated the task description to reflect this revised language.
ppelberg updated the task description.

Change #1075159 merged by jenkins-bot:

[mediawiki/extensions/Citoid@master] Pass the page name when calling editcheckreferenceurl

https://gerrit.wikimedia.org/r/1075159

Draft language for Tech/News cc @Quiddity + @Trizek-WMF:

When people attempt to generate a link to or citation for a blocked domain, [[mw:Edit check|Edit Check]] will now log these attempts in [[Special:Log/spamblacklist]] and [[Special:Log/abusefilterblockeddomainhit]].[1]

I'm wondering if we can expand that Tech News draft with a bit more detail, particularly for anyone unfamiliar with it all...

Perhaps with details about the difference between the 2 Log pages? The names are ambiguous! My notes from trying to decipher/remember it all:

[Edit: Or maybe there's a documentation page which explains all this, and we could link to?]

When people attempt to generate a link to or citation for a blocked domain, [[mw:Edit check|Edit Check]] will now log these attempts in [[Special:Log/spamblacklist]] and [[Special:Log/abusefilterblockeddomainhit]].[1]

Quibble: "or" might be more appropriate, since any given attempt will only be logged in one of those places.

ppelberg updated the task description.

I've re-ordered the sentence so it better matches the existing messages in the log:

Username was warned about a [blocked domain|spam block list] hit by EditCheck ([references|links]) on $3 while attempting to [reference|link to] $4
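
As a rough illustration of how the bracketed alternatives in that message resolve, the sketch below fills in the template for one combination. It is illustrative Python, not the actual MediaWiki message handling. The function name and the `list_kind` and `check` keys are hypothetical, and reading $3 as the page and $4 as the URL is an assumption.

```python
# Hypothetical rendering of the re-ordered log message; not MediaWiki i18n code.
LIST_TEXT = {
    "blocked_domain": "blocked domain",
    "spam_blocklist": "spam block list",
}

# Maps the check to (label shown in parentheses, verb phrase used at the end).
CHECK_TEXT = {
    "references": ("references", "reference"),
    "links": ("links", "link to"),
}


def render_log_message(username, list_kind, check, page, url):
    """Fill in the message template for one list kind and one check."""
    list_text = LIST_TEXT[list_kind]
    label, verb = CHECK_TEXT[check]
    return (
        f"{username} was warned about a {list_text} hit by EditCheck ({label}) "
        f"on {page} while attempting to {verb} {url}"
    )


if __name__ == "__main__":
    print(render_log_message("Example", "blocked_domain", "links", "Sandbox", "https://spam.example"))
```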