Rename SpamBlacklist
Open, Needs Triage, Public

Description

Per the parent task at T254646, we should rename the SpamBlacklist extension.

Also, when renaming it, it would be a really good time to namespace the extension!

Event Timeline

Change 602827 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/SpamBlacklist@master] Start using blocklist where applicable

https://gerrit.wikimedia.org/r/602827

Should we not get rid of the word ‘spam’? It is a blocklist for external links, which are often, but certainly not only, spam

+1, since our current SBL has things like URL shorteners. I was about to suggest URL-(whatever alternative), but it seems that T190521: Change name of spam-blacklist/spam-whitelist to link-blacklist/link-whitelist has a (maybe) better one with link-(whatever alternative).

So it seems that we should consider renaming, and that this becomes a matter of picking a set of terms for the next many years. I am not convinced that "block" is the right term: it already has another meaning, so won't it just be confusing?

Seems that for spam we have the options

  1. url
  2. link

For white we have

  1. allow
  2. ???

For black we have

  1. block
  2. deny

One would hope that a universal approach across the multiple discussions could be considered.

See also discussions in T254646: Reconsidering how we name things for numerous options. I already made this point.

I think as per the original purpose of T190521: Change name of spam-blacklist/spam-whitelist to link-blacklist/link-whitelist (obviously we're not going to keep "blacklist"), we should re-evaluate what the extensions actually do and are used for, and maybe use that to help influence the names we choose going forward.

Reedy renamed this task from "Rename SpamBlacklist to SpamBlocklist" to "Rename SpamBlacklist". Jun 7 2020, 11:23 PM
Reedy updated the task description.

Dragging my comment over from the parent:

The issue identified with "blocklist" seems to be that it can trivially be confused with the block log and other user-block-related concepts specific to MediaWiki. There is an unfortunate partial semantic overlap (in English).

Also want to give a bit of signal boost to T173080: Replace words "Blacklist" by "Denylist" and "Whitelist" by "Allowlist" since that was the same task with a lot of chatter.

I like the name in the title of T16719: Rename Spam blacklist to "Disallowed websites": "DisallowedWebsites" or maybe "DisallowedLinks".

Definitely +1 to getting rid of "spam" while we're renaming this.

Change 602827 abandoned by Reedy:
Start using blocklist where applicable

https://gerrit.wikimedia.org/r/602827

DisallowedLinks (and DisallowedTitles for TitleBlacklist) sounds good to me. I agree 'blocklist' could be confusing with e.g. Special:BlockList.

'spam' should not be replaced by 'url', because only linked urls are harmed by the lists. 'website' would not be correct, because the list entries do not necessarily block whole websites, but can also block single webpages. so i think 'link' or even 'external link' should be part of the name.

i don't have strong arguments for/against the mentioned alternatives for white (allowed, safe, unblocked, ...) and black (denied, blocked, disallowed, forbidden, ...). but i guess that a pair such as allowed/disallowed or blocked/unblocked is easy to remember and easy to guess (if you just know one of the lists). maybe "unblocked" is better than "allowed", because actually all external links are allowed by default (unless they are blocked).

so AllowedExternalLinks and DisallowedExternalLinks or BlockedExternalLinks and UnblockedExternalLinks (or without "external") are long names, but would be reasonable (and better than the current names) IMHO .

Would it be an option to have all wiki pages be custom set by the wiki:
$whereIsSpamBlacklist=Wikipedia:here?

From the source code README, it appears that this may already be possible.

$wgBlacklistSettings is an array whose top-level keys are either spam or email,
and whose values list the sources to load: each source is a URL, a filename, or a database location.
Specifying a database location allows you to draw the
blacklist from a page on your wiki. The format of the database location
specifier is "DB: <db name> <title>".
Example:

wfLoadExtension( 'SpamBlacklist' );
$wgBlacklistSettings = [
    'spam' => [
        'files' => [
            "$IP/extensions/SpamBlacklist/wikimedia_blacklist", // Wikimedia's list
            "DB: wikidb My_spam_blacklist", // database (wikidb), title (My_spam_blacklist)
        ],
    ],
];

The local pages [[MediaWiki:Spam-blacklist]] and [[MediaWiki:Spam-whitelist]]
will always be used, whatever additional files are listed.
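
To connect this to the question above, here is a minimal, untested sketch of pointing the extension at a project-chosen page. It assumes the "DB:" specifier accepts a namespaced title; the database name "enwiki" and the page title "Wikipedia:Disallowed_links" are placeholders made up for illustration, not taken from any real configuration.

```php
// LocalSettings.php fragment (hypothetical values, untested sketch).
wfLoadExtension( 'SpamBlacklist' );
$wgBlacklistSettings = [
	'spam' => [
		'files' => [
			// Assumed format: "DB: <db name> <title>", with a namespaced title.
			// "enwiki" and "Wikipedia:Disallowed_links" are placeholders.
			'DB: enwiki Wikipedia:Disallowed_links',
		],
	],
];
```

Going by the README text quoted above, MediaWiki:Spam-blacklist and MediaWiki:Spam-whitelist would still be consulted as well, so this would add a page rather than relocate the default ones.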

Also covered in the extension documentation HERE.

Can someone test whether this works? Then we can start an RfC on en.wiki
to implement this.

Not sure if it solves the whole issue though; there are other pages that
still use ‘spam’ and ‘blacklist’ in the name.

LocalSettings.php is not a wiki page and you cannot access it with your web browser. Instead, it is a file in the file system of the server. Its contents are generated during the initial setup of the wiki and the resulting file must be copied on the server manually.

Some Wiki farms use a CommonSettings.php file to contain settings that are common to all Wikis managed by that farm. Since config files are arbitrary php files, you can split up your config files into as many separate config files as you want, and potentially reuse portions among multiple wikis. The Wikimedia Foundation uses a file called CommonSettings.php for settings that are common to all of its wikis.
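
As a rough illustration of that split, here is a sketch of a shared file plus a per-wiki addition. The two-file layout, the require_once path, and the extra "DB:" entry are assumptions for the example and do not describe Wikimedia's actual setup.

```php
// CommonSettings.php: shared by every wiki in a (hypothetical) farm.
wfLoadExtension( 'SpamBlacklist' );
$wgBlacklistSettings = [
	'spam' => [
		'files' => [
			'https://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1',
		],
	],
];

// LocalSettings.php: one wiki loads the shared file, then appends its own source.
require_once __DIR__ . '/CommonSettings.php';
// Placeholder database name and page title, as in the README example.
$wgBlacklistSettings['spam']['files'][] = 'DB: wikidb My_spam_blacklist';
```

Because the per-wiki file runs after the shared one, it can append to or overwrite anything the shared file set.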

Wikimedia's CommonSettings.php – this file needs to be changed by a MediaWiki developer. I think this file sets the configuration on meta, which then falls through to English and all the other language Wikipedias. It can then be overridden by the English Wikipedia's LocalSettings.php file (which also needs to be changed by a developer to change any local settings).

Search the file for "spam" –

if ( $wmgUseSpamBlacklist ) {
	wfLoadExtension( 'SpamBlacklist' );
	$wgBlacklistSettings = [
		'email' => [
			'files' => [
				'https://meta.wikimedia.org/w/index.php?title=Email_blacklist&action=raw&sb_ver=1'
			],
		],
		'spam' => [
			'files' => [
				'https://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw&sb_ver=1'
			],
		],
	];
	$wgLogSpamBlacklistHits = true;
}

Local blacklists and whitelists: hmm, maybe a magic word is needed to pull the location out of the common or local settings?

External links policy/Local policies and pages: looks like everyone uses the default name.

'spam' should not be replaced by 'url', because only linked urls are harmed by the lists. 'website' would not be correct, because the list entries do not necessarily block whole websites, but can also block single webpages. so i think 'link' or even 'external link' should be part of the name.

Makes sense.

so AllowedExternalLinks and DisallowedExternalLinks or BlockedExternalLinks and UnblockedExternalLinks (or without "external") are long names, but would be reasonable (and better than the current names) IMHO .

I think we want to avoid using "block" when it's not related to user blocks.

So either DisallowedLinks or DisallowedExternalLinks? I like the former because it's shorter. I think the latter will get abbreviated to "DEL" in common usage (not necessarily a bad thing?).

For on-wiki messages:

  • Spam-blacklist -> Disallowed-links or Disallowed-external-links
  • Spam-whitelist -> Allowed-links or Allowed-external-links

Maybe "Approved" instead of "Allowed" since all links are usually allowed, these are just specially approved from the disallow list?

I like

  • External link block list
  • External link unblock list

"External link" is an accurate description of what is blocked and unblocked.
"Block list" is ambiguous. Disambiguation: User block list.
"Unblock" connects it to the block list. You don't unblock something that isn't already blocked (on the block list, of course).
"Approved" and "allowed" do not clearly communicate that the item was previously disallowed/disapproved.
"DEL" means "delete". Nobody will take it to mean "disallowed link".

I think we want to avoid using "block" when it's not related to user blocks.

So either DisallowedLinks or DisallowedExternalLinks? I like the former because it's shorter. I think the latter will get abbreviated to "DEL" in common usage (not necessarily a bad thing?).

DisallowedExternalLinks/DEL seems good to me.

DisallowedLinks is shorter... But can the extension stop internal links? If not, it's probably better to have the slightly longer one...

My preference would be to dump the feature into AbuseFilter and scrap the extension.

However, if we are going to keep but rename the extension, it should follow our rough convention and use nouns and imperative verbs, so DisallowExternalLinks.

If you think merging the functionality into AbuseFilter is a viable idea, that might warrant a dedicated task.

Despite discussing this at least as far back as Amsterdam (2013), apparently there wasn't a task, indeed. Filed: T279275: Move all the functionality of {Spam,Title}Blacklist extensions into AbuseFilter and retire them.