Page MenuHomePhabricator

Move spam regex checks out of EditPage
Closed, ResolvedPublic

Description

Currently, EditPage::matchSpamRegex, EditPage::matchSummarySpamRegex, and EditPage::matchSpamRegexInternal are used to compare the content and summary of an edit to the global wgSpamRegex and wgSummarySpamRegex. The methods are static, and they access the globals directly.

The use of global variables is discouraged (c.f. https://www.mediawiki.org/wiki/Manual:Wg_variable) and these methods are used outside of the EditPage class (MergeHistory, MovePage, and SpecialChangeContentModel in core, and SpecialNewsletter in MediaWiki-extensions-Newsletter, all use matchSummarySpamRegex to check summaries). I propose a new SpamRegexChecker service, that will be part of the backend for EditPage but can also be used elsewhere were needed. It'll have two methods, one for checking against wgSpamRegex, and the other for wgSummarySpamRegex, with those configuration values injected rather than accessed via the global scope.

Event Timeline

Change 592443 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Add a new SpamRegexChecker service

https://gerrit.wikimedia.org/r/592443

The patch provided doesn't make any deprecations. Once it merges, will send:

  • A patch to update the Newsletter extension to stop using EditPage::matchSummarySpamRegex

Since the Newsletter extension requires 1.35+, the service will always be available

Once that merges:

  • Replace the core uses of EditPage::matchSummarySpamRegex in MergeHistory, MovePage, and SpecialChangeContentModel
  • Hard deprecate EditPage::matchSummarySpamRegex and EditPage::matchSpamRegex
  • Update release notes with new service and deprecation

Since both MovePage and MergeHistory would use the service, this won't conflict with T249446: Add a MergeHistoryFactory and convert MergeHistory to DI - if that is merged first, the SpamChecker service will be injected to both; if not, it'll only be injected to MovePage for now.

Change 592443 merged by jenkins-bot:
[mediawiki/core@master] Add a new SpamChecker service

https://gerrit.wikimedia.org/r/592443

Change 595659 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/extensions/Newsletter@master] Use the new SpamChecker service

https://gerrit.wikimedia.org/r/595659

Change 595659 merged by jenkins-bot:
[mediawiki/extensions/Newsletter@master] Use the new SpamChecker service

https://gerrit.wikimedia.org/r/595659

Change 596287 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Inject SpamChecker where needed, hard deprecate EditPage static methods

https://gerrit.wikimedia.org/r/596287

Some tests for the new service would be nice.

Some tests for the new service would be nice.

I should have time to write tests later today

Change 596691 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/core@master] Add Unit tests for SpamChecker service

https://gerrit.wikimedia.org/r/596691

Some tests for the new service would be nice.

Tests at https://gerrit.wikimedia.org/r/#/c/596691/ bring coverage to 100%

Change 596691 merged by jenkins-bot:
[mediawiki/core@master] Add Unit tests for SpamChecker service

https://gerrit.wikimedia.org/r/596691

Change 596287 merged by jenkins-bot:
[mediawiki/core@master] Inject SpamChecker where needed, hard deprecate EditPage static methods

https://gerrit.wikimedia.org/r/596287

Based on my read of this change and a request for help on IRC today, this narrowed the allowed type of the configuration global from string or array of strings to array of strings (the cast to array looks to have been removed), and probably should have been documented in release notes/announced accordingly. https://www.mediawiki.org/wiki/Manual:$wgSpamRegex 's example(s) is/are out of date as a result.