Spam filter not filtering majority of spam to Junk folder
Closed, ResolvedPublic

Description

This problem has been happening (again) for some time (years). Judging by the
email headers, some messages are still being assigned a spam score, but nothing
appears to happen beyond that. A lot of spam (even items scored as "possible
spam") is still being directed to the main queues rather than the junk folder.

OTRS admins can set filters, which could typically divert any message with an
X-Spam-Score above a chosen value to the junk folder, but X-Spam-Score does not
exist in our list of search options. We have X-Spam-Flag, X-Spam-Level and
X-Spam-Status, but none of these are present in the email headers.
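
For illustration, the kind of rule described above (divert anything whose X-Spam-Score exceeds a chosen value) can be sketched in Python. This is not how OTRS PostMaster filters are implemented; the threshold, the queue names and the function names are all assumptions made for the example.

```python
import re
from email import message_from_string

# Hypothetical cut-off, chosen only for this example.
JUNK_THRESHOLD = 4.0


def spam_score(raw_message):
    """Return the numeric part of X-Spam-Score, or None if the header is absent."""
    msg = message_from_string(raw_message)
    header = msg.get("X-Spam-Score")
    if header is None:
        return None
    # Header values in this report look like "1.4 (+)"; keep the leading number.
    match = re.match(r"\s*(-?\d+(?:\.\d+)?)", header)
    return float(match.group(1)) if match else None


def target_queue(raw_message, default_queue="Raw"):
    """Pick a queue name: 'Junk' at or above the threshold, else the default."""
    score = spam_score(raw_message)
    if score is not None and score >= JUNK_THRESHOLD:
        return "Junk"
    return default_queue
```

A message without the header is left in its default queue, which matches the intent of only diverting mail that has actually been scored.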


Version: wmf-deployment
Severity: normal
See Also:
http://bugs.otrs.org/show_bug.cgi?id=9042

Details

Reference
bz43665

Event Timeline

bzimport raised the priority of this task to Normal. Nov 22 2014, 1:22 AM
bzimport added projects: OTRS, acl*sre-team.
bzimport set Reference to bz43665.
Rjd0060 created this task. Jan 6 2013, 1:25 AM
agray added a comment. Jan 6 2013, 11:41 PM

Some example headers from one such ticket:


X-Spam-Score: 1.4 (+)
X-Spam-Report: Spam detection software, running on the system "mchenry.wikimedia.org", has identified this incoming email as possible spam. If you have any questions, see the administrator of that system for details.

 Content analysis details:   (1.4 points, 4.0 required)

  pts rule name              description
 ---- ---------------------- --------------------------------------------------
  0.0 HTML_MESSAGE           BODY: HTML included in message
  2.2 TVD_SPACE_RATIO        BODY: TVD_SPACE_RATIO
 -2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                             [score: 0.0000]
  1.8 MISSING_SUBJECT        Missing Subject: header


Based on past experience, a fix in OTRS itself looks pretty unlikely to happen. Would it be possible to get the WMF spam filter to add a duplicate header when it adds X-Spam-Score, perhaps the same value but in X-Spam-Level?
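
As a rough sketch of this suggestion only: the snippet below copies the X-Spam-Score value into a second header that OTRS already offers as a filter option. The real change would have to be made in the spam-filter or mail server configuration, which this thread does not show; the function name and the choice to leave an existing X-Spam-Level untouched are assumptions.

```python
from email import message_from_string


def mirror_spam_score(raw_message, target_header="X-Spam-Level"):
    """Return the message with X-Spam-Score duplicated under target_header."""
    msg = message_from_string(raw_message)
    score = msg.get("X-Spam-Score")
    # Only add the copy when there is a score and no existing target header.
    if score is not None and target_header not in msg:
        msg[target_header] = score
    return msg.as_string()
```

With a header such as `X-Spam-Score: 1.4 (+)`, the output carries the same value under `X-Spam-Level`, so an existing filter option could match on it.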

ori added a comment. Jan 12 2013, 10:08 PM

Created attachment 11620
Patch against our SVN repo of quilt patches to add X-Spam-Score header support

Patch against http://svn.wikimedia.org/svnroot/mediawiki/trunk/otrs. Adds a patch to our quilt patch series which adds "X-Spam-Score" to the list of e-mail headers that are scannable. Submitted upstream at http://bugs.otrs.org/show_bug.cgi?id=9042.

ori added a comment. Jan 12 2013, 10:14 PM

Assigning to Tim Starling for review, since the SVN log indicates that he wrote all previous OTRS patches. (Sorry if that's presumptuous.)

Any update here? Spam continues to arrive directly in the main queues in very heavy volume.

+CCing Jeff Green, maybe he could deploy that patch?

sumanah wrote:

Philippe: now that bug 22622 is moving forward (to upgrade our installation of OTRS), please let us know whether you would prefer to simply wait for the upgrade, or to deploy this particular patch immediately to cut down on spam. Thanks!

OK to deploy, unless Jeff Green has any hesitations.

Jeff: Any objections?
Jeff / Tim: If not, could you please deploy this?

I've asked for patch deployment in RT #4713.

The SVN OTRS repository is deprecated and locked, so I requested that it be moved into git, and manually deployed the patch in the meantime. I will check it in to git once that's available.

This patch was deployed by Jeff, so hopefully working in OTRS is a bit less noisy now for everybody.

Keeping this open for the SVN -> git codebase migration part; I cannot see the request listed on http://www.mediawiki.org/wiki/Git/New_repositories/Requests though.

(In reply to comment #13)

All done:
https://gerrit.wikimedia.org/r/gitweb?p=operations/software/otrs.git;a=log;h=refs/heads/master
Closing as FIXED. Thanks everybody!

The patch implemented (I believe it is RT #4713 according to the above) does not appear to have worked. We (the OTRS admins) still do not have an 'X-Spam-Score' option in the dropdown menu when creating PostMaster Filters.

I know there was other work done with regard to spam filtering on OTRS, so I am not sure whether this SpamAssassin scoring is superseded by that and this bug is moot. Re-opening.

Jeff: Any comments on comment 14?

The one additional thing I did re. spam filtering was to nuke the auto-whitelist database so spamassassin would start fresh. I checked logs and it looks to me as though spamassassin+otrs is generally working as expected. Beyond that I'll talk to Martin about how we can improve spam filtering with the upgrade.

We're still getting a lot of spam in the main queues, even though SpamAssassin is recognizing it as likely spam...one recent example, ticket 2013052810003205.

"X-Spam-Score: 2.6 (++)
X-Spam-Report: Spam detection software, running on the system "mchenry.wikimedia.org", has identified this incoming email as possible spam. If you have any questions, see the administrator of that system for details..."

This was delivered to a regular Wikiquote queue, rather than the junk queue.

A new version of OTRS was recently made available (see bug 22622) and SpamAssassin is still available for it ( https://gerrit.wikimedia.org/r/#/c/77391/ ) so an update on this bug report would be very welcome:

  • Is this still a problem?
  • Is the information provided in this bug report still correct?

(In reply to comment #19)

https://rt.wikimedia.org/Ticket/Display.html?id=5557#txn-125935 says:

  • junk queue -> mbox -> sa-learn

(err, left out: that's under "left to do")

(In reply to comment #18)

> A new version of OTRS was recently made available (see bug 22622) and
> SpamAssassin is still available for it
> ( https://gerrit.wikimedia.org/r/#/c/77391/ ) so an update on this bug
> report would be very welcome:
>
>   • Is this still a problem?
>   • Is the information provided in this bug report still correct?

This issue still appears to be present. The information I supplied above in this bug report is correct, as far as I know. To summarize:

The spam filtering software that 'mchenry.wikimedia.org' is using is flagging messages as potential spam. Despite this, the messages are still being delivered to the main queues, rather than routed to the 'Junk' queue. This is happening with messages with high "spam scores", as well as low ones.

Here is an example from a ticket received today, #2013081210006772, which despite receiving a high score was still delivered to the info-en Quality queue, the queue it was addressed to:

X-Spam-Score: 5.9 (+++++)
X-Spam-Report: Spam detection software, running on the system "mchenry.wikimedia.org", has identified this incoming email as possible spam. If you have any questions, see the administrator of that system for details.

 Content analysis details:   (5.9 points, 4.0 required)

  pts rule name              description
 ---- ---------------------- --------------------------------------------------
 -0.0 SPF_PASS               SPF: sender matches SPF record
  1.5 URIBL_WS_SURBL         Contains an URL listed in the WS SURBL blocklist
                             [URIs: pechkin-mail.ru]
  2.7 URI_UNSUBSCRIBE        URI: URI contains suspicious unsubscribe link
  0.0 HTML_MESSAGE           BODY: HTML included in message
  1.1 MPART_ALT_DIFF_COUNT   BODY: HTML and text parts are different
 -2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                             [score: 0.0000]
  1.4 MIME_QP_LONG_LINE      RAW: Quoted-printable line longer than 76 chars
  0.2 SARE_SUB_ENC_UTF8      Message uses character set often used in spam
  1.7 SARE_UNSUB13           SARE_UNSUB13
 -0.0 AWL                    AWL: From: address is in the auto white-list
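
To make the point about this example concrete, here is a small Python sketch that reads the "(N points, M required)" summary out of an X-Spam-Report value and checks whether the message met the required score. The function names are made up for the example, and the pattern assumes the summary is worded exactly as in the headers quoted in this report.

```python
import re

# Matches the summary as it appears above, e.g. "(5.9 points, 4.0 required)".
SUMMARY = re.compile(r"\((-?\d+(?:\.\d+)?) points, (-?\d+(?:\.\d+)?) required\)")


def parse_points(report):
    """Return (points, required) as floats, or None if no summary is found."""
    match = SUMMARY.search(report)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def meets_required(report):
    """True when the reported points reach the required score."""
    parsed = parse_points(report)
    return parsed is not None and parsed[0] >= parsed[1]
```

Applied to the report above (5.9 points against 4.0 required), `meets_required` is true, yet the ticket still landed in a regular queue; for the earlier 1.4-point example it is false.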

I think I've finally fixed this.

Steinsplitter moved this task from Incoming to Resolved on the OTRS board. Mar 19 2015, 10:28 AM
grin added a subscriber: grin. Apr 12 2016, 7:38 PM
Restricted Application added subscribers: Matthewrbowker, Steinsplitter. Apr 12 2016, 7:38 PM