
Spam solutions for Education-l mailing list
Closed, ResolvedPublic

Description

We are getting a very high number of spam posts to the Wikimedia Education mailing list: more than 2,000 posts per month, and the number is increasing. This volume of spam keeps us from moderating real messages sent to the list by interested non-members. We (the list admins) filter the email notifications arriving in our inboxes every day, but we are not able to review the posts at all. I understand that so far there is no magic wand that stops all spam from reaching the mailing list, or that separates spam from legitimate posts by non-members. I would really appreciate any solutions to this problem other than filtering email notifications, and would appreciate being kept posted on any updates.

Event Timeline

Selsharbaty-WMF raised the priority of this task from to Medium.
Selsharbaty-WMF updated the task description.
Selsharbaty-WMF added a project: acl*sre-team.
Selsharbaty-WMF added a subscriber: Selsharbaty-WMF.
Restricted Application added a subscriber: Aklapper. May 26 2015, 2:59 PM
Dzahn added a subscriber: Dzahn. May 27 2015, 1:40 AM

Hi,

all mail going to lists also goes through spamassassin and gets a score.

If that score is over a certain threshold it gets rejected. Example from exim log on the list server:

"rejected after DATA: This message scored 16.0 spam points."

Below the global limit we don't delete mails, because we want list admins to have a chance to filter them based on their own list's needs, which can be quite different.

As list admins you can go to Privacy options...-> [Spam filters] -> Spam Filter Regexp and filter based on the score there. You can also filter by other rules.

See:

https://wikitech.wikimedia.org/wiki/Lists.wikimedia.org#Fighting_spam_in_mailman

and

http://nbcs.rutgers.edu/mailman/spam.shtml

http://wiki.list.org/DOC/Can%20you%20show%20me%20some%20examples%20of%20%27header_filter_rules%27%3F

for more details.
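
As an illustration of filtering on the score, here is a minimal Python sketch. It assumes the usual SpamAssassin header layout `X-Spam-Status: Yes, score=16.0 required=...` with the score on the first line of the header; the threshold value and the function names are hypothetical, and this is not how Mailman itself evaluates its filter rules.

```python
import re

# Hypothetical per-list threshold, lower than the server-wide rejection limit.
LIST_THRESHOLD = 5.0

# Assumed layout: "X-Spam-Status: Yes, score=16.0 required=5.0 ..."
# Only the first physical line of the header is inspected.
SCORE_RE = re.compile(
    r"^X-Spam-Status:.*?\bscore=(-?\d+(?:\.\d+)?)",
    re.IGNORECASE | re.MULTILINE,
)


def spam_score(headers):
    """Return the score found in the headers, or None if there is none."""
    match = SCORE_RE.search(headers)
    return float(match.group(1)) if match else None


def should_hold(headers, threshold=LIST_THRESHOLD):
    """True when a score is present and reaches the list threshold."""
    score = spam_score(headers)
    return score is not None and score >= threshold
```

A message without the header is treated as "no score" rather than as spam, so it would still go through normal moderation.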

hope that helped,

Daniel

Selsharbaty-WMF added a comment. Edited Jun 14 2015, 3:06 AM

Hi Daniel,

Thanks for your detailed helpful reply.

I see two filter rules now in the list settings:

  1. ^Subject: .*\?{5,}.*$
  2. x-spam-status: yes
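
To see what these two rules catch, they can be tried against sample headers with a short Python sketch. The case-insensitive, per-line matching flags are an assumption made for illustration, and `matching_rules` is a hypothetical helper, not part of Mailman.

```python
import re

# The two rules currently configured on the list.
RULES = [
    r"^Subject: .*\?{5,}.*$",
    r"x-spam-status: yes",
]

# Assumption: rules are matched case-insensitively, line by line.
FLAGS = re.IGNORECASE | re.MULTILINE


def matching_rules(headers, rules=RULES):
    """Return the rules that match anywhere in the header block."""
    return [rule for rule in rules if re.search(rule, headers, FLAGS)]


if __name__ == "__main__":
    sample = "From: someone@example.org\nSubject: Buy now ??????\n"
    print(matching_rules(sample))
```

The first rule only fires on subjects with five or more consecutive question marks, so an ordinary subject ending in a single "?" is not affected.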

If I want to make the spam filters more strict, do I need to add more filter rules? and which ones could help the best with this case?

If I want to make the spam filters more strict, do I need to add more filter rules?

Either more, or better ones. :)

and which ones could help the best with this case?

Hard to say without spam examples, or data on the scores such messages receive. The links provided by Dzahn should provide some help though?

https://wikitech.wikimedia.org/wiki/Lists.wikimedia.org#Fighting_spam_in_mailman is the documentation that is on wikitech. Better spam management will come with mailman v3 which is on the radar for some point (not this year, next year may be a stretch in my opinion). Is there any reason to keep this open? (mostly condensing open tickets regarding spam.)

Restricted Application added a subscriber: Matanya. Jul 7 2015, 10:32 PM

Hi John,

I will close it now. I have applied a new filter that blocks messages with certain common spam words in the subject. It worked, and no spam messages at all have come to the list since then.

The list is here: http://blog.hubspot.com/blog/tabid/6307/bid/30684/The-Ultimate-List-of-Email-SPAM-Trigger-Words.aspx
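
A subject filter of this kind can be sketched in Python as a single regular expression built from a word list. The three words below are hypothetical placeholders, not the filter actually applied to the list, and `subject_rule` is an illustrative helper.

```python
import re

# Hypothetical placeholders; the real filter used words from the list above.
SPAM_WORDS = ["free money", "act now", "winner"]


def subject_rule(words):
    """Build one regex matching a Subject line that contains any listed word."""
    alternatives = "|".join(re.escape(word) for word in words)
    # \b keeps "winner" from matching inside a longer word.
    return r"^Subject:.*\b(?:%s)\b" % alternatives


if __name__ == "__main__":
    rule = subject_rule(SPAM_WORDS)
    headers = "From: someone@example.org\nSubject: You are a WINNER today\n"
    print(bool(re.search(rule, headers, re.IGNORECASE | re.MULTILINE)))
```

Matching on whole words reduces false positives, but a broad word list can still catch legitimate posts, so held messages are worth spot-checking.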

Thanks,

Selsharbaty-WMF closed this task as Resolved. Jul 9 2015, 8:24 AM
Selsharbaty-WMF set Security to None.
Dzahn reopened this task as Open. Sep 1 2015, 12:24 AM

reopened the ticket because of personal mail between Samir and John i was CCed on

Dzahn added a comment. Sep 1 2015, 2:59 PM

@JohnLewis please add the info from the mail you sent, so we can use it to refer to when other lists have similar questions

JohnLewis closed this task as Resolved. Nov 3 2015, 5:14 PM

https://wikitech.wikimedia.org/wiki/Mailman#Fighting_spam_in_mailman should be the aggregated source for this information. We should collectively work on improving it; when I cleaned up the docs, that information was as accurate as I could make it.

Pine added a subscriber: Pine. Jul 12 2017, 5:41 PM