
Have ContentLengthFilter use wgMaxArticleSize and remove MAX_POST_LENGTH
Closed, Resolved · Public

Description

Reportedly, Tech News was not delivered to https://ru.wikipedia.org/wiki/Википедия:Форум/Новости/Flow .

https://ru.wikipedia.org/w/index.php?title=Special%3ALog&type=massmessage&year=2015&month=4
16:40, 13 April 2015 Delivery of "Tech News: 2015-16" to Википедия:Форум/Новости/Flow failed with an error code of ntspamfilter

However, another post was delivered by MediaWiki message delivery: https://ru.wikipedia.org/wiki/Тема:Sf771s7j26yzishk

On other ru.wiki pages, Tech News was delivered normally, e.g. https://ru.wikipedia.org/w/index.php?diff=69974884

Event Timeline

Mattflaschen-WMF updated the task description.
Mattflaschen-WMF raised the priority of this task to Needs Triage.
Restricted Application added a project: Collaboration-Team-Triage. · Apr 14 2015, 1:54 AM
Restricted Application added a subscriber: Aklapper.

It tripped one of the spam filters: https://ru.wikipedia.org/wiki/%D0%A1%D0%BB%D1%83%D0%B6%D0%B5%D0%B1%D0%BD%D0%B0%D1%8F:%D0%96%D1%83%D1%80%D0%BD%D0%B0%D0%BB%D1%8B/massmessage?uselang=en .

Thanks for finding that, @Sunpriat.

I'm not sure which one it hit, but MediaWiki message delivery should probably have an exemption.

Sunpriat updated the task description. · Apr 14 2015, 2:24 AM
Sunpriat set Security to None.

Removing the MassMessage tag; this is not an issue in that extension.

The next step is to figure out which spam filter it hit. We might want to consider a way for internal clients to be exempt from the spam filter. We might also need more detailed logging about exactly which spam filter was triggered.

EBernhardson triaged this task as High priority. · Apr 14 2015, 5:52 PM

It tripped a filter on en.wiki. I removed it from the list before checking why it had been added and finding the entry for this task (do re-add it if you want to keep on testing). Here's the log entry.

16:22, 4 May 2015 Delivery of "Tech News: 2015-19" to Wikipedia talk:Flow/Developer test page failed with an error code of ntspamfilter

I also noted it on the page at https://en.wikipedia.org/wiki/Topic:Sgomapx6fl8ordus

Change 208828 had a related patch set uploaded (by Mattflaschen):
Add logging so we know what filters are being hit

https://gerrit.wikimedia.org/r/208828

Mattflaschen-WMF renamed this task from Reported delivery failure for Flow page for Tech News to Reported MassMessage delivery failures for Flow pages due to SpamFilter. · May 4 2015, 10:11 PM
Mattflaschen-WMF renamed this task from Reported MassMessage delivery failures for Flow pages due to SpamFilter to MassMessage delivery failures for Flow pages due to SpamFilter.

Change 208828 merged by jenkins-bot:
Add logging so we know what filters are being hit

https://gerrit.wikimedia.org/r/208828

Change 208887 had a related patch set uploaded (by Mattflaschen):
Add logging so we know what filters are being hit

https://gerrit.wikimedia.org/r/208887

Change 208888 had a related patch set uploaded (by Mattflaschen):
Add logging so we know what filters are being hit

https://gerrit.wikimedia.org/r/208888

I'll put this up for deployment tomorrow, so we don't have to wait as long to get better logging in production.

Change 208887 merged by jenkins-bot:
Add logging so we know what filters are being hit

https://gerrit.wikimedia.org/r/208887

Change 208888 merged by jenkins-bot:
Add logging so we know what filters are being hit

https://gerrit.wikimedia.org/r/208888

@Legoktm got this deployed on all WMF wikis. Please let us know if it happens again, and this time we will have better logging data to troubleshoot.

It just happened again on en.wiki, caught by ntspamfilter again; the log message is identical apart from the date.

Sunpriat added a comment. Edited · May 11 2015, 5:20 PM

The message has not changed: 18:38, 11 May 2015 Delivery of "Tech News: 2015-20" to Википедия:Форум/Новости/Flow failed with an error code of ntspamfilter

https://en.wikipedia.org/w/index.php?title=Special%3ALog&type=massmessage&user=&page=&year=2015&month=5&tagfilter=
16:02, 11 May 2015 Delivery of "Tech News: 2015-20" to Wikipedia talk:Flow/Developer test page failed with an error code of ntspamfilter

Flow\SpamFilter\Controller::validate: Spam filter failed on 'Википедия:Форум/Новости/Flow'. Old revid: None. New revid: sh4762phrwkl343s. Filter: Flow\SpamFilter\ContentLengthFilter

This one is failing on the ContentLengthFilter.
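For anyone scanning the logs for further failures, here is a minimal sketch of pulling the filter name out of a log line like the one above. The regular expression and function name are hypothetical, and the pattern assumes the message layout seen in that single line; it is not part of Flow.

```python
import re

# Log line copied from the comment above.
LOG_LINE = (
    "Flow\\SpamFilter\\Controller::validate: Spam filter failed on "
    "'Википедия:Форум/Новости/Flow'. Old revid: None. "
    "New revid: sh4762phrwkl343s. Filter: Flow\\SpamFilter\\ContentLengthFilter"
)

# Hypothetical pattern; it only assumes the layout of the one line quoted above.
PATTERN = re.compile(
    r"Spam filter failed on '(?P<title>[^']*)'\. "
    r"Old revid: (?P<old>\S+)\. New revid: (?P<new>\S+)\. "
    r"Filter: (?P<filter>\S+)$"
)


def parse_spam_filter_log(line):
    """Return the title, revision ids and filter class as a dict, or None if no match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


parsed = parse_spam_filter_log(LOG_LINE)
# The last path segment of the class name is the filter that rejected the post.
filter_name = parsed["filter"].rsplit("\\", 1)[-1]
```

Grouping log lines by `filter_name` would show whether every failed delivery hits the same filter.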

@DannyH @matthiasmullie Do you recall why it was set to this? It seems we should expand it significantly. Currently, it's 25,600. I would recommend we multiply that by at least 10, maybe 20, if we want to keep a limit at all.
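To make the proposal concrete, here is a rough sketch of how a length check with the current limit behaves and what multiplying it by 10 or 20 would allow. All names are hypothetical and this is not Flow's actual (PHP) code; whether the real limit counts bytes or characters, and whether the boundary is inclusive, are assumptions.

```python
# Hypothetical names throughout; this is not Flow's actual (PHP) implementation.
OLD_LIMIT = 25_600  # figure quoted above; unit assumed to be characters


def within_limit(content: str, limit: int = OLD_LIMIT) -> bool:
    """Return True when the content length does not exceed the limit."""
    return len(content) <= limit


# Stand-in for a long newsletter; the real Tech News size is not given in the thread.
sample_post = "x" * 30_000

rejected_today = not within_limit(sample_post)               # over the 25,600 limit
accepted_at_10x = within_limit(sample_post, OLD_LIMIT * 10)  # limit of 256,000
accepted_at_20x = within_limit(sample_post, OLD_LIMIT * 20)  # limit of 512,000
```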

Note: when this is changed, the dead variable MAX_POST_LENGTH should be removed as well.

Looks like it was introduced in https://gerrit.wikimedia.org/r/#/c/159805/ and it seems to have been important enough to cherry-pick to production at that time.
I can't find the reason it was added, though.

That one's parent patch was the one where mention notifications are limited to 20, but I don't think it's immediately related to that.
My guess would be that we started thinking about storage space (since templates are expanded and all of it saved as a huge HTML blob), as I don't see any other technical reasons.
Maybe @EBernhardson knows more?

Mattflaschen-WMF renamed this task from MassMessage delivery failures for Flow pages due to SpamFilter to MassMessage delivery failures for Flow pages due to ContentLengthFilter. · May 14 2015, 12:34 AM

IIRC, at the time we were having trouble with posts that were made huge mostly to show that the interface didn't handle them well. A simple solution was to put in a limit larger than any reply we expected to see. It could likely just be removed.

We decided there's no longer a policy need to have a length limit.

ExternalStorage uses a longblob, which apparently has a limit of 4 GB. We might run into some other issue before that. So the question is whether to remove the limit, or just set it crazy-high.

I'll put up a patch that reuses $wgMaxArticleSize (2048 KB by default and for WMF), unless I hear otherwise.
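As a sanity check on the numbers, here is a small sketch of deriving a byte limit from a `$wgMaxArticleSize`-style setting. The names are hypothetical and the merged patch may differ; it assumes the setting is in units of 1024 bytes and that length is measured in UTF-8 bytes.

```python
# Hypothetical sketch, not the merged patch.
WG_MAX_ARTICLE_SIZE_KB = 2048    # value quoted above for $wgMaxArticleSize
LONGBLOB_MAX_BYTES = 2**32 - 1   # MySQL LONGBLOB ceiling, roughly 4 GB


def max_post_length_bytes(max_article_size_kb: int = WG_MAX_ARTICLE_SIZE_KB) -> int:
    """Convert a size in KB (assumed to mean 1024 bytes each) to bytes."""
    return max_article_size_kb * 1024


def within_limit(content: str, max_article_size_kb: int = WG_MAX_ARTICLE_SIZE_KB) -> bool:
    """Return True when the UTF-8 encoded content fits within the derived limit."""
    return len(content.encode("utf-8")) <= max_post_length_bytes(max_article_size_kb)


new_limit = max_post_length_bytes()  # 2048 * 1024 bytes
```

Under these assumptions the new limit is 2,097,152 bytes, far below the longblob ceiling. Note that measuring in bytes means Cyrillic text, at two bytes per letter in UTF-8, reaches the limit with about half as many characters as ASCII text.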

Mattflaschen-WMF renamed this task from MassMessage delivery failures for Flow pages due to ContentLengthFilter to Have ContentLengthFilter use wgMaxArticleSize and remove MAX_POST_LENGTH. · May 14 2015, 11:35 PM

Change 211332 had a related patch set uploaded (by Mattflaschen):
Expand maximum post length to be based on $wgMaxArticleSize

https://gerrit.wikimedia.org/r/211332

Change 211332 merged by jenkins-bot:
Expand maximum post length to be based on $wgMaxArticleSize

https://gerrit.wikimedia.org/r/211332

16:16, 25 May 2015 Delivery of "Tech News: 2015-22" to Википедия:Форум/Новости/Flow failed with an error code of ntspamfilter

@Mattflaschen Does Sunpriat's comment mean that this fix didn't work? Or was it not deployed yet?

It wasn't deployed yet.

The earliest branch this is in is wmf/1.26wmf7. That was deployed to all Wikipedias (including Russian) on 2015-05-27.

DannyH closed this task as Resolved. · Jun 2 2015, 4:25 PM

Okay, good. Thanks!

DannyH reopened this task as Open. · Jun 2 2015, 9:02 PM

@Sunpriat @Mattflaschen

Are we talking about the length of a post? I just copy-pasted the first 34 chapters of Pride and Prejudice in this thread:

https://www.mediawiki.org/wiki/Topic:Sihpa5dqbyxf086a

It posted fine, with 60,000+ words. So I assume that the problem Sunpriat is seeing isn't about length?

(reopening to make sure this problem has been solved)

Mattflaschen-WMF closed this task as Resolved. · Jun 2 2015, 9:40 PM

This doesn't seem to be the same issue as before. Filed separately as T101177: Tech News Delivery to Flow page fails without error.