Page MenuHomePhabricator

Deprecate and then reject empty operators
Closed, ResolvedPublic

Description

AbuseFilterParser supports the following empty operators, whereas the CachingParser does not:

  • (foo|) is equivalent to foo
  • (foo&) is equivalent to false
  • (foo&!) is equivalent to foo
  • (!) is equivalent to true
  • if () ...
  • if ( cond ) then () ...
  • if ( cond ) then ( ... ) else () end
  • var :=
  • ()

and also others, see AbuseFilterParserTest::testEmptyOperands.

In all cases, the missing argument of a boolean operator is treated as null.
With this fact, people could inadvertently break filters if they leave a leading operator, so we should really stop allowing it. We should first emit debug notices, then fix filters for WMF prod, and then drop support.

Plan as of 15 Sept

  • Formally deprecate empty operands with logging (with a clearer message) in MW 1.34

[] Change logEmptyOperand to be e.g. throwEmptyOperand and throw in MW 1.35 or 1.36. Alternatively, do nothing if the old parser gets dropped before that.

New plan is: leave the deprecation in place and drop the old parser directly (T239990).

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

There are various occurrencies of channel:AbuseFilter AND "Empty operand of type" on logstash. Probably, we'd need to add the ID of the filter being parsed, in order to know exactly where to look.

Change 529474 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Add the filter ID to empty operand logging

https://gerrit.wikimedia.org/r/529474

Change 529474 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Add the filter ID to empty operand logging

https://gerrit.wikimedia.org/r/529474

Daimona renamed this task from Deprecate and remove wild syntax to Deprecate and then reject empty operators.Aug 20 2019, 4:24 PM
Daimona updated the task description. (Show Details)

Change 531289 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Log more empty operands

https://gerrit.wikimedia.org/r/531289

Change 531289 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Log more empty operands

https://gerrit.wikimedia.org/r/531289

So, now we have full logging data, and here is a list of affected filters:

  • jawiki filter 4 - empty boolean operand
  • thwiki filter 25 - empty boolean operand
  • plwiki filter 23 - empty boolean operand
  • fawiki filter 119 - empty boolean operand
  • nlwiki filter 57 - empty boolean operand
  • enwikivoyage filter 24 - empty sum operand
  • hewikisource filter 2 - empty boolean operand
  • simplewiki filter 72 - empty boolean operand
  • dawiki filters 20, 24 and 33 - empty boolean operand
  • alswiki filter 11 - empty boolean operand
  • shwiki filter 11 - empty boolean operand
  • hewiki filter 81 - empty boolean operand
  • jawikisource filters 1, 2 and 4 - empty boolean operand
  • jawikibooks filter 2 - empty boolean operand
  • jawikiquote filter 1 - empty boolean operand
  • skwiki filter 10 - empty boolean operand
  • lmowiki filter 1 - empty boolean operand
  • rmwiki filter 1 - empty boolean operand
  • eswiki filter 51 - empty boolean operand

And given that the RFC is still far from conclusion, we should announce these in Tech News, together with the list that I'm going to add to T153251.

@Daimona Please could you suggest an appropriate way to phrase this in Tech News? Ideally a 1 (or 2) sentence description that is easily translatable. Thanks!

@Quiddity sure... Something like

Soon, the AbuseFilter will recognize new syntax errors. Specifically, all new errors are about empty operands, see a list of examples on [[phab:T156096|phabricator]]. Any active filter with such an error will stop working; hence, please take a look at the [[phab:T156096#5464310|list]] of affected filters, and fix the ones that you can fix.

And, at the risk of sounding repetitive, I'd re-add:

Note that there's an ongoing [[:m:Requests_for_comment/Creating_abusefilter-manager_global_group|RFC]] on meta about the creation of a new abusefilter-manager global group. If it passes, this kind of AbuseFilter maintenance will be performed by members of this group.

I'm sorry if that's not short, or easily translatable, but I don't know how to simplify it. The message is addressed to AF editors, which are usually tech-savvy people, so I'm unsure about how much we can simplify.

Change 535169 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Throw AFPUserVisibleExceptions for empty operands in CachingParser

https://gerrit.wikimedia.org/r/535169

Change 535169 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Throw AFPUserVisibleExceptions for empty operands in CachingParser

https://gerrit.wikimedia.org/r/535169

So, now this is again the only blocker for the new parser (together with T153251, but they're actually the same thing). The tech news issue asking people to fix filters was published 6 days ago. While it's not that much time, I went to logstash to check how we're doing. And... We're not doing good. Here is the trend over the last 7 days [1]; we can say it has remained constant. If it keeps staying constant, I wonder what we could do to get things done quicker. The outcome of the meta RFC is distant and uncertain, hence it's not something we'll be able to use for this task anyway. I wonder, is the sysadmin group usable for this?

[1] - The empty operand's type uses the "type" key, which is overwritten by "mediawiki" and hence is not usable :-/

Linked list:

@Daimona @Krinkle on one hand, we have https://meta.wikimedia.org/wiki/Requests_for_comment/Creating_abusefilter-manager_global_group in which there is ongoing debate about whether a central user should be allowed to modify filters on different wikis; and as you know, I got Global Sysop rights explicitly to do that but it is limited to wikis that have not opted out.

On the other hand, I see in the post above (from less than 20 minutes ago) that a filter on fawiki needs a change and before I (a local admin) can get to it, I see Krinkle has fixed it. This is contradictory. If we already have a user (Krinkle) who is allowed to do this for wikis like fawiki (which has a large, active admin group and is in the opt-out list for Global Sysop), then why the RfC? If we think we need permission from the community and hence the RfC, then why is Krinkle allowed to do those edits?

PS: Krinkle, which permission are you using to make these edits? Global edit interface is not it, and I don't think you are a global sysop.

Mentioned in SAL (#wikimedia-operations) [2019-09-15T16:51:23Z] <Krinkle> Fixed a dozen abuse filtes, listed at https://phabricator.wikimedia.org/T156096#5494060. The trailing pipe character was removed from filters that had it which is no longer supported in a future version of AbuseFilter.

@Daimona @Krinkle on one hand, we have RFC in which there is ongoing debate about whether a central user should be allowed to modify filters on different wikis; and as you know, I got Global Sysop rights explicitly to do that but it is limited to wikis that have not opted out.

On the other hand, I see in the post above (from less than 20 minutes ago) that a filter on fawiki needs a change and before I (a local admin) can get to it, I see Krinkle has fixed it. This is contradictory. [..]

I upgraded the filter syntax as system administrator, and asked Daimona to help me by providing a link to each filter. Phabricator was a convenient place to share them with me, sorry about the confusion.

These are solely syntactical changes needed before a software update to keeping the existing filters functional. If this specific change were needed regularly, we could have developed a maintenance script instead. The change consist of removing a single redundant character from the filter syntax, example at skwiki filter 10 diff 299. For anything less trivial than this, I would not have offered my help as sysadmin. Such changes would require a decision to be made on how the filter should behave. I cannot decide on that.

The number of filters that exist and their complexity is in the hands of the community. As such, I believe the community should also have the responsibility and capability to maintain these on their own, in a timely manner. If the community chooses not to accept this RFC, or is unable to agree on other ways to ensure timely upgrade of abuse filters, then the result will likely be that unmaintained filters get disabled after some time due to invalid syntax. I support Daimona's RFC because I do not wish that to be the case. For this once, given the simplicity of the change, and the small number of filters that required changes, I took two hours of my weekend to solve the problem.

My intent was not to criticize you, personally, @Krinkle. What I am trying to figure out is why in that RFC we never mentioned that there exists a group of sysadmins, yourself included, that already have the rights and permissions to modify filters for technical reasons only. I think the responses to that RFC would have differed if the participants knew that a group like this already exists. Right now, their point of reference is the Global Sysop group (which is significantly restricted by the opt-out process).

Would you be okay if I clarify this on the RFC? And would you have a preference on the wording used to describe the rights and permissions of the sysadmins, as it comes to trivial, necessary, technical changes to abuse filters?

If we already have a user (Krinkle) who is allowed to do this for wikis like fawiki (which has a large, active admin group and is in the opt-out list for Global Sysop), then why the RfC?

All sysadmins can do this, and they've always been able to. That's more or less what happened with the abusefilter-manager group a few years ago. The idea behind the RFC was to make this process entirely managed on-wiki, instead of recurring to backend "hacks" like the sysadmin group. The RFC has been there for more than one year, and based on the last comments, it's not even clear what the outcome could be. Until the RFC is closed, IMHO it's OK to keep doing as before, i.e., ask sysadmins.

As I said above, the tech news issue asking to fix those filters went out roughly a week ago. Tomorrow a new one will be published, hence the chances of people noticing the call to action will go down. As I said, in the timespan of a week, no filter was fixed. And since things could only get worse, I've asked about sysadmins.

The number of filters that exist and their complexity is in the hands of the community. As such, I believe the community should also have the responsibility and capability to maintain these on their own, in a timely manner. If the community chooses not to accept this RFC, or is unable to agree on other ways to ensure timely upgrade of abuse filters, then the result will likely be that unmaintained filters get disabled after some time due to invalid syntax.

Yeah, which is the reason why the RFC was created. As a filter maintainer in my home wiki, I would be upset if a filter stopped working just because of a backend change, hence I don't want it to happen. But as I said, if the communities don't agree on the RFC and/or deem it as useless, sysadmins are the only way to go. Unless the communities also agree to let old filters die whenever we change the backend, but that'd be crazy.

(Uh-oh, edit conflict)

What I am trying to figure out is why in that RFC we never mentioned that there exists a group of sysadmins, yourself included, that already have the rights and permissions to modify filters for technical reasons only.

Sysadmins can technically do everything. But IMHO that's a hacky solution, and not suitable for the long-term. It's also something to be used in exceptional cases. Having a group would delegate everything to the community control, and avoid such exceptional cases altogether.

Would you be okay if I clarify this on the RFC? And would you have a preference on the wording used to describe the rights and permissions of the sysadmins, as it comes to trivial, necessary, technical changes to abuse filters?

(Not directed to me, but:) Personally I'm OK. But I'd also explicate that currently, fixing filters is NOT in the scope of sysadmins. And that granting sysadmin rights goes beyond the way things are done on-wiki (I've learned the right is assigned via shell). *If* people think it could become standard practice to perform technical AF maintenance using the sysadmin group, that's OK. The important is to know what the trade-offs are.

Little update: production filters now seem to be OK. Next steps here are:

  • Formally deprecate empty operands with logging (either via wfDeprecated, or stating it clearly in the message)
  • Before switching to CachingParser, re-check logstash to ensure no new problems (hence leaving open as subtask)
  • Change logEmptyOperand to be e.g. throwEmptyOperand and throw [in a future MW version]

Change 537074 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Hard-deprecate empty operands

https://gerrit.wikimedia.org/r/537074

Daimona moved this task from Blocker to Deprecate/Remove on the MW-1.34-release board.

The deprecation should happen in 1.34.

Note that the following production filters are still broken - probably unnoticed because on small wikis:

Note that the following production filters are still broken - probably unnoticed because on small wikis:

And also plwiki filter 45.

Change 540647 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] Enable AbuseFilterCachingParser for (almost) all wikis

https://gerrit.wikimedia.org/r/540647

Change 540647 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable AbuseFilterCachingParser for (almost) all wikis

https://gerrit.wikimedia.org/r/540647

Note that the following production filters are still broken - probably unnoticed because on small wikis:

And also plwiki filter 45.

Adding svwiki 115 to the list. This filter is now erroring out since svwiki is already using the new parser.

Adding svwiki 115 to the list. This filter is now erroring out since svwiki is already using the new parser.

I have informed Svensson1, Larske and JohanahoJ about this at https://sv.wikipedia.org/wiki/Wikipediadiskussion:Redigeringsfilter#Filter_115

Adding svwiki 115 to the list. This filter is now erroring out since svwiki is already using the new parser.

I have informed Svensson1, Larske and JohanahoJ about this at https://sv.wikipedia.org/wiki/Wikipediadiskussion:Redigeringsfilter#Filter_115

Thanks! I've also notified the other wikis: plwiki, viwiki, and enwikivoyage.

@Daimona: Larske has now removed a dangling | at the end of https://sv.wikipedia.org/wiki/Special:Missbruksfilter/115

Can you confirm that the error is fixed?

@Daimona: Larske has now removed a dangling | at the end of https://sv.wikipedia.org/wiki/Special:Missbruksfilter/115

Can you confirm that the error is fixed?

Totally, thanks a lot!

FTR, enwikivoyage and viwiki done. Only plwiki is left!

Change 541028 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Align arg counting between the parsers

https://gerrit.wikimedia.org/r/541028

FTR, enwikivoyage and viwiki done. Only plwiki is left!

... And also ckbwiki filter 7, just discovered.

FTR, enwikivoyage and viwiki done. Only plwiki is left!

... And also ckbwiki filter 7, just discovered.

I'd like to ask sysops to fix it, but unfortunately gtranslate doesn't translate Sorani... Actually, trying to translate this page yields a pretty funny result.
Anyway, I've tried.

ckbwiki was fixed. So, again, only plwiki is left, and I've CC'ed the filter author. @Krinkle wanna deploy the change tomorrow?

plwiki fixed. So we can now enable the new parser on all wikis!

Change 541026 had a related patch set uploaded (by Krinkle; owner: Daimona Eaytoy):
[operations/mediawiki-config@master] Move the remaining wikis to AbuseFilterCachingParser

https://gerrit.wikimedia.org/r/541026

Change 541026 merged by jenkins-bot:
[operations/mediawiki-config@master] Move the remaining wikis to AbuseFilterCachingParser

https://gerrit.wikimedia.org/r/541026

Change 537074 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Hard-deprecate empty operands

https://gerrit.wikimedia.org/r/537074

Change 550528 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@REL1_34] Hard-deprecate empty operands

https://gerrit.wikimedia.org/r/550528

Daimona updated the task description. (Show Details)

I've updated the task description to reflect that we still need to decide what to do with the code in the old parser. We could change it to throw in 1.36 (?), or just do nothing if we're going to drop the old parser in 1.36 itself.

Change 550528 merged by Mobrovac:
[mediawiki/extensions/AbuseFilter@REL1_34] Hard-deprecate empty operands

https://gerrit.wikimedia.org/r/550528

Daimona updated the task description. (Show Details)

Change 541028 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Align arg counting between the parsers

https://gerrit.wikimedia.org/r/541028