@PlavorSeol Now we surely know that the problem is directly inside populateContentTables. The next thing to try is what Anomie suggested, then we'll probably get some sort of answer.
I'd make the change this way:
- Keep $wgAbuseFilterRestrictions as it is, and use it for point 1
- For point 2, add a new variable like $wgAbuseFilterDangerousActions
- Add $wgAbuseFilterBlockingActions for point 3 and 4 (they're used for the same reason), and rename $wgAbuseFilterDisallowGlobalLocalBlocks to $wgAbuseFilterLocallyDisableGlobalBlockingActions
Then make the change in WMF config and copy these lines for new variables.
@PlavorSeol Could you please add something like
I have a hunch: I see that @PlavorSeol runs MW on Windows. Maybe it's a problem with windows and Shell? I'm sure that Shell has problems with windows (see T199989 and T183759, although in this case the script seems to complete), maybe this is related? All I can suggest is, as I wrote in T204475#4600808, to add some echos and see where the execution is halted.
Hooray, I finally found the reason! There's an error in short circuit evaluation. This is how the parser behaves
- Sets variables correctly
- Evaluates a != false and this is correctly parsed as false
- Then we have the & and $result is already false, so we enter in the short-circuit case
- It tries to evaluate the next token, which is b, and levels cascade until doLevelAtom
- Now doLevelAtom should get the value of the given variable (since b is AFPToken::TID).
Thu, Sep 20
Well, this already happened in REL1_27... I didn't go beyond because I'd need to install an older version of MediaWiki, which I don't have time to do now. We can surely say that this is not a regression introduced recently (although I'm not sure about the column in Wikimedia-production-error).
Also, the pattern seems to be correctly tokenized, so the problem is strictly inside AbuseFilterParser. To enrich the set of non-working examples: (a != false & b) != false produces the same result, i.e. syntax OK but then the same error on evaluation. Instead, it can't be (a != false & b) != false because this doesn't pass syntax check.
Something that could help debugging is that syntax check is passing. The main (and, I think, only) difference between evaluateExpression and checkSyntax is that the latter has $this->mAllowShort = false, so that short circuits aren't allowed. This means that the problem only happen when we try to short-circuit the evaluation.
This has nothing to do with AbuseFilter, I'm afraid. MCR quits update.php without reporting errors, so the needed tables aren't created and of course AbuseFilter doesn't work. Given that we don't have any output from the script, the only thing I can suggest to try is to insert some echos all through populateContentTables to approximately see in which point it stops.
This seems to be due to a flaky AbuseFilter on commons. However, I don't know what this job is supposed to do and why such filter is triggered.
@matmarex Yes, you may be correct. In fact, (a != false) & (b != false) evaluates correctly as well, and it would also explain why this only happens with both conditions. Needless to say, the problem is still serious, as it may make filters fail. @Krinkle May you please check in logstash how many of these errors we have? If I'm correct, there should be a spike in such errors on itwiki for yesterday.
Wed, Sep 19
It's hard to assess the risks of this error, but it may cause filters to fail silently. This actually happened on itwiki (logstash should bear the evidence).
Tue, Sep 18
I experienced this problem again, more than once. It's really annoying, and the query probably needs some tweaking.
Mon, Sep 17
This shouldn't happen. MCR isn't even the last row for core updates. The only thing I can say for sure is, this isn't due to AbuseFilter, but more likely to MCR or something else in the update script, as long as you're sure that your wiki is properly configured.
@PlavorSeol This is weird, because the abuse_filter_action table has always been included in the main schema, i.e. it's not been added with a db_patch. SQLite schema has it as well, so it should be installed during update.php. The only thing I can think of at the moment is: the update.php output ends with MCR stuff, and not with a success message. Are you sure that the script has been fully executed?
Sun, Sep 16
The cleanup is low priority, since it should be addressed together with other stuff, and with care.
Sat, Sep 15
@MGChecker Thanks :-)
My proposal is to add this feature to Special:AbuseFilter/test. At the top (or bottom) of the form, we may add something like
Fri, Sep 14
Old entries are untestable. While for big wikis this isn't a problem, for small ones it definitely is. For instance, we're trying to test a new filter on itwikinews, but since title variables are null for old entries we're more or less paralyzed. While there's no hurry for point 3, point 2 is vital.
Agreed, definitely. This variable should only be used for point 1. Using it for point 2 should be fine, but I agree that we should split them. The one in point 3 makes few sense, and really needs to be split. Point 4 instead is completely wrong. Actually, we should completely rework the way AbuseFilter takes actions: instead of doing it on the fly, it should first iterate through all actions to perform, then choose the ones to apply following a given priority, and then actually do them. This was alrady done for blocks in order to choose the longest duration, and would make T24623 a lot easier. It would also make really clear the action priority, and of course it would automatically avoid double warnings.
Hmm the query seems normal. Might this be caused by a temporary DB overload? After all, I managed to get the results of the same query in a couple of seconds some time later.
Thu, Sep 13
Whatever we decide to do, we should handle this together with T204236, which needs a maintenance script for the afl_var_dump field as well. As I wrote in T204236#4581410, we should also change the way we store the dump to be simpler, and JSON may help with it.
The patch above is for point 2. Point 3, however, is quite hard, for several reasons. The main one is that storing the dump may happen (or, I should say, happened) in lots of ways. See loadVarDump to see what I mean: there are rows where it's stored directly in afl_var_dump, where in turn it may either be an array or (I think) an instance of AbuseFilterVariableHolder. Plus, it could have up to three flags (external, gzip and nativeDataArray) to determine how it's encoded. Also, it could be stored inside ExternalStorage or not. All these things mean that we'd need to handle lots of different cases, and this could require too much effort since with the above patch everything will work again. My proposal is to first find a standard way to store the dump. Having it directly inside the abuse_filter_log doesn't seem bad to me: direct access and no need to write in the text table. However, I guess that there's a reason if we do it. At any rate, the method should be simplified. Then, we should write the script to change variable names, and seize the opportunity to fix T187153 as well.
If we decide to change old dumps as part of T204236, this would be automatically fixed.
I don't see it that useful, i.e. it's only worth it if we can reimplement it with minimal effort. As for the reason it isn't working, looking at queries I can see that it is indeed ordered by afh_timestamp. I suspect a bug inside TablePager, or maybe we're just doing something weird with it.
@tstarling Yes, I guess this makes sense. Making the log entry anonymous seems pretty straightforward, but what about the row to insert in abuse_filter_log? We need an ID and a name for it, so we still need to know if the user is safe to load. Or, do we?
Wed, Sep 12
Also, non-ASCII characters may be a problem. I still have to properly think about this, but since we're using plain links I don't see any clean solution.
Note about this feature: it'd be cool to make this function accept arrays or be variadic (like contains_any/all). For instance, add two functions:
Reading the stacktrace, the last call in AbuseFilter is to ManualLogEntry::getRecentChange. The log entry in question is constructed here and includes a performer. Such performer is in turn the User object generated here, which can be either:
- An user object used to call AbuseFilter::filterAction
- $wgUser if no user is specified
- A dummy anonymous user if the user isn't safe to load
Tue, Sep 11
@Marostegui Yep, many thanks :-) I'll go ahead with that as soon as we'll be ready.
@Marostegui Perfect. I guess we should seize the opportunity and rename afl_filter to afl_filter_id to make sure that it's INT. At any rate, I think this should be momentarily delayed, since we're already dealing with several big changes to AF (actor migration, major overhaul for throttling and major overhaul for profiling). IMHO they should have the priority over this change, which (although necessary for performance etc.) is harmless on MySQL/MariaDB.
@SpookyGhost8 This is not easy as it may seem, and a dummy edit would probably be the best solution for now :-) The problem itself is not how to display the button, but what to do later: we'd need to add a submit button with a different formaction (i.e. unthrottle the filter), check for it in the PHP code and write a specific function to only change af_throttled for the filter to be 0; this isn't that hard. What makes this non trivial is that such unthrottling should be somehow shown in filter history, but we don't have a frame for it. More specifically, in filter history, the af_throttled field is not taken into account at all, so we'd have to add a check for it, show it in the diff, and reformat rows on history to show that the filter has been unthrottled. But there's no dedicated column to show it.
In practice, this could really be a handy feature, but it's a low priority one, given the needed amount of work involved, which would probably outweigh the benefit. After all, you would still have to open the filter and hit a button, and a history row would still appear. Adding some spacing to filter pattern or history before hitting that button is quick, and avoid an unnecessary (for the moment) refactor.
Mon, Sep 10
@Marostegui Unfortunately, neither do I :-) However, PG is just the cause here. What we want to do (and I'm asking DBAs about) is perform a schema change on abuse_filter_action. Simply, we'd like to add a column (afl_filter_global) and ALTER the afl_column to be integer instead of text, for every DBMS. As I was saying, the table isn't that small, so we need to make sure that the process will be smooth.
Right now we have a total of five cases where such a JOIN is done. Schema changes aren't something to take carelessly, but this column doesn't really make sense. The table is pretty large though (for instance, 22 millions rows on enwiki). I guess we should first add the new column, start writing it, and reading from both. Then clean the afl_filter column with a maintenance script and finally change it to be INT.
Adding DBA for thoughts about the process above.