@Halfak Yeah, I recently came to the same conclusion: this happens both because deleted revisions aren't handled properly, and because for itwiki only deleted revisions are left. I guess showing a clearer message would be great for the moment, while the proper solution would of course be to overcome the limitations of JSONP.
Using a sample API request, I noticed that:
- https://it.wikipedia.org/w/api.php?action=compare&format=json&fromrev=68597719&torev=68597875&callback=xxx (identical to the actual one, except for the callback name) returns an error
- https://it.wikipedia.org/w/api.php?action=compare&format=json&fromrev=68597719&torev=68597875 (the same as above but without callback) works fine
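To compare the two responses above, it helps to unwrap the JSONP envelope first, since the payload is the same JSON either way. Below is a minimal Python sketch (for illustration only; the extension itself is PHP, and the helper name and the mocked error payload are assumptions, not the actual API output):

```python
import json
import re


def unwrap_jsonp(body: str, callback: str) -> dict:
    """Strip a JSONP wrapper like `xxx({...});` and parse the JSON inside.

    Hypothetical helper for inspecting the two responses above; the
    error shape in the example is mocked, not a real API response.
    """
    pattern = r'^\s*' + re.escape(callback) + r'\s*\(\s*(.*)\s*\)\s*;?\s*$'
    match = re.match(pattern, body, re.DOTALL)
    if match is None:
        raise ValueError("response is not JSONP-wrapped")
    return json.loads(match.group(1))


# Mocked error response, similar in shape to what the first URL returns:
payload = unwrap_jsonp('xxx({"error":{"code":"somecode"}});', 'xxx')
if 'error' in payload:
    print('JSONP request failed:', payload['error']['code'])
```

Checking for an `error` key after unwrapping makes the difference between the two requests easy to diff programmatically.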
On m.o I read:
Mon, Nov 19
In the best scenario this only makes logs a bit messy. However we cannot exclude that something is broken underneath.
Fri, Nov 16
So, what's the status here? Adding wgUser is trivial, do we want to do it?
Just a comment: "_text" variables are for page titles, and so are "_prefixedtext" variables. So, if you're interested in covering such variables, then you also have to include "_title" and "_prefixedtitle" per T173889.
Adding Platform team per maintainers, plus CC'ing people active in this project and Tim, who reviewed the patch. Since the related tasks are UBN and release blockers, the plan is to get this done before the 1.32 release. Also, a note for whoever will run the script: it's not included in production (it'll be in wmf.6), but given the circumstances, could you please deploy it and do the dry run? Deploying the script alone (and adding it to autoload) would be enough; no other code from that patch is required.
Thu, Nov 15
Will file a separate task once the patch is merged.
Setting to high to reflect the importance of the parent task and related ones (specifically T203336 which is 1.32 blocker).
@Tgr that is true for old_wikitext_slot and new_wikitext_slot (or whatever we decide to call them), but not for derived variables like added_lines_slot, edit_delta_slot, etc. I also think that for users it would be quite confusing to see lazy-loaded elements within the same array, so we should take this into account.
T209565 for the maintscript
Wed, Nov 14
I think per-slot filters aren't the right way. As for using suffixes versus arrays, the idea is the same, and both ways are good. The advantage of arrays could be a less cluttered variables list (and you get the idea that you're using a part of the whole). Another thing to take into account: each element of the variable should be lazy-loaded independently of the others, e.g. if the page has new_wikitext for slots A and B, only load the required ones, as if they were separate variables. Effectively making them separate variables would help with this.
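The per-element lazy loading mentioned above can be sketched like this (in Python for illustration, since the extension itself is PHP; the class name and loader signature are assumptions):

```python
class LazySlotVariable:
    """Sketch: one logical variable (e.g. new_wikitext_slot) whose
    per-slot elements are computed only when first accessed."""

    def __init__(self, loaders):
        # loaders: dict mapping slot name -> zero-argument callable
        self._loaders = loaders
        self._cache = {}

    def __getitem__(self, slot):
        if slot not in self._cache:
            # Compute on demand, once; untouched slots are never loaded.
            self._cache[slot] = self._loaders[slot]()
        return self._cache[slot]


loaded = []
new_wikitext_slot = LazySlotVariable({
    'main': lambda: loaded.append('main') or 'main slot text',
    'data': lambda: loaded.append('data') or '{"k": "v"}',
})

print(new_wikitext_slot['main'])  # only the main slot is computed
print(loaded)                     # the 'data' loader never ran
```

Whether each element hides behind `__getitem__` on one array object or behind genuinely separate variables, the effect is the same: a filter touching only slot A never pays for slot B.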
Random idea: add a new AFPData type, e.g. DNONE, to distinguish between actual null variables and non-initialized ones, and make the parser permissive when it encounters this data type. I wrote a stub patch for this some weeks ago, but stopped working on it as new problems came up... I'm also not that confident about making such a breaking change to the parser, although I think this idea is worth investigating.
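The DNONE idea boils down to a dedicated sentinel value. A minimal Python sketch of the concept (names are hypothetical; the real AFPData type would live in the PHP parser):

```python
class DNone:
    """Hypothetical sentinel ("DNONE"): a variable that was never set,
    as opposed to one explicitly set to null.  Implemented as a
    singleton so identity checks (`is DNONE`) are reliable."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return 'DNONE'


DNONE = DNone()


def get_var(env: dict, name: str):
    # A permissive parser would return DNONE (possibly logging a
    # warning) instead of raising when the variable was never set.
    return env.get(name, DNONE)


env = {'added_lines': None}                   # explicitly null
print(get_var(env, 'added_lines') is None)    # True: a real null
print(get_var(env, 'new_size') is DNONE)      # True: never initialized
```

The point is that downstream code can now tell "the variable legitimately holds null" apart from "nobody ever computed this variable", which is exactly the distinction the parser currently cannot make.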
@Tgr If done carefully, it won't be a breaking change. Arrays are currently handled as strings in several functions; that's because of the poor initial implementation. Just think of how common something like added_lines contains/like/rlike ... is: in these cases we just cast the array to a string, because added_lines was turned into an array when lots of filters already used that syntax, and people decided not to break everything. A step forward is to add array-specific functions, see https://gerrit.wikimedia.org/r/#/c/424298/. This way we add new ways to work with arrays (mostly treating them as such) without changing anything in the current syntax.
Adding associative arrays would be the same: certainly a big (and possibly useful) change, but it wouldn't interfere with existing filters.
The two points above (specific functions and associative arrays) would IMHO make arrays acceptable, and we could use such arrays in this patch after having addressed T204654.
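The backwards-compatible cast described above can be sketched as follows (a Python illustration of the idea, not the actual PHP code; the newline separator and function names are assumptions):

```python
def coerce_to_string(value) -> str:
    """B/C cast: when a string operator (contains/like/rlike) receives
    an array, join its elements into one string so old filters written
    against the string form keep working unchanged."""
    if isinstance(value, list):
        return '\n'.join(str(item) for item in value)
    return str(value)


def op_contains(haystack, needle) -> bool:
    # `added_lines contains "spam"` style check, casting arrays first.
    return coerce_to_string(needle) in coerce_to_string(haystack)


added_lines = ['first new line', 'some spam here']
print(op_contains(added_lines, 'spam'))  # True
```

Array-specific functions would then operate on the list directly, while the legacy string operators keep going through the cast; neither path breaks the other.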
@Jdforrester-WMF Huh? Weren't we planning to make the change so that nothing will need to be updated? It wouldn't be easy to update all those filters...
@Tgr Well, that's an important point. I think we should instead improve arrays (and yeah, they pretty suck right now) somehow, maybe introducing associative arrays and changing my previous idea to use them.
@Jdforrester-WMF That would be easy, but I'm not sure it's worth it. It'd be similar to T173889, and as for that task we would end up keeping old variables anyway. Unless we want to edit hundreds of filters.
I think a good approach could be to:
- Keep current variables as they are, i.e. include all slots (concatenated)
- Add new variables, for instance new_wikitext_slot, added_lines_slot, ..., each of them being an array. To retrieve its content for the nth slot, you just have to access e.g. new_wikitext_slot and proceed as usual
I don't know if this will be feasible and painless for filter maintainers, but I guess it'd be a relatively simple but effective approach. However, please note that:
- I haven't investigated what code changes would be necessary
- Since all these arrays will have variable length, we'll need to change the syntax checker, which right now fails with non-fixed-length arrays (part of T204654)
- In any case, also given point 2, this will not be easy.
Note: arrays with variable length should also be allowed: the syntax check doesn't allow accessing an array element if the array is generated at runtime (as is the case for added_lines), although it makes sense to do so. To reproduce the problem you may try checking added_lines, although that gives the wrong error (T198531).
Tue, Nov 13
Mon, Nov 12
@Thryduulf Ah, that's a totally different thing. The /examine interface doesn't really contain a reference to the log entry itself, and thus it's also missing a link for oversighting it. The real point here is T20655. Currently, "details" is the one focused on the entry itself and all the variables, while "examine" is almost entry-independent (if two filters catch the same edit, "examine" is the same for both entries) and is more meant for doing some testing with an arbitrary pattern, which may or may not be the one of the triggered filter. I don't know if I explained that properly, but the point is that "examine" is in place only to allow people to take the variables from a logged AbuseFilter hit and do whatever they want with them, disregarding the entry. It may sound funny, but the problem reported is more or less a feature. My suggestion as a filter maintainer is to use "details" in these cases; for the future, we'll see in T20655.
Thu, Nov 8
Wed, Nov 7
@Amorymeltzer I checked, and I confirm that it's a duplicate of T44734. Also, please note that this only happens in Special:AbuseLog, and not in Special:AbuseFilter/test or /examine (as it did for the problem originally reported in this task). And when it comes to specific AbuseLog entries, suppressing them is the right way, at least while waiting for a fix to T44734 (which won't be easy).
@Huji thanks, excellent analysis! The idea is great; however, I agree that it will be quite complex and need lots of work. I'm reopening this task, but I believe that for the moment we should focus on other major changes, which aim to fix software bugs, instead of adding new features.
If I understood T208537#4714444 correctly, this happens because we call numParams on null, which in turn means $filterId is null for the current entry. I have no idea about why this happens, though. Do we have some more context?
@Krinkle AFAICS the fatal only happens for uploads, specifically for uploads where the specified title isn't valid. As I was saying above, avoiding the fatal doesn't avoid silent errors, but I agree that it'd be less intrusive for users. I can't send a patch right now, but the one above is definitely overkill. All we need to do is remove the "Title" typehint from line 1100. Also, could you please take a look at https://gerrit.wikimedia.org/r/466842?
Tue, Nov 6
@Amorymeltzer Hmm, I don't think it's intended; it's probably just some edge case, possibly due to the fact that the page doesn't exist (maybe we handle non-existing pages differently, or the Revision class does). I'll investigate tomorrow anyway. Could you please confirm that the reproduction steps below are correct?
*Create a page with some revisions (and trigger a filter with one of them)
*Delete the page
*Suppress the revision (already deleted) which triggered the AF
*Go to Special:AbuseLog for that same revision, which will only check for the "deletedtext" right (i.e. it's visible to sysops who aren't oversighters, but not to "normal" users)
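The reported visibility behaviour in the last step can be sketched as a permission check (a Python illustration of the buggy logic being reproduced, not the actual PHP; the function name is hypothetical, while the right names follow MediaWiki conventions):

```python
def can_see_log_entry(user_rights, revision_deleted, revision_suppressed):
    """Sketch of the reported behaviour: the AbuseLog entry only checks
    the "deletedtext" right, even when the underlying revision was also
    suppressed.  A fixed version would require "suppressrevision" in the
    suppressed case; this models the bug, not the desired logic."""
    if revision_suppressed:
        # BUG being reproduced: should require 'suppressrevision',
        # but only 'deletedtext' is actually checked.
        return 'deletedtext' in user_rights
    if revision_deleted:
        return 'deletedtext' in user_rights
    return True


sysop_rights = {'deletedtext'}  # sysop who is NOT an oversighter
print(can_see_log_entry(sysop_rights,
                        revision_deleted=True,
                        revision_suppressed=True))  # True: the bug
```

Under the correct logic, that last call would return False for a non-oversighter.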
Also, I'm unsure if that task is related... The reason could be similar, but in that case solving the problem isn't trivial at all.
I don't know how difficult it would be, but it would certainly require several changes: at least a new page in the UI (and we'll add several new pages for performance monitoring) and some sort of DB change, namely (at least) a new column in abuse_filter_log if testing past hits is OK, or a new table to make it possible to add edits for which there are no entries in the AbuseLog. It could be done, but I'd rather wait for T36180 and see if this feature will still be needed.
This would require adding lots of new logic. I wonder if Special:AbuseFilter/test would be enough, especially when T36180 will be resolved.
Mon, Nov 5
Uh, as a filter maintainer I don't really fancy adding tons of new variables. However, adding logic to apply filters only to some slots would IMHO be even worse. If we decide to add new variables, I guess we should make the existing ones refer to all slots, to keep B/C. But again, I'm new to MCR, so all I can do is probably express my thoughts as a filter maintainer.
Just a note: this failure is often seen when something abruptly halts the script. It could be due to something happening elsewhere.
@Jdforrester-WMF I was just going to ask what's to be done here, since I don't know much about MCR. This also means that I probably won't be able to help much with this. Anyway, AF uses the EditFilterMergedContent hook for edits. If there are no changes to the logic related to this hook, AF will probably work as usual.
Sun, Nov 4
Sat, Nov 3
- We only need a fairly simple syntax for filters, there's no need to allow complicated stuff
- Our dedicated parser is probably faster than Lua's (exactly because it's dedicated), and this will be especially true once CachingParser is implemented
- Higher expressive power means potential performance troubles from badly-written filters
- Adapting Lua to our needs (removing potentially dangerous stuff, injecting our variables, regex library etc.) could possibly nullify the performance gain
- AbuseFilter's syntax is actually simpler than Lua's; switching would only make writing filters harder for non-tech people (they'd have to learn a full programming language instead of a basic one).
Memo: we should also update the user_blocked variable during the process.
Wed, Oct 31
Tue, Oct 30
Sat, Oct 27
A short-term solution could be to remove the Title typehint from $title (which was added in https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/460335/). However, it wouldn't actually do much, since our code isn't really meant to handle filtering without a title.
The proper, long-term solution is to ensure filtering only happens with a real title. This may be done either by halting AF execution if the title is null, or by changing the hook we use to make sure it always passes a title. Anyway, this only happens with uploads, and I guess only for uploads where an invalid title is specified.
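The "halt AF execution if the title is null" option is essentially a guard clause at the entry point. A minimal sketch (in Python for illustration; the real code is PHP, and the function name, parameters, and return shape are all hypothetical):

```python
def filter_action(title, action):
    """Hypothetical entry point: skip filtering entirely when no valid
    title is available (e.g. an upload with an invalid target name),
    instead of fataling on a null Title further down."""
    if title is None:
        # Halt AbuseFilter execution: there is nothing sensible to
        # filter against, and downstream code assumes a real title.
        return {'filtered': False, 'reason': 'no-valid-title'}
    # ... normal filtering would happen here ...
    return {'filtered': True, 'title': title}


print(filter_action(None, 'upload'))
print(filter_action('File:Example.png', 'upload'))
```

The alternative (fixing the hook so it always passes a title) pushes the same guarantee one layer up, which is why it's the preferable long-term shape.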
See T144265 (and related patches) for the progress in the long-term.
Fri, Oct 26
See T58371 (and consequently T191722) for why checking the summary doesn't work. I still have to check if the EditFilterMergedContent hook provides enough context to determine whether the edit is creating a new section.
Nothing more to do here, this task can now be made public.
Wed, Oct 24
Ah, I didn't know. If someone with merge access on wmf. branches comes across the patch, that'd be nice! Otherwise, we can just abandon it, as the new version is already on group0 and the fix is already deployed on the other wikis.
I'm not sure we want to do this: since this log is an actual log on Special:Log, we cannot selectively hide entries from some people, IIRC. This means that deletions of private filters would show up for every user as well, and together with the fact that vandals (especially LTAs) may already know if there's a filter targeting them (and its ID), this way they could tell whether such a filter has been disabled.
Tue, Oct 23
@Huji the fix is already deployed to WMF wikis per T207085#4679131. There's still this patch to be merged on wmf.26 (could you please merge it?), but it doesn't really matter; it's just for completeness' sake.
This is nasty, and even if the cause may not be easy to understand, the test should at least be included in @Broken until it's fixed.
Pushed to gerrit, for all mentioned branches. I only merged the one on master, since I'd like someone else to double-check the backports to be sure that everything is fine; CC @Reedy @Legoktm. Please note that I had to manually +2/verify the one on REL1_31 due to T189560. Also: can this task be made public once everything is merged?
Mon, Oct 22
I wouldn't say this is a good first bug. Something has changed in the meantime, and I don't know if it's worth it to change this behaviour. Also, it should be noted that we don't explicitly require caching for blockautopromote (as we do for throttle).
This is intended, see the patch where it was introduced, and also the reasoning behind it in T45646. However, I do recognize that reusing the messages isn't that clear, so we should probably add separate messages to handle this specific case. CC'ing @Tgr who wrote the patch.
Oct 21 2018
On my wiki, running the previously used query gives the following:
Claiming as I took over the patch.
Claiming as I took over the patch.
Claiming as I took over the patch.
I rewrote the task description to make it easier to understand. The second part was removed as we already have T187072. As for the remaining part, I'm closing it as invalid because the interface is different now: the uploadtext message is still there, but the filter warning is inside the section whose title is the "uploaderror" message. Now the messages are clearly separated, and it's also completely fine to keep an introduction on Special:Upload.
I tried my best to understand the problem here, but I'm not sure I really got it. However, supposing that Andre's explanation is correct (and to me it looks so), the behaviour is intended. If a filter is set to group throttle by page, throttle counters will be increased on a per-page basis. So, 3 edits on page XXX and 1 edit on page YYY mean that the counter is 3 for XXX and 1 for YYY, and that's all. If the filter is set to take any action after e.g. 5 edits, then it'll execute those actions as soon as the highest counter reaches 5, whichever page it is for.
If, instead, the aim of this task was to report a different problem, please feel free to reopen it and provide further details, as suggested above.
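The per-page grouping described above can be sketched with one counter per page (a Python illustration of the intended behaviour; class and method names are made up for this sketch, not taken from the extension):

```python
from collections import defaultdict


class PageThrottle:
    """Sketch of 'throttle grouped by page': one independent counter per
    page, with the filter's actions triggering as soon as ANY single
    page's counter reaches the configured limit."""

    def __init__(self, limit):
        self.limit = limit
        self.counters = defaultdict(int)

    def record_hit(self, page):
        """Increment the counter for `page`; return True when the
        filter should take its actions."""
        self.counters[page] += 1
        return self.counters[page] >= self.limit


throttle = PageThrottle(limit=5)
for _ in range(3):
    throttle.record_hit('XXX')      # counter for XXX is now 3
throttle.record_hit('YYY')          # counter for YYY is 1, independent
print(throttle.record_hit('XXX'))   # False: XXX is at 4, below 5
print(throttle.record_hit('XXX'))   # True: XXX reached the limit of 5
```

Note that the YYY edit never contributes to XXX's counter, which matches the "3 for XXX and 1 for YYY, and that's all" behaviour described above.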
A good solution could be to replace the API query with loadMessagesIfMissing, and then just call mw.message(xxx).parse(). However, parse() doesn't support many things (like templates), and the goal of this task is exactly to improve the message in the presence of such things.
I'll do whatever is needed to fix this, as part of T203587. This will probably get fixed on the way, but given that it seems to happen randomly I'm also adding a testme.
The original problem is gone; for a cleaner message there's T199622.
Oct 20 2018
Thanks Krenair and Reedy. BTW @Cyberpower678, a workaround is to re-set the throttle parameters which got lost; you can see them in the filter history. However, please note that this only works because you're disabling the filter: it wouldn't work as it used to anyway, since, alas, throttle has been completely broken since July, and that's why the other task is UBN.
Not being able to disable filters is a big deal. Let's wait for the stack trace, although I suspect this is T203336, which is already UBN and whose urgency I've already expressed my worries about.