Babel: it-N, en-3, fr-1
A little update: xdebug says that there's a cycle, which is where the slowness happens. (UnionType|GenericArrayType)::hasTemplateTypeRecursive is called like 5 million times for my simplified scratch, and that's obviously super-heavy. The call triggering the loop happens in UnionTypeVisitor::analyzeProp (obviously for the $info prop). For some reason, that prop's type_set is a deeply nested array of types (I suspect it has 5 million nesting levels looking at the code, but I didn't check), and thus the long recursion.
As for why the type_set is this crazy, I (still) don't know.
@hashar Being a static analysis tool, it wouldn't make any difference. We only need SMW to provide phan with all of the classes/methods definitions, as it would otherwise fail. My actual doubt is whether we can install SMW before doing the analysis, since AFAIK it's not hosted on gerrit.
@Trizek-WMF I'd say in the next T/N. It could take some time for people to fix affected filters, so the sooner the better.
I continued digging the code, but TBH I'm not getting anywhere. I managed to reproduce the issue with a version of the plugin having only 250 lines of visitor, but still nothing. This seems to be a combination of several factors. Almost surely, it has to do with the insane nesting level inside the $info property (which is bad practice anyway, but nvm), and class/method reanalysis for dependent method/vars. Also, xdebug doesn't help much due to some phan dark magic.
So if someone wants to continue investigating, feel free to; I won't mostly be around for the next week.
OK, so, first of all I'd like to announce it on Tech News. We should inform people that wpzero support will be removed, and thus update their filters accordingly. Then let's see how many of those are fixed, and decide how to move on.
Thanks @Legoktm I'm going to take a look at those. Although I have to say, this is more than I expected. And without a global abusefilter-manager group in place, it'll be even harder to fix all of those filters. That said, these filters won't be straightforward to fix (like it was e.g. in T209565). Every filter has its own fix, and sometimes the filter just has to be deleted. Actually, I don't even know if I'd do that with global AF-manager rights.
P8769 is the minimum I could find for now in the getid3 class. Inside the extension, you'll only need the following:
So, for now, I managed to get this down to the ID3Handler class in TMH, and part of the getid3.php file in the library. I suspect this has to do with the $info property inside getid3, but I'm still unsure. Apparently it needs several methods in getid3 to exist, in order to remain stalled. I'll probably continue investigating tomorrow.
Is there anything actionable here?
Not setting UBN until the exact impact is determined.
cawiki and trwiki communities have been notified of the change, so now we can really close this task (and related ones) as soon as r459245 is merged. Again, thanks to everyone involved for the huge help!
Confirming that everything was fixed correctly! So this task can be considered resolved. To be precise, there's still https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/459245/ awaiting for review.
Tue, Jul 16
Tagging, and CC'ing myself in case any help is needed with AF.
Some data has started to flow to logstash for group0, and we have another example: testwiki filter 176, using the variable "db", evaluated to the number "b" in base 2. So well, we really need this change.
OK So this is more complicated than it seems. I ran seccheck 2.0.1 on TMH master (with ast 1.0.1, PHP 7.3.4), and it completed in normal time. Ran it again, same result. I've also added a line to print the runtime and ran it 3 more times, taking respectively 73, 51 and 51 seconds. My machine is pretty powerful, but not so much to explain the 18 minutes of runtime of the change above. I also note that zuul is a little busy right now, but again, not so much to explain all this slowness: see for instance here. It started 6 minutes after the TMH one and finished in 12 seconds.
Uhm, I'd like to use xdebug for this, but phan would get way slower (around 5x), and doing that on an already-slow extension could be painful. So I think I'll just start removing pieces from the extension and see what's causing the slowness.
This is done, for the rest we have the parent task.
Yeah, I'm sorry, I sort of changed my mind while writing the description and didn't update the title accordingly. Nevertheless, what I wrote in the description is what this task should be about.
I didn't read T224351#5227653 carefully enough. I still have to determine whether this still happens, and how to fix.
Phan was added, and taint-check done in the subtask.
@Billinghurst Thanks! However, I'd suggest not to fix them manually. I'd like to see if the script is working properly, and if it's safe for use outside WMF wikis.
Mon, Jul 15
Huh, when I first reported I thought someone already knew about this. Anyway. Looking at this I see there has been a spike around 12:20-30, which is right before data stopped to flow. It's also interesting to note that, apparently, there hasn't been any drop in the amount of data, which was in fact replaced by these errors. So something is wrong with Elastic (?).
@Aklapper Yeah, I'm sorry, I only quickly mentioned it in T228049#5333127. This extension uses methods removed from core several versions ago (like ApiBase::dieUsage or OutputPage::addWikitext - T228048). Depending on the compat policy, we should either add back-compat code or just replace them. Such methods seem to cause several issues reported by phan in https://integration.wikimedia.org/ci/job/mwext-php72-phan-docker/3431/console, that's why I first wanted to know what to do with them.
That said, I definitely can start fixing unrelated stuff, and no doubt I'll find something to fix.
Eeeeeek, this error suggested me to add phan to PageForms, in order to catch this kind of stupid mistakes early on during development. Then I saw https://integration.wikimedia.org/ci/job/mwext-php72-phan-docker/3431/console, and it frightened me. This extension tries to use methods removed from core in 1.32, and that's just the first line of a huge report. Plus, T149869 scares me a little bit.
I'm unsure about what to do here. First of all, the extension claims to support MW 1.23+ at https://www.mediawiki.org/wiki/Extension:Page_Forms, although extension.json says 1.27+. Either way, those are way too old to kept compatibility with them without tons of back-compat code etc.
I'll probably open another task for that, although given all of the above, I think the undefined variable could be the least of the problems this extension has.
Current warnings list is here:
Another thing I noticed is that with check experimental, CI also runs a separate job for extensions, mwext-php72-phan-seccheck-docker (see https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/522419/), which fails hard. But that's another story.
(Note that I changed my mind and removed DangerousActions. The actions being disabled upon throttling are not something people are supposed to change, I believe)
@Urbanecm Just checking in - would tomorrow after EU SWAT be fine for you? The T/N is going out now, and I think tomorrow is fine; plus I probably won't be available since the end of the week.
Sun, Jul 14
Note that I had to suppress some DoubleEscaped warnings: the label passed to Xml::radio is indeed double escaped. However, it comes from a long chain of calls to SecurePoll_Context/getMessage/parse etc., which seems to be copying what Message::parse does. And I'm not touching that chain.
I guess this has to do with the methods used to retrieve properties etc. For a future fix, a sample test case:
Now passing with taint-check 2.0.2.
Sat, Jul 13
@Smalyshev Thanks a lot for your precious feedback! Before replying point-by-point, I'd like to remark that despite the name "PerfCheck", and one of its goal being to micro-optimize code, it also tries to improve readability; actually, it prefers readability over performance. So for some of the issues the performance gain could be very small, and even unnoticeable (unless for hotspots), but the readability should always be improved.