quibble-vendor-mysql-hhvm-docker for WikibaseCirrusSearch takes over 40 minutes
Closed, Resolved (Public)

Description

I recently looked at how long it takes for a patch to pass CI, and noticed that a WikibaseCirrusSearch patch takes between 44 and 46 minutes to get through quibble-vendor-mysql-hhvm-docker. I think this is far too long to wait to validate a patch. The cost is paid every time a patch set is submitted (and a patch often needs several iterations until all bugs, review comments and phpcs complaints are fixed), so waiting this long for each validation is IMHO inefficient. We should look into reducing this time.

Event Timeline

See also: T221434, which this might be a dupe of.

The general issue is that running our tests takes a long time for a number of reasons (e.g. no clear "integration" vs "unit" test delineation is uniformly enforced), and we don't want that.

debt subscribed.

Moving this to our #watching column for now :)

Hey @Smalyshev, sorry for the delayed reply to this request. Besides what Greg mentioned (we run every single test from all the dependent extensions), there is another, infrastructure-related issue.

At the end of April, I noticed that wmf-quibble-vendor-mysql-hhvm-docker (which is a different job with a different set of repositories) was taking 40 minutes. I filed and closed T222023 myself, assumed it was a faulty WMCS instance, and deleted that instance.

Later, Kunal noticed that the Jenkins jobs that generate MediaWiki code coverage would sometimes time out after 4 hours, when they usually run in two hours. The TL;DR is that the oldest WMCS servers have poor CPU performance for some reason (T223971). I have disabled the Jenkins instances running on those hosts.

I suspect the slow WikibaseCirrusSearch runs are related.

For the future: when a patch set is sent, the default is to run the HHVM-based job. Then, on Code-Review+2, the PHP 7.0 - 7.2 jobs run. Surely we should nowadays default to 7.2, which gives faster feedback, and move HHVM to run only on Code-Review+2.

I also think it's probably better to run regular patch sets on 7.2 and run HHVM only on submit, especially as we're migrating to 7.x in production. Would that speed things up, or is the slowness because of the old VMs?

OK, the main task is quibble-vendor-mysql-hhvm-docker; with recent changes, this is currently taking about 25 minutes in WikibaseCirrusSearch, which isn't great, but it's not as bad as when this was filed.

I'm not sure what improvements we can make ahead of the removal of HHVM from production (when this job will be removed entirely). We could switch the default "test" job from HHVM to PHP72 (we'd keep the HHVM job in the "submit" pipeline)?

OK, I am again getting multiple builds taking 50+ minutes for Wikibase, e.g.:

https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php73-docker/2752/
https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php72-docker/16913/
https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php70-docker/27655/

The overall job takes over 2 hours, and it needs to run twice to submit a patch (and another two times if the patch needs to be backported). So to do an urgent fix, I'd need to wait several hours just for CI (and God help us if there's a bug in the patch and it needs to be amended). I don't think this is a good situation.

Krinkle claimed this task.
Krinkle subscribed.

Recent example from https://gerrit.wikimedia.org/r/581007

[Mar 23 13:05] Patch Set 2: Code-Review+2
[Mar 23 13:27 (22min later)] Gate pipeline build succeeded. Change has been successfully merged by jenkins-bot.

quibble-vendor-mysql-php72-noselenium-docker SUCCESS in 21m 41s
quibble-vendor-mysql-php73-noselenium-docker SUCCESS in 17m 18s
quibble-vendor-mysql-php74-noselenium-docker SUCCESS in 17m 28s
mwgate-node10-docker SUCCESS in 40s
quibble-vendor-selenium-docker SUCCESS in 13m 41s
mwext-php72-phan-docker SUCCESS in 1m 17s
mwext-php72-phan-seccheck-docker SUCCESS in 1m 25s
wmf-quibble-vendor-mysql-php72-docker SUCCESS in 12m 42s
wmf-quibble-selenium-php72-docker SUCCESS in 10m 47s

This looks much better, and it is only a few minutes slower than the standard gate jobs for wmf extensions (currently 12-15 min for those vs 17-22 min here).
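To make the arithmetic behind these numbers explicit, here is a minimal Python sketch that parses the job durations reported for the gate pipeline build of https://gerrit.wikimedia.org/r/581007. It assumes the gate jobs run in parallel, so the wall-clock wait is bounded by the slowest job rather than by the sum of all jobs; the helper names (`parse_duration`, `parse_report`) are hypothetical and are not part of any CI tooling.

```python
import re

# Job results copied from the gate pipeline report for change 581007.
REPORT = """\
quibble-vendor-mysql-php72-noselenium-docker SUCCESS in 21m 41s
quibble-vendor-mysql-php73-noselenium-docker SUCCESS in 17m 18s
quibble-vendor-mysql-php74-noselenium-docker SUCCESS in 17m 28s
mwgate-node10-docker SUCCESS in 40s
quibble-vendor-selenium-docker SUCCESS in 13m 41s
mwext-php72-phan-docker SUCCESS in 1m 17s
mwext-php72-phan-seccheck-docker SUCCESS in 1m 25s
wmf-quibble-vendor-mysql-php72-docker SUCCESS in 12m 42s
wmf-quibble-selenium-php72-docker SUCCESS in 10m 47s
"""


def parse_duration(text: str) -> int:
    """Convert a duration such as '21m 41s' or '40s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m )?(\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + int(match.group(2))


def parse_report(report: str) -> dict[str, int]:
    """Map each job name to its duration in seconds."""
    jobs = {}
    for line in report.splitlines():
        name, _, duration = line.partition(" SUCCESS in ")
        jobs[name] = parse_duration(duration)
    return jobs


jobs = parse_report(REPORT)
slowest = max(jobs, key=jobs.get)   # job that bounds the wall-clock wait
total = sum(jobs.values())          # aggregate compute time across all jobs

print(f"slowest job: {slowest} ({jobs[slowest] // 60}m {jobs[slowest] % 60}s)")
print(f"sum of all jobs: {total // 60}m {total % 60}s")
```

Under that assumption, the slowest job (21m 41s) is consistent with the roughly 22 minutes between Code-Review+2 and the merge, while the sum of all job durations is considerably larger.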