Unit tests using database are slower than needed
Closed, Resolved (Public)

Description

Since the switch to MySQL for unit tests (T37912) our unit tests on Ubuntu Precise with Zend PHP 5.3 have slowed down from 5 to about 10 minutes.

The HHVM tests went up from 4 to about 5.2 minutes on average. While this is mostly unavoidable, I think in a lot of cases we're using the setup handling in PHP incorrectly or in an inefficient way.

Possible culprits can be discovered as follows:

Since most tests can be fixed with a simple commit once found, I'm not creating this as a tracker bug. So when can this be resolved? Well, let's see if we can find a couple tests to resolve and close this issue when we feel like it.

Event Timeline

Krinkle raised the priority of this task from to Medium.
Krinkle updated the task description.
Krinkle subscribed.

On the job page, you can access a build time trend. It clearly shows that the runtime more than doubled when switching from baremetal production slaves to the labs slaves.

integration-slave1001 seems to be faster, but maybe it had only one job running when the builds were made.

Looking at install.php output on two slaves:

slave1001:

00:00:05.386 Creating tables
00:00:06.819 done

Or 1.5 seconds.

slave1004:

00:00:05.741 Creating tables
00:00:29.589 done

Or 24 seconds.
Maybe a MySQL configuration difference, or it is just overloaded.

We probably want to revert back to the prod slaves :(

Looking at install.php output on two slaves:

slave1001: 1.5 seconds.

slave1004: 24 seconds.

Maybe a MySQL configuration difference, or it is just overloaded.

Yeah, looks like it was an anomaly. I've compared two gate-and-submit builds of the same change for the master branch earlier today.

From https://integration.wikimedia.org/ci/job/mediawiki-phpunit-zend/4663/consoleFull

00:00:00.099 Building remotely on (phpflavor-zend UbuntuPrecise) in workspace mediawiki-phpunit-zend
00:00:00.188 PHP 5.3.10-1ubuntu3.17 with Zend Engine v2.3.0

00:00:00.290 + zuul-cloner --version
00:00:00.534 Zuul version: 2.0.0-304-g685ca22-wmf1precise1
00:00:00.555 + zuul-cloner --color --verbose --map /srv/deployment/integration/slave-scripts/etc/zuul-clonemap.yaml --workspace src https://gerrit.wikimedia.org/r/p mediawiki/core mediawiki/vendor

00:00:08.812 + /srv/deployment/integration/slave-scripts/bin/mw-install-mysql.sh
00:00:09.094 Setting up database
00:00:44.332 done

00:00:44.406 + /srv/deployment/integration/slave-scripts/bin/mw-apply-settings.sh
00:00:44.477 No syntax errors detected in /mnt/jenkins-workspace/workspace/mediawiki-phpunit-zend/src/LocalSettings.php

00:00:44.499 + /srv/deployment/integration/slave-scripts/bin/mw-run-phpunit.sh
00:00:49.596 PHPUnit 3.7.37 by Sebastian Bergmann.
00:08:14.338 ............................................................. 5185 / 9720 ( 53%)
00:14:50.603 ............................................................. 9638 / 9720 ( 99%)
00:14:54.246 ...............................................
00:14:54.246
00:14:54.246 Time: 14.02 minutes, Memory: 1035.25Mb
00:14:54.247 There were 21 skipped tests: ..
00:14:54.315 Tests: 9685, Assertions: 399643, Skipped: 21.

00:14:59.456 [xUnit] [INFO] - Converting '/mnt/jenkins-workspace/workspace/mediawiki-phpunit-zend/log/junit-mw-phpunit.xml' .
00:15:08.362 Archiving artifacts

00:15:08.566 [PostBuildScript] - Execution post build scripts.
00:15:08.604 + /srv/deployment/integration/slave-scripts/bin/mw-teardown-mysql.sh
00:15:10.119 Finished: SUCCESS

From https://integration.wikimedia.org/ci/job/mediawiki-phpunit-hhvm/6135/consoleFull

00:00:00.081 Building remotely on (phpflavor-hhvm UbuntuTrusty) in workspace mediawiki-phpunit-hhvm
00:00:00.168 HipHop VM 3.3.1 (rel)

00:00:00.250 + zuul-cloner --version
00:00:00.542 Zuul version: 2.0.0-304-g685ca22-wmf1trusty1
00:00:00.563 + zuul-cloner --color --verbose --map /srv/deployment/integration/slave-scripts/etc/zuul-clonemap.yaml --workspace src https://gerrit.wikimedia.org/r/p mediawiki/core mediawiki/vendor

00:00:04.686 + /srv/deployment/integration/slave-scripts/bin/mw-install-mysql.sh
00:00:07.470 Setting up database
00:00:42.731 done

00:00:42.774 + /srv/deployment/integration/slave-scripts/bin/mw-apply-settings.sh
00:00:43.189 No syntax errors detected in /mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm/src/LocalSettings.php

00:00:43.216 + /srv/deployment/integration/slave-scripts/bin/mw-run-phpunit.sh
00:00:49.241 PHPUnit 3.7.37 by Sebastian Bergmann.
00:05:06.608 ............................................................. 5246 / 9723 ( 53%)
00:08:46.769 ............................................................. 9638 / 9723 ( 99%)
00:08:47.553 ..................................................
00:08:47.553
00:08:47.553 Time: 8.06 minutes, Memory: 772.75Mb
00:08:47.553 There were 15 skipped tests: ..
00:08:47.697 Tests: 9688, Assertions: 713151, Skipped: 15.

00:08:48.169 [xUnit] [INFO] - Converting '/mnt/jenkins-workspace/workspace/mediawiki-phpunit-hhvm/log/junit-mw-phpunit.xml' .
00:08:49.198 Archiving artifacts

00:08:49.389 [PostBuildScript] - Execution post build scripts.
00:08:49.405 + /srv/deployment/integration/slave-scripts/bin/mw-teardown-mysql.sh
00:08:50.902 Finished: SUCCESS

Ubuntu Precise with Zend PHP:

  • Setup (Jenkins/Zuul): 8.8 seconds
  • Database: 35.3 seconds
  • PHPUnit: 14.02 minutes

Ubuntu Trusty with HHVM:

  • Setup (Jenkins/Zuul): 4.6 seconds
  • Database: 35.3 seconds
  • PHPUnit: 8.06 minutes

MySQL database setup took exactly the same amount of time on both.
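
The database figures above can be re-derived from the timestamp prefixes in the two console logs. The following is a minimal Python sketch, not part of the CI tooling: the helper names (`parse_ts`, `elapsed`) and the choice of which log lines bound each phase are assumptions made for illustration. "database" is taken as the interval from "Setting up database" to "done", and "phpunit" as the interval from the start of mw-run-phpunit.sh to the final "Tests:" summary line.

```python
from datetime import timedelta


def parse_ts(ts: str) -> timedelta:
    """Parse a console timestamp prefix such as '00:00:44.332'."""
    hours, minutes, seconds = ts.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


def elapsed(start: str, end: str) -> float:
    """Seconds between two timestamp prefixes from the same build."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


# (start, end) pairs copied from the logs quoted above.
# Which lines mark the phase boundaries is an assumption of this sketch.
phases = {
    "zend": {
        "database": ("00:00:09.094", "00:00:44.332"),
        "phpunit": ("00:00:44.499", "00:14:54.315"),
    },
    "hhvm": {
        "database": ("00:00:07.470", "00:00:42.731"),
        "phpunit": ("00:00:43.216", "00:08:47.697"),
    },
}

if __name__ == "__main__":
    for build, spans in phases.items():
        for phase, (start, end) in spans.items():
            print(f"{build:5s} {phase:9s} {elapsed(start, end):8.1f} s")
```

Both database intervals come out at a little over 35 seconds, which is consistent with the observation that database setup is not where the two builds differ.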

On the job page, you can access a build time trend. It clearly shows that the runtime more than doubled when switching from baremetal production slaves to the labs slaves.

The switch to labs did not do that. The switch to MySQL did.

On Precise/Zend, our test suite with MySQL ran 1.4x – 1.8x slower than with SQLite. On Trusty/HHVM, the switch to MySQL did not have significant impact.

For the mediawiki-core job, we migrated to Labs and to MySQL at the same time, so it's hard to separate the two effects. In the past we've migrated many other jobs to labs (still using SQLite) and there was no significant slowdown there either. The switch to labs for Zend PHP tests did have some minor impact (due to CPU and disk speeds, I think), but not much.

Since a lot of the slowdown is related to MySQL, why have you removed the blocking task T96249: MySQL tuning on CI slaves (tracking)?

This task is about MediaWiki core unit tests. Not the CI infrastructure.

The MediaWiki core unit tests should interact less with the database: more isolated units, less integration with global state and database layers. They also use setup/teardown quite inefficiently. Many unit tests aren't testing anything related to the database but happen to trigger code paths that involve the database, and thus require re-cloning the database schema between every test.
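
To illustrate the setup/teardown point, here is a small analogy in Python's unittest with an in-memory SQLite database. It is a sketch only and not MediaWiki or PHPUnit code: the `page` table, `normalize_title`, and all class names are hypothetical. It counts how often the schema is built when every test rebuilds it, versus when pure tests skip the database and database tests share one schema per class and roll back between tests.

```python
import sqlite3
import unittest

SCHEMA = "CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT)"
schema_builds = {"per_test": 0, "per_class": 0}


def build_db(counter_key):
    """Create a fresh in-memory database and record that the schema was built."""
    schema_builds[counter_key] += 1
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA)
    return db


def normalize_title(title):
    """Pure function: needs no database at all."""
    return title.strip().replace(" ", "_")


class PerTestSetup(unittest.TestCase):
    """Inefficient: rebuilds the schema before every test, even pure ones."""

    def setUp(self):
        self.db = build_db("per_test")

    def tearDown(self):
        self.db.close()

    def test_normalize(self):
        self.assertEqual(normalize_title(" Main Page "), "Main_Page")

    def test_normalize_noop(self):
        self.assertEqual(normalize_title("Foo"), "Foo")

    def test_insert(self):
        self.db.execute("INSERT INTO page (page_title) VALUES ('Foo')")
        count = self.db.execute("SELECT COUNT(*) FROM page").fetchone()[0]
        self.assertEqual(count, 1)


class TitleTest(unittest.TestCase):
    """Pure unit tests: no database fixture."""

    def test_normalize(self):
        self.assertEqual(normalize_title(" Main Page "), "Main_Page")


class PageStoreTest(unittest.TestCase):
    """Database tests: schema built once per class, rows rolled back per test."""

    @classmethod
    def setUpClass(cls):
        cls.db = build_db("per_class")

    @classmethod
    def tearDownClass(cls):
        cls.db.close()

    def tearDown(self):
        self.db.rollback()  # discard rows written by the test, keep the schema

    def test_insert(self):
        self.db.execute("INSERT INTO page (page_title) VALUES ('Foo')")
        count = self.db.execute("SELECT COUNT(*) FROM page").fetchone()[0]
        self.assertEqual(count, 1)

    def test_starts_empty(self):
        count = self.db.execute("SELECT COUNT(*) FROM page").fetchone()[0]
        self.assertEqual(count, 0)


def run(*cases):
    """Run the given TestCase classes in one suite and return the result."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in cases:
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run(PerTestSetup, TitleTest, PageStoreTest)
    print(schema_builds)
```

The per-test variant pays for the schema once per test method, while the split variant pays once per database-using class and nothing for the pure tests. With a real MySQL schema the per-build cost is much higher than in this toy example.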

Krinkle claimed this task.
Krinkle removed a project: Performance Issue.