
Fill split_groups for parallel testing in sequence rather than round robin
Closed, Resolved (Public)

Description

Per the feedback here (T378299#10266092), the way tests are currently assigned to split_groups tends to collect tests from a lot of different extensions into each group because tests are assigned round-robin to their respective buckets. This:

  • makes interactions between tests from different extensions more likely
  • means that jobs that test a different set of extensions can have a very different test group layout
  • more often exposes previously untested combinations of test classes

While in principle detecting more test-state-encapsulation issues is a positive thing, having frequently-occurring cross-extension parallel testing issues to debug makes life harder for developers.

Change the assignment of tests to groups to make the composition of groups more predictable and understandable.

Acceptance Criteria

  • The tests are assigned to their split_groups in the order they appear in the phpunit --list-tests-xml output.

Note that as well as making some bugs harder to detect, this might also lead to less balanced times for the split groups (if, for example, a particular extension's tests tend to take longer per class than average).
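To illustrate the difference between the two strategies, here is a minimal Python sketch. It is not the MediaWiki implementation: the function names, the test names, and the way leftover tests are spread over the first groups are all assumptions made for the example. The input list stands in for the order of the phpunit --list-tests-xml output.

```python
def round_robin(tests, n_groups):
    """Deal tests out one at a time, cycling through the groups."""
    groups = [[] for _ in range(n_groups)]
    for i, test in enumerate(tests):
        groups[i % n_groups].append(test)
    return groups


def sequential(tests, n_groups):
    """Fill each group with a contiguous run of tests, in list order."""
    base, extra = divmod(len(tests), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        # The first `extra` groups take one additional test each
        # (an assumption of this sketch, chosen to keep sizes even).
        end = start + base + (1 if g < extra else 0)
        groups.append(tests[start:end])
        start = end
    return groups


# Hypothetical test classes from two extensions, in listing order.
tests = [
    "ExtA/Test1", "ExtA/Test2", "ExtA/Test3",
    "ExtB/Test1", "ExtB/Test2", "ExtB/Test3",
]

print(round_robin(tests, 3))  # every group mixes ExtA and ExtB
print(sequential(tests, 3))   # only the middle group mixes extensions
```

With round-robin, each of the three groups contains one ExtA test and one ExtB test. With sequential filling, neighbouring tests stay together, so only the group that straddles the boundary between the two extensions is mixed.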

Event Timeline

Change #1084712 had a related patch set uploaded (by Arthur taylor; author: Arthur taylor):

[mediawiki/core@master] Allocate tests to groups sequentially instead of round-robin

https://gerrit.wikimedia.org/r/1084712

I've implemented this and run some tests using a bunch of empty patches for the popular extensions (e.g. https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Scribunto/+/1080212). It seems to work fine - the new ordering only breaks one extension, and I've submitted a patch to fix that.

I think the new ordering does make the test distribution worse - some of the jobs are now taking longer to run. This is probably because the Wikibase tests are quite heavy and all get lumped into one group. The solution to this is probably to collect the test timing data from the results cache and apply that to the bucket building process.
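One way the timing idea could work, sketched in Python under stated assumptions: tests stay in listing order, but a group is closed once the cumulative duration crosses that group's share of the total run time, rather than after a fixed number of tests. The function name, the `durations` mapping (class name to seconds), the default for unknown tests, and the sample numbers are all hypothetical; the actual format of the results cache is not described here.

```python
def split_by_duration(tests, durations, n_groups, default=1.0):
    """Contiguous split that aims for similar total duration per group."""
    # Tests without recorded timing fall back to `default` (an assumption).
    times = [durations.get(test, default) for test in tests]
    total = sum(times)
    groups = [[] for _ in range(n_groups)]
    g = 0
    elapsed = 0.0
    for test, t in zip(tests, times):
        # Cumulative time at which the current group should ideally end.
        boundary = total * (g + 1) / n_groups
        # Move on when the midpoint of this test lies past the boundary,
        # but never leave a group empty or run past the last group.
        if groups[g] and g < n_groups - 1 and elapsed + t / 2 > boundary:
            g += 1
        groups[g].append(test)
        elapsed += t
    return groups


# Hypothetical timings: two heavy classes followed by four light ones.
tests = ["Heavy1", "Heavy2", "Light1", "Light2", "Light3", "Light4"]
durations = {"Heavy1": 30.0, "Heavy2": 30.0,
             "Light1": 5.0, "Light2": 5.0, "Light3": 5.0, "Light4": 5.0}

for group in split_by_duration(tests, durations, 2):
    print(group, sum(durations[t] for t in group))
```

For these sample numbers, an equal-count split would put 65 seconds of work in the first group and 15 in the second, while the duration-aware split gives 30 and 50. Because the split stays contiguous, the group composition remains predictable; a non-contiguous scheme could balance times more tightly but would mix extensions again.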

Change #1084712 merged by jenkins-bot:

[mediawiki/core@master] Allocate tests to groups sequentially instead of round-robin

https://gerrit.wikimedia.org/r/1084712

Not that I'm aware of. I was leaving the ticket open on our board for a minute to see how the patch landed, but I haven't had any indication that there's anything wrong. I have (at least on our board) moved it to the verification column, and we'll get it closed up when we review those.