
Create a PHPUnit results cache service to provide timing data for CI runs
Closed, Resolved · Public

Description

Since T378478, tests are allocated to split_groups in alphabetical order by filename rather than round-robin. This makes the groups less balanced in run time: the tests for some extensions take longer than for others, and the slower extensions slow down whichever split_group they end up in.
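To illustrate why timing data helps, here is a minimal sketch comparing alphabetical chunking with a greedy longest-first assignment. The file names, timings, and function names are all hypothetical, and this is not how the CI split is actually implemented; it only shows the balancing idea.

```python
import heapq


def alphabetical_groups(timings, n_groups):
    """Assign contiguous alphabetical chunks of test files to groups."""
    names = sorted(timings)
    size = -(-len(names) // n_groups)  # ceiling division
    return [names[i:i + size] for i in range(0, len(names), size)]


def balanced_groups(timings, n_groups):
    """Greedy longest-first: put each test into the currently lightest group."""
    heap = [(0.0, i) for i in range(n_groups)]  # (total seconds, group index)
    groups = [[] for _ in range(n_groups)]
    for name, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
        total, i = heapq.heappop(heap)
        groups[i].append(name)
        heapq.heappush(heap, (total + seconds, i))
    return groups


def group_totals(groups, timings):
    """Total run time per group, in the same unit as the timings."""
    return [sum(timings[name] for name in group) for group in groups]


# Hypothetical timings in seconds; two slow files sort next to each other.
timings = {"AlphaTest.php": 30, "BetaTest.php": 25, "DeltaTest.php": 3, "GammaTest.php": 2}
print(group_totals(alphabetical_groups(timings, 2), timings))
print(group_totals(balanced_groups(timings, 2), timings))
```

With these made-up numbers the alphabetical split puts both slow files in the same group, while the greedy split spreads them out. The wall-clock time of a parallel run is set by the slowest group, so the largest group total is the number to minimise.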

T378797 (1112718) makes the phpunit.results.cache files from parallel CI runs available for download at https://integration.wikimedia.org. These need to be combined into a single file that can be downloaded at the start of CI runs to provide timing information.

Because the timings for integration vs. unit test runs are different for the same PHP classes, we actually need two different sets of timings. In the test groupings we currently have in CI, these configurations are referred to as database (integration) and databaseless (unit), and 1112718 makes the results caches from the two groups available for individual download.

For this task, we need to create the server that collects the individual cache results files and serves the combined result.
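As a rough sketch of the "combine" step, the snippet below merges several results caches into one structure with separate database and databaseless timings. It assumes each cache is a JSON object with a "times" map from test identifier to seconds, which is my understanding of recent PHPUnit result caches but should be checked against the actual files. Averaging repeated measurements is also just one possible choice, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def merge_times(caches):
    """Average the recorded time of each test across several cache dicts.

    Assumption: every cache looks like {"times": {"Class::testFoo": 0.12, ...}}.
    """
    samples = defaultdict(list)
    for cache in caches:
        for test_id, seconds in cache.get("times", {}).items():
            samples[test_id].append(seconds)
    return {test_id: mean(values) for test_id, values in sorted(samples.items())}


def combine(database_caches, databaseless_caches):
    """Keep integration and unit timings apart, as the task description requires."""
    return {
        "database": merge_times(database_caches),
        "databaseless": merge_times(databaseless_caches),
    }


# Hypothetical caches from two parallel runs.
combined = combine(
    [{"times": {"FooTest::testA": 1.0}}, {"times": {"FooTest::testA": 3.0}}],
    [{"times": {"BarTest::testB": 0.5}}],
)
print(combined)
```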

As part of the previous spike, the phpunit-results-cache tool (source / toolforge tool) was created, which may provide inspiration for or the basis of a solution.

Acceptance Criteria

  • A server is running (possibly on toolforge) which:
    • receives notifications when new parallel CI runs are complete (or polls gerrit / Jenkins to discover new completed test runs) and integrates the new results cache data into a combined results file.
    • makes the combined results file available for unauthenticated download (by CI runs, for use in generating balanced split_groups).
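The two criteria above could be approached roughly as follows. This is a sketch under stated assumptions, not the tool's actual design: the state layout, the run identifiers, and the function names are hypothetical. It shows two properties that seem useful whether runs are discovered by notification or by polling: integrating the same run twice is a no-op, and the combined file is replaced atomically so a CI job never downloads a half-written file.

```python
import json
import os
import tempfile


def new_state():
    """In-memory state: which run/group pairs were integrated, plus the timings."""
    return {"seen": [], "results": {"database": {}, "databaseless": {}}}


def integrate_run(state, run_id, group, times):
    """Fold one run's timings into the state; return False if already integrated."""
    key = f"{run_id}/{group}"
    if key in state["seen"]:
        return False
    state["seen"].append(key)
    state["results"][group].update(times)  # assumption: the newest value wins
    return True


def write_combined(state, path):
    """Write to a temporary file, then rename it over the target in one step."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state["results"], handle)
    os.replace(tmp_path, path)
```

Any static file server or small web application could then serve the written file without authentication.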

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Fetch job data on demand | toolforge-repos/phpunit-results-cache!4 | lucaswerkmeister-wmde | latest | main

Event Timeline

ArthurTaylor renamed this task from "Create a PHPUnit results cache service to provide timing data for CI runs" to "[SW] Create a PHPUnit results cache service to provide timing data for CI runs". Jan 28 2025, 1:00 PM
ArthurTaylor renamed this task from "[SW] Create a PHPUnit results cache service to provide timing data for CI runs" to "Create a PHPUnit results cache service to provide timing data for CI runs". Feb 5 2025, 9:54 AM

Mentioned in SAL (#wikimedia-cloud) [2025-02-06T15:03:22Z] <wmbot~lucaswerkmeister-wmde@tools-bastion-13> deployed 5a969d593a ("latest" branch, MR !4, T384925, not yet merged into main) for testing

I deployed the above merge request on Toolforge so it can be tried out:

$ curl -s https://phpunit-results-cache.toolforge.org/results/quibble-vendor-mysql-php74-noselenium | jq '{ database: .database | length, databaseless: .databaseless | length }'
{
  "database": 53,
  "databaseless": 85
}

(Tweaking the result with jq just to show something without pasting the full and pretty long output ^^)
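For anyone who would rather check the response from a script than with jq, the same summary can be sketched in Python. The only thing taken from the output above is that the response is a JSON object with "database" and "databaseless" keys; the function name and the sample payload are made up for illustration.

```python
import json


def summarize(payload_text):
    """Count the entries under each top-level group, like the jq filter does."""
    payload = json.loads(payload_text)
    return {key: len(payload[key]) for key in ("database", "databaseless")}


# Hypothetical payload standing in for the real response body.
sample = json.dumps({
    "database": {"FooTest": 1.2, "BarTest": 0.4},
    "databaseless": {"BazTest": 0.01},
})
print(summarize(sample))
```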

Seems to work. I'm a bit confused as to why there are so few classes there, and there's a question about how the jobs downloading the results will know their job name - you can kinda parse it from $LOG_NAME, but that seems a bit hacky. I wonder if we need to have /results combine all the various cached results files for the different jobs.

Hm, I assumed it was available in the job but didn’t check that. For having /results combine the cached results, I guess we’d hard-code the list of jobs in the tool?

Yeah, hard-code it or put it in a YAML file in the repo. But we can do that in a follow-up ticket as far as I'm concerned.
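A possible shape for that follow-up, as a sketch only: a hard-coded job list and a function that merges the per-job results into one response. The job list contains just the one job named in this thread. The fetch callable, the assumption that each group maps test names to seconds, and the choice to keep the slowest observed time per test are all hypothetical.

```python
# Assumption: only the job mentioned in this thread; a real list would be longer
# and could equally be loaded from a YAML file in the repo.
JOBS = ["quibble-vendor-mysql-php74-noselenium"]

GROUPS = ("database", "databaseless")


def combine_all_jobs(fetch, jobs=JOBS):
    """Merge per-job results; `fetch(job)` returns that job's results dict."""
    combined = {group: {} for group in GROUPS}
    for job in jobs:
        result = fetch(job)
        for group in GROUPS:
            for test, seconds in result.get(group, {}).items():
                previous = combined[group].get(test, 0.0)
                # Keep the slowest time seen, to err on the side of caution.
                combined[group][test] = max(previous, seconds)
    return combined


# Usage with an in-memory stand-in for the per-job endpoint.
fake = {"job-a": {"database": {"FooTest": 1.0}}, "job-b": {"database": {"FooTest": 2.0}}}
print(combine_all_jobs(fake.get, jobs=["job-a", "job-b"]))
```

With something like this, CI jobs would not need to know their own job name to download timings.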

karapayneWMDE removed karapayneWMDE as the assignee of this task.
karapayneWMDE subscribed.