Update Excimer to 1.2.3 in production
Closed, Resolved · Public · BUG REPORT

Description

Our PHP 8.1 images ship Excimer 1.2.2:

 podman run --rm docker-registry.wikimedia.org/php8.1-cli:8.1.34-1-s6-20250316 php --ri excimer

excimer

excimer support => enabled
excimer version => 1.2.2

Directive => Local Value => Master Value
excimer.default_max_depth => 1000 => 1000
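
A check like the one above can be scripted when many images need auditing. The following is only a minimal sketch: it assumes the `php --ri excimer` output keeps the `excimer version => X.Y.Z` line shown here, and the helper names (`parse_excimer_version`, `is_affected`) are hypothetical, not part of any existing tooling.

```python
import re

# Versions affected by the regression reported in this task (1.2.2 only).
AFFECTED_VERSIONS = {(1, 2, 2)}


def parse_excimer_version(ri_output: str) -> tuple:
    """Extract the version as a tuple of ints from `php --ri excimer` output."""
    match = re.search(
        r"^excimer version => (\d+(?:\.\d+)*)\s*$", ri_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'excimer version' line found")
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(ri_output: str) -> bool:
    """True if the reported Excimer version is one of the affected ones."""
    return parse_excimer_version(ri_output) in AFFECTED_VERSIONS


# Sample text copied from the output in this task description.
SAMPLE = """excimer

excimer support => enabled
excimer version => 1.2.2
"""

if __name__ == "__main__":
    print(parse_excimer_version(SAMPLE), is_affected(SAMPLE))
```

In practice the text would come from running the `podman` command above and capturing its standard output.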

The logic that staggers the start time of initial sampling, which our continuous profiling via ArcLamp depends on, worked in Excimer <= 1.2.1, broke in Excimer 1.2.2, and was fixed again in Excimer 1.2.3. [1]
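
To illustrate why the staggered start matters, here is a simplified model, not Excimer's actual implementation. It assumes that without staggering the first sample fires one full period after the request starts, and that with staggering the first sample fires at a uniformly random offset within the period. The period, request duration, and request count are hypothetical numbers chosen for illustration.

```python
import math
import random


def count_samples(duration: float, period: float, first_offset: float) -> int:
    """Samples taken during a request lasting `duration` seconds, when the
    first sample fires at `first_offset` and later ones every `period`."""
    if first_offset > duration:
        return 0
    return math.floor((duration - first_offset) / period) + 1


def simulate(durations, period: float, staggered: bool, seed: int = 0) -> int:
    """Total samples collected across all requests in `durations`."""
    rng = random.Random(seed)
    total = 0
    for duration in durations:
        # Staggered: random initial offset in [0, period]. Otherwise: a full period.
        offset = rng.uniform(0.0, period) if staggered else period
        total += count_samples(duration, period, offset)
    return total


# Hypothetical workload: 10,000 requests of 200 ms each, sampled every 1 s.
DURATIONS = [0.2] * 10_000

if __name__ == "__main__":
    print("unstaggered:", simulate(DURATIONS, period=1.0, staggered=False))
    print("staggered:  ", simulate(DURATIONS, period=1.0, staggered=True))
```

Under this model, requests shorter than the period are never sampled without staggering, while with staggering each request is sampled with probability of about duration / period, so short requests show up in proportion to the time they consume.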

This is causing performance.wikimedia.org flamegraphs to show bogus data.

Please update Excimer to 1.2.3 in production.

[1]: https://gerrit.wikimedia.org/r/q/project:mediawiki/php/excimer

Event Timeline

> This is causing performance.wikimedia.org flamegraphs to show bogus data.

Specifically, it means we have no telemetry on the first few whole seconds of a response, which for the majority of requests is the entire duration of the response.

For example, on load.php:

Before: https://performance.wikimedia.org/arclamp/svgs/daily/2025-02-01.excimer-k8s.load.svgz

  • 33,000 samples a day
  • 20% covers WebStart, wmf-config/CommonSettings, Etcd, etc
  • 20% covers requests for modules=startup
  • 20% covers requests for other modules=*

After: https://performance.wikimedia.org/arclamp/svgs/daily/2025-03-17.excimer-k8s-php8.load.svgz

  • 2,000 samples a day
  • nothing from WebStart, wmf-config, etc
  • nothing from requests for modules=startup
  • 100% covers the tail end of execution only, and only from the 99th percentile of slowest module builds.

Thanks for flagging. There are a couple of moving parts to coordinate, but I'll aim to get a build with 1.2.3 out this week.

This turned out to be a bit more involved than expected, in part because the Debian PHP team has not uploaded 1.2.3 to unstable yet (e.g., associated artifacts like pristine tars are missing), and in part because some non-essential Build-Depends bumps have been picked up (e.g., dh-php) that are newer than what we have in component/php81.

In any case, I was able to hack around this a bit and arrive at 1.2.3-1 builds, which indeed pass the regression test for the staggered-start issue in 1.2.2 (as well as all others, of course).

As I mentioned, there are other package updates to coordinate with the production image rebuilds; however, I expect the rebuilds to happen this week.

Following up, two of the three package updates I was hoping to consolidate together (to minimize production image upgrades during the switchover week) are ready, but the third is still pending. At this point, I'm just going to go ahead first thing Monday with whatever is available at that time, including this.

Change #1130626 had a related patch set uploaded (by Scott French; author: Scott French):

[operations/docker-images/production-images@master] php8.1: rebuild to pick up new php and php-excimer packages

https://gerrit.wikimedia.org/r/1130626

Mentioned in SAL (#wikimedia-operations) [2025-03-24T15:45:55Z] <swfrench-wmf> reprepro include php-excimer 1.2.3-1+wmf11u1 in component/php81 - T389243

Change #1130626 merged by Scott French:

[operations/docker-images/production-images@master] php8.1: rebuild to pick up new php and php-excimer packages

https://gerrit.wikimedia.org/r/1130626

Mentioned in SAL (#wikimedia-operations) [2025-03-24T17:08:41Z] <swfrench-wmf> rebuilt php8.1 production image suite (8.1.32-1-s1) - T389243

Mentioned in SAL (#wikimedia-operations) [2025-03-24T17:09:57Z] <swfrench@deploy1003> Started scap sync-world: Deployment to pick up new php8.1 production image - T389243

Mentioned in SAL (#wikimedia-operations) [2025-03-24T17:18:37Z] <swfrench@deploy1003> swfrench: Deployment to pick up new php8.1 production image - T389243 synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-03-24T17:33:22Z] <swfrench@deploy1003> Finished scap sync-world: Deployment to pick up new php8.1 production image - T389243 (duration: 23m 54s)

Alright, the new excimer 1.2.3 package should now be live.

Spot checking https://performance.wikimedia.org/arclamp/svgs/daily/2025-03-24.excimer-k8s-php8.load.svgz after a couple of hours with the new php-excimer package live, I'm seeing over 10k samples in total and over 1k samples with call stacks traversing WebStart. Which is to say, I believe this is working as expected, and will close this out.
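
A spot check like this can also be done on the raw data rather than by eye. The sketch below assumes the samples are available in the collapsed-stack text format commonly fed to flamegraph tools (one `frame;frame;... count` line per unique stack). The function name `sum_samples` and the sample lines are hypothetical and are not real production data.

```python
def sum_samples(folded_lines, frame=None):
    """Sum sample counts over collapsed-stack lines.

    If `frame` is given, only stacks with a frame containing that
    substring are counted.
    """
    total = 0
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # The count is the last space-separated field; the rest is the stack.
        stack, _, count = line.rpartition(" ")
        frames = stack.split(";")
        if frame is None or any(frame in f for f in frames):
            total += int(count)
    return total


# Hypothetical collapsed stacks, for illustration only.
EXAMPLE = [
    "load.php;WebStart.php;Setup.php 3",
    "load.php;ResourceLoader::respond 5",
    "load.php;WebStart.php;CommonSettings.php 2",
]

if __name__ == "__main__":
    print("total samples:", sum_samples(EXAMPLE))
    print("through WebStart:", sum_samples(EXAMPLE, frame="WebStart"))
```

Comparing the two totals before and after the upgrade gives the same signal as the counts quoted above: overall sample volume, and how many samples traverse WebStart.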

Thanks again for spotting this!