dduvall (Dan Duvall)
Staff Software Engineer

User Details

User Since
Oct 7 2014, 4:24 PM (582 w, 3 d)
Availability
Available
IRC Nick
marxarelli
LDAP User
Dduvall
MediaWiki User
DDuvall (WMF)

Recent Activity

Mon, Dec 1

dduvall closed T410049: Buildkit v0.26.2 released as Resolved.
Mon, Dec 1, 10:22 PM · Patch-For-Review, Essential-Work, Release-Engineering-Team, GitLab (CI & Job Runners)
dduvall added a comment to T410049: Buildkit v0.26.2 released.

@dancy Not quite. We still need the WMCS/trusted runner changes in puppet.

Mon, Dec 1, 5:40 PM · Patch-For-Review, Essential-Work, Release-Engineering-Team, GitLab (CI & Job Runners)

Thu, Nov 20

dduvall updated the task description for T410049: Buildkit v0.26.2 released.
Thu, Nov 20, 7:59 PM · Patch-For-Review, Essential-Work, Release-Engineering-Team, GitLab (CI & Job Runners)

Oct 30 2025

dduvall closed T405681: 1.45.0-wmf.25 deployment blockers as Resolved.
Oct 30 2025, 7:41 PM · Essential-Work, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
dduvall closed T408540: PHP Deprecated: Asking for a replica from groups except dump/vslow is deprecated: watchlist [Called from Wikimedia\Rdbms\LoadBalancer::getConnectionInternal] as Resolved.
Oct 30 2025, 7:01 PM · MW-1.45-notes (1.45.0-wmf.25; 2025-10-28), FlaggedRevs, MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), StructuredDiscussions, Wikimedia-production-error
dduvall closed T408540: PHP Deprecated: Asking for a replica from groups except dump/vslow is deprecated: watchlist [Called from Wikimedia\Rdbms\LoadBalancer::getConnectionInternal], a subtask of T405681: 1.45.0-wmf.25 deployment blockers, as Resolved.
Oct 30 2025, 7:01 PM · Essential-Work, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
dduvall triaged T408851: PHP Deprecated: Asking for a replica from groups except dump/vslow is deprecated: watchlist [Called from Wikimedia\Rdbms\LoadBalancer::getConnectionInternal] as Unbreak Now! priority.
Oct 30 2025, 6:13 PM · Wikimedia-production-error
dduvall added a subtask for T405681: 1.45.0-wmf.25 deployment blockers: T408851: PHP Deprecated: Asking for a replica from groups except dump/vslow is deprecated: watchlist [Called from Wikimedia\Rdbms\LoadBalancer::getConnectionInternal].
Oct 30 2025, 6:13 PM · Essential-Work, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
dduvall added a parent task for T408851: PHP Deprecated: Asking for a replica from groups except dump/vslow is deprecated: watchlist [Called from Wikimedia\Rdbms\LoadBalancer::getConnectionInternal]: T405681: 1.45.0-wmf.25 deployment blockers.
Oct 30 2025, 6:13 PM · Wikimedia-production-error
dduvall created T408851: PHP Deprecated: Asking for a replica from groups except dump/vslow is deprecated: watchlist [Called from Wikimedia\Rdbms\LoadBalancer::getConnectionInternal].
Oct 30 2025, 6:12 PM · Wikimedia-production-error
dduvall raised the priority of T408667: recentchanges API result contains wrong entries with redirect: False from Medium to Unbreak Now!.

Oh sorry, @Dillon. We collided. I'll set the priority back and leave it as a blocker for historical posterity. Thanks for resolving it.

Oct 30 2025, 5:11 PM · MW-1.45-notes, MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), Moderator-Tools-Team (Kanban), MediaWiki-Action-API, Quality-and-Test-Engineering-Team (Test engineering), MediaWiki-Recent-changes, Pywikibot-tests, Pywikibot
dduvall lowered the priority of T408667: recentchanges API result contains wrong entries with redirect: False from Unbreak Now! to Medium.

Deescalating priority and removing this from 1.45.0-wmf.25 train blockers. I'll leave resolution up to you all.

Oct 30 2025, 5:08 PM · MW-1.45-notes, MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), Moderator-Tools-Team (Kanban), MediaWiki-Action-API, Quality-and-Test-Engineering-Team (Test engineering), MediaWiki-Recent-changes, Pywikibot-tests, Pywikibot
dduvall closed T408525: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: 'testwikidatawiki', Actual: the local wiki. Pass expected $wikiId. [Called from MediaWiki\Revision\RevisionRecord::getId] as Resolved.

Marking as resolved as I haven't seen the error since deploying the backport yesterday. Please reopen if need be.

Oct 30 2025, 5:06 PM · MW-1.45-notes (1.45.0-wmf.25; 2025-10-28), Wikidata-Omega, MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), Wikidata, Wikimedia-production-error
dduvall closed T408525: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: 'testwikidatawiki', Actual: the local wiki. Pass expected $wikiId. [Called from MediaWiki\Revision\RevisionRecord::getId], a subtask of T405681: 1.45.0-wmf.25 deployment blockers, as Resolved.
Oct 30 2025, 5:06 PM · Essential-Work, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments

Oct 29 2025

dduvall added a comment to T408525: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: 'testwikidatawiki', Actual: the local wiki. Pass expected $wikiId. [Called from MediaWiki\Revision\RevisionRecord::getId].

I've escalated this issue to a 1.45.0-wmf.25 train blocker. Thanks to everyone already working on it.

Oct 29 2025, 6:36 PM · MW-1.45-notes (1.45.0-wmf.25; 2025-10-28), Wikidata-Omega, MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), Wikidata, Wikimedia-production-error
dduvall added a subtask for T405681: 1.45.0-wmf.25 deployment blockers: T408525: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: 'testwikidatawiki', Actual: the local wiki. Pass expected $wikiId. [Called from MediaWiki\Revision\RevisionRecord::getId].
Oct 29 2025, 6:26 PM · Essential-Work, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
dduvall added a parent task for T408525: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: 'testwikidatawiki', Actual: the local wiki. Pass expected $wikiId. [Called from MediaWiki\Revision\RevisionRecord::getId]: T405681: 1.45.0-wmf.25 deployment blockers.
Oct 29 2025, 6:26 PM · MW-1.45-notes (1.45.0-wmf.25; 2025-10-28), Wikidata-Omega, MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), Wikidata, Wikimedia-production-error
dduvall triaged T408525: PHP Deprecated: Deprecated cross-wiki access to MediaWiki\Revision\RevisionRecord. Expected: 'testwikidatawiki', Actual: the local wiki. Pass expected $wikiId. [Called from MediaWiki\Revision\RevisionRecord::getId] as Unbreak Now! priority.

This seems to affect commons and wikidata as well. I've seen over 20k errors in the last 10 minutes following group1 promotion of 1.45.0-wmf.25.

Oct 29 2025, 6:25 PM · MW-1.45-notes (1.45.0-wmf.25; 2025-10-28), Wikidata-Omega, MW-1.46-notes (1.46.0-wmf.1; 2025-11-05), Wikidata, Wikimedia-production-error

Oct 28 2025

dduvall closed T407294: failed to configure registry cache exporter: invalid reference format error with new kokkuri as Resolved.

Closing this out. @cmassaro please reopen if the latest version of Kokkuri still does not work for you.

Oct 28 2025, 4:00 PM · Essential-Work, GitLab (CI & Job Runners), Release-Engineering-Team

Oct 27 2025

dduvall added a comment to T405119: Set up zuul web on zuul1001/zuul2001.

The other alternative is .. to not try to use TLS. I mean.. none of the other zookeeper servers in WMF prod do it.. as evidenced by the need to add support for it.

And we are not leaving our VM with this traffic.

Oct 27 2025, 8:44 PM · collaboration-services, Essential-Work, Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall added a comment to T405119: Set up zuul web on zuul1001/zuul2001.

Ok, great! I'm glad you found a path forward.

Oct 27 2025, 8:37 PM · collaboration-services, Essential-Work, Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall added a comment to T405119: Set up zuul web on zuul1001/zuul2001.

Disabling TLS appears to work, so it seems the server does not actually have TLS enabled. :)

Oct 27 2025, 8:31 PM · collaboration-services, Essential-Work, Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall added a comment to T405119: Set up zuul web on zuul1001/zuul2001.

@Dzahn Looking at the zuul-web container logs, it seems like the zookeeper connection is failing outright.

Oct 27 2025, 8:14 PM · collaboration-services, Essential-Work, Continuous-Integration-Infrastructure (Zuul upgrade)

Oct 22 2025

dduvall closed T407916: Add a floating `v1` tag that tracks the latest stable Blubber release as Resolved.
2025-10-22 22:12:12,151 Built target 'buildkit'
2025-10-22 22:12:12,151 Target 'buildkit' image published to 'docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1@sha256:dbfdff4703fdf6f74d0eed3169b6d56006e6f272704d15c03046678f9737002e'
2025-10-22 22:12:12,152 Target 'buildkit' image published to 'docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1.5@sha256:dbfdff4703fdf6f74d0eed3169b6d56006e6f272704d15c03046678f9737002e'
2025-10-22 22:12:12,152 Target 'buildkit' image published to 'docker-registry.wikimedia.org/repos/releng/blubber/buildkit:v1.5.1@sha256:dbfdff4703fdf6f74d0eed3169b6d56006e6f272704d15c03046678f9737002e'
Oct 22 2025, 10:14 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Release Pipeline (Blubber)
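
The log lines in the entry above show one image digest published under three tags: `v1`, `v1.5`, and `v1.5.1`. As a minimal sketch of how such floating tags could be derived from a release version, assuming plain MAJOR.MINOR.PATCH versions (the function name `floating_tags` is hypothetical and is not taken from Blubber's release tooling):

```python
def floating_tags(version: str) -> list[str]:
    """Return the floating tags implied by a release version such as "1.5.1"."""
    parts = version.lstrip("v").split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    # Join progressively longer prefixes: MAJOR, MAJOR.MINOR, MAJOR.MINOR.PATCH.
    return ["v" + ".".join(parts[: i + 1]) for i in range(len(parts))]


print(floating_tags("1.5.1"))
```

Under this scheme a consumer pinned to `v1` would pick up each new stable 1.x release, while `v1.5.1` stays fixed.
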
dduvall closed T405651: Enable use of `.kokkuri:bake` on trusted runners as Resolved.
Oct 22 2025, 9:45 PM · Essential-Work, GitLab (CI & Job Runners), Release-Engineering-Team (Doing 😎)
dduvall updated the task description for T405651: Enable use of `.kokkuri:bake` on trusted runners.
Oct 22 2025, 9:45 PM · Essential-Work, GitLab (CI & Job Runners), Release-Engineering-Team (Doing 😎)

Oct 21 2025

dduvall added a comment to T407294: failed to configure registry cache exporter: invalid reference format error with new kokkuri.

@cmassaro try the latest Kokkuri release (2.11.0). It should work with your existing configuration.

Oct 21 2025, 4:20 PM · Essential-Work, GitLab (CI & Job Runners), Release-Engineering-Team

Oct 15 2025

dduvall added a comment to T407294: failed to configure registry cache exporter: invalid reference format error with new kokkuri.

Jobs using needs are not getting the dotenv artifacts from kokkuri:setup-variables. Apparently this is a documented side effect of using explicit needs.

Oct 15 2025, 5:26 PM · Essential-Work, GitLab (CI & Job Runners), Release-Engineering-Team

Oct 2 2025

dduvall added a subtask for T403125: Investigate WMCS Magnum for GitLab runners: T406271: Grant gitlab-runners-staging access to fast-iops volume type and a 4xiops instance flavor.
Oct 2 2025, 9:18 PM · collaboration-services, Release-Engineering-Team (Priority Backlog 📥), GitLab (CI & Job Runners)
dduvall added a parent task for T406271: Grant gitlab-runners-staging access to fast-iops volume type and a 4xiops instance flavor: T403125: Investigate WMCS Magnum for GitLab runners.
Oct 2 2025, 9:18 PM · Release-Engineering-Team (Radar), Cloud-VPS (Quota-requests)
dduvall created T406271: Grant gitlab-runners-staging access to fast-iops volume type and a 4xiops instance flavor.
Oct 2 2025, 9:18 PM · Release-Engineering-Team (Radar), Cloud-VPS (Quota-requests)

Sep 30 2025

dduvall added a comment to T405118: Set up zuul scheduler on zuul1001.

I tried it and I can confirm using mysql+pymysql gets us past the error.

Sep 30 2025, 7:49 PM · collaboration-services, Essential-Work, Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall added a comment to T405118: Set up zuul scheduler on zuul1001.

Interesting! This looks like it might be something missing in the executor image. My guess is that we copied upstream, but upstream doesn't use mariadb like we do. @dduvall can you take a look at this?

Sep 30 2025, 5:40 PM · collaboration-services, Essential-Work, Continuous-Integration-Infrastructure (Zuul upgrade)

Sep 25 2025

dduvall changed the status of T405651: Enable use of `.kokkuri:bake` on trusted runners from Open to In Progress.
Sep 25 2025, 6:54 PM · Essential-Work, GitLab (CI & Job Runners), Release-Engineering-Team (Doing 😎)
dduvall created T405651: Enable use of `.kokkuri:bake` on trusted runners.
Sep 25 2025, 6:54 PM · Essential-Work, GitLab (CI & Job Runners), Release-Engineering-Team (Doing 😎)

Sep 22 2025

dduvall created T405287: Hanging NotReady status following OOM on node.
Sep 22 2025, 10:22 PM · Release-Engineering-Team (Priority Backlog 📥), GitLab (CI & Job Runners)

Sep 17 2025

dduvall added a subtask for T396380: 1.45.0-wmf.19 deployment blockers: T404902: Wikimedia\Assert\InvariantException: Invariant failed: getBasePageBundle called on non-Parsoid ContentHolder.
Sep 17 2025, 6:28 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Release, Train Deployments
dduvall added a parent task for T404902: Wikimedia\Assert\InvariantException: Invariant failed: getBasePageBundle called on non-Parsoid ContentHolder: T396380: 1.45.0-wmf.19 deployment blockers.
Sep 17 2025, 6:28 PM · MW-1.45-notes (1.45.0-wmf.19; 2025-09-16), MW-Interfaces-Team, MediaWiki-REST-API, MediaWiki-Parser, Wikimedia-production-error
dduvall created T404902: Wikimedia\Assert\InvariantException: Invariant failed: getBasePageBundle called on non-Parsoid ContentHolder.
Sep 17 2025, 6:25 PM · MW-1.45-notes (1.45.0-wmf.19; 2025-09-16), MW-Interfaces-Team, MediaWiki-REST-API, MediaWiki-Parser, Wikimedia-production-error

Sep 15 2025

dduvall created T404668: Increase gitlab-runners-staging volumes to 12.
Sep 15 2025, 11:36 PM · Cloud-VPS (Quota-requests)
dduvall reopened T404386: Request creation of gitlab-runners-staging VPS project as "Open".

@Andrew I don't see any zones listed in the project. Is that normal for a new project?

No, it is not normal for a project to have 0 Designate zones. There should be svc.$PROJECT.eqiad1.wikimedia.cloud., $PROJECT.eqiad1.wmcloud.org., and $PROJECT.wmcloud.org. zones assigned to the project in Designate.

Sep 15 2025, 3:52 PM · Cloud-VPS (Project-requests)

Sep 12 2025

dduvall added a comment to T404386: Request creation of gitlab-runners-staging VPS project.

@Andrew I don't see any zones listed in the project. Is that normal for a new project?

Sep 12 2025, 10:37 PM · Cloud-VPS (Project-requests)

Sep 11 2025

dduvall updated the task description for T404386: Request creation of gitlab-runners-staging VPS project.
Sep 11 2025, 7:07 PM · Cloud-VPS (Project-requests)
dduvall created T404386: Request creation of gitlab-runners-staging VPS project.
Sep 11 2025, 6:59 PM · Cloud-VPS (Project-requests)
dduvall added a comment to T404150: Additional floating IPs for gitlab-cloud-runner testing in testlabs project.

I asked @Andrew about this, and my understanding is that floating IPs are not required to create Octavia load balancers in OpenStack. But I don't have a full understanding of how Magnum works, so I might be wrong! Can you share more details like tofu code, errors you're getting, etc.?

Sep 11 2025, 4:35 PM · Release-Engineering-Team (Radar), Cloud-VPS (Quota-requests)

Sep 10 2025

dduvall added a comment to T404238: InvalidArgumentException: $aspect must use one of the XXX_USAGE constants, "A" given!.

A spike of these errors occurred during wmf.18 group1 promotion today but strangely all instances of the error were from 1.45.0-wmf.17.

Sep 10 2025, 6:23 PM · Wikidata Integration in Wikimedia projects, Wikidata, Wikimedia-production-error

Sep 9 2025

dduvall created T404150: Additional floating IPs for gitlab-cloud-runner testing in testlabs project.
Sep 9 2025, 9:43 PM · Release-Engineering-Team (Radar), Cloud-VPS (Quota-requests)

Sep 4 2025

dduvall updated the task description for T396245: Build zuul images for production.
Sep 4 2025, 2:58 PM · Essential-Work, collaboration-services, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall closed T396245: Build zuul images for production as Resolved.
Sep 4 2025, 2:58 PM · Essential-Work, collaboration-services, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall updated the task description for T396245: Build zuul images for production.
Sep 4 2025, 2:58 PM · Essential-Work, collaboration-services, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)

Aug 15 2025

dduvall added a comment to T392526: Refactor `build-images.py` to use a common code image and `docker buildx`.

I'm also unsure how to resolve the difference in package name - e.g., whether there's some suitable override mechanism (create an empty transitional package?) - or whether there are subtle differences in the packaging configuration that make the result incompatible.

One thing to check is whether these debs are actually arch-dependent. Most vendor debs are usually just statically linked, so we could try one of them in a bullseye and a bookworm container to find out. If so, we could have a single update definition and then use it to sync to bullseye and bookworm.

Aug 15 2025, 4:40 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥)

Jul 23 2025

dduvall added a comment to T392610: SpiderPig should support train deployments.

SpiderPig is hilarious and awesome. Great job everyone!

Jul 23 2025, 4:28 PM · Essential-Work, Scap (SpiderPig 🕸️), Release-Engineering-Team (Yak Shaving 🐃🪒)

Jul 22 2025

dduvall added a comment to T398873: Move nightly image build from releases-jenkins to deployment.eqiad.wmnet.

Yes, but in the meantime scap prep next would continue to work correctly (assuming at least one prior successful branch cut was merged). As it stands, if `MediaWiki branch and publish WMF single-version image` fails, a subsequent scap prep next will fail.

Jul 22 2025, 9:57 PM · Release-Engineering-Team (Doing 😎), OKR-Work
dduvall added a comment to T398873: Move nightly image build from releases-jenkins to deployment.eqiad.wmnet.

@dduvall I'd like to see MediaWiki branch and publish WMF single-version image changed so that instead of destroying and recreating the wmf/next branch each time it runs, it updates wmf/next if it already exists. This means being able to handle added/dropped extensions.

Jul 22 2025, 9:49 PM · Release-Engineering-Team (Doing 😎), OKR-Work
dduvall added a comment to T398873: Move nightly image build from releases-jenkins to deployment.eqiad.wmnet.

If CI fails (which happens about 10% of the time), we're left with an unusable wmf/next branch until the next run.

Jul 22 2025, 9:45 PM · Release-Engineering-Team (Doing 😎), OKR-Work

Jul 11 2025

dduvall added a comment to T399120: [kokuri] Use a unique per CI run tag by default.

@bd808 Kokkuri 2.8.0 will include the digest in the image ref. See if that solves your issue.

Jul 11 2025, 10:00 PM · Release-Engineering-Team (Doing 😎), Patch-For-Review, GitLab (CI & Job Runners)

Jun 18 2025

dduvall added a comment to T395938: puppetize setup of new zuul VMs.

@Dzahn the WMF-based production images for Zuul and Nodepool have been built and published to our registry. Tomorrow I'll post a summary of how we're managing them in T396245: Build zuul images for production, but here are the latest image refs by service:

Jun 18 2025, 11:59 PM · Patch-For-Review, collaboration-services, Continuous-Integration-Infrastructure (Zuul upgrade)

Jun 9 2025

dduvall added a comment to T390119: Plan for porting PipelineLib to Zuul Ansible.

Looking at the above results, I believe that most of the functionality being served by PipelineLib could potentially be served by docker buildx bake (in conjunction w/ buildkitd and Blubber). Docker bake can build multiple sets of targets/contexts/configs simultaneously and even export the results as generic artifacts (to serve the one case that is using copy).

Jun 9 2025, 11:50 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall added a comment to T390119: Plan for porting PipelineLib to Zuul Ansible.

PipelineLib actions in use, according to codesearch results of 27 Gerrit hosted projects that include a .pipeline/config.yaml file.

Jun 9 2025, 11:13 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall updated the task description for T390119: Plan for porting PipelineLib to Zuul Ansible.
Jun 9 2025, 10:48 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Continuous-Integration-Infrastructure (Zuul upgrade)

Jun 6 2025

dduvall claimed T396245: Build zuul images for production.
Jun 6 2025, 11:00 PM · Essential-Work, collaboration-services, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall updated subscribers of T396245: Build zuul images for production.

I refactored the blubber.yaml that @dancy had written back when we were experimenting with a Zuul setup for GitLab and created a wmf/12.0.0 branch.

Jun 6 2025, 11:00 PM · Essential-Work, collaboration-services, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)

Jun 5 2025

dduvall updated subscribers of T396111: Wikimedia\NormalizedException\NormalizedException: Invalid username: {username}.

Spotted this today as well, following wmf.4 promotion to all wikis.

Jun 5 2025, 6:35 PM · ConfirmEdit (CAPTCHA extension), MediaWiki-Platform-Team, MediaWiki-extensions-CentralAuth, Wikimedia-production-error

Jun 3 2025

dduvall added a subtask for T392174: 1.45.0-wmf.4 deployment blockers: T395957: PHP Warning: Undefined array key "clientPref".
Jun 3 2025, 7:35 PM · Release-Engineering-Team (Priority Backlog 📥), Essential-Work, Release, Train Deployments
dduvall added a parent task for T395957: PHP Warning: Undefined array key "clientPref": T392174: 1.45.0-wmf.4 deployment blockers.
Jun 3 2025, 7:35 PM · MW-1.45-notes (1.45.0-wmf.5; 2025-06-10), MediaWiki-Platform-Team, MediaWiki-extensions-CentralAuth, Wikimedia-production-error
dduvall triaged T395957: PHP Warning: Undefined array key "clientPref" as Unbreak Now! priority.
Jun 3 2025, 7:35 PM · MW-1.45-notes (1.45.0-wmf.5; 2025-06-10), MediaWiki-Platform-Team, MediaWiki-extensions-CentralAuth, Wikimedia-production-error
dduvall created T395957: PHP Warning: Undefined array key "clientPref".
Jun 3 2025, 7:33 PM · MW-1.45-notes (1.45.0-wmf.5; 2025-06-10), MediaWiki-Platform-Team, MediaWiki-extensions-CentralAuth, Wikimedia-production-error

May 14 2025

dduvall removed a member for MW-on-K8s: dduvall.
May 14 2025, 8:24 PM

May 6 2025

dduvall created T393496: Increase zuul3 quotas for cpu/ram/disk/instances.
May 6 2025, 5:35 PM · cloud-services-team, Release-Engineering-Team (Priority Backlog 📥), Continuous-Integration-Infrastructure (Zuul upgrade), Cloud-VPS (Quota-requests)
dduvall closed T391374: Stand up Zuul 11 experiment environment in zuul3 cloud VPS project as Resolved.
May 6 2025, 5:22 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)

May 1 2025

dduvall added a comment to T393034: Investigate out of date refs following gerrit switchover.

@thcipriani is this still a blocker or are we good for group1/all wiki promotion today?

May 1 2025, 3:59 PM · Wikimedia-Incident, Release-Engineering-Team, collaboration-services, Gerrit

Apr 24 2025

dduvall updated the task description for T392610: SpiderPig should support train deployments.
Apr 24 2025, 3:58 PM · Essential-Work, Scap (SpiderPig 🕸️), Release-Engineering-Team (Yak Shaving 🐃🪒)
dduvall created T392610: SpiderPig should support train deployments.
Apr 24 2025, 3:57 PM · Essential-Work, Scap (SpiderPig 🕸️), Release-Engineering-Team (Yak Shaving 🐃🪒)
dduvall added a comment to T390251: docker-registry.wikimedia.org keeps serving bad blobs.

serializes only layers per push. Multiple pushes can still happen simultaneously and IIUC scap does do that. Maybe we could have a flag in scap like e.g. --image-push-concurrency=1 to at least verify/rule out this hypothesis? It would slow down deployment for a couple of weeks but would give a strong signal to guide us better into resolving this.

+1 I like this, @dancy @dduvall what do you think about it?

As a short-term mitigation, it seems reasonable, and serializing the image pushes in build-images.py should be straightforward.

Apr 24 2025, 3:21 PM · Patch-For-Review, serviceops
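
The short-term mitigation discussed in the entry above, limiting how many image pushes run at once, can be sketched as follows. This is an illustration under stated assumptions, not the actual `build-images.py` code: `push_all` and the `push_image` callable are hypothetical names, and `concurrency` only mirrors the idea behind the suggested `--image-push-concurrency=1` flag.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def push_all(
    image_refs: Iterable[str],
    push_image: Callable[[str], None],
    concurrency: int = 1,
) -> list[str]:
    """Push every image, allowing at most `concurrency` pushes at once.

    `push_image` is a hypothetical callable standing in for whatever
    actually performs the push (for example, a `docker push` subprocess).
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    refs = list(image_refs)
    # With max_workers=1 the pool runs one push at a time, in submission order.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # list() consumes the results so an exception from any push is re-raised here.
        list(pool.map(push_image, refs))
    return refs
```

Raising `concurrency` later restores parallel pushes without changing the call sites, which would make it straightforward to compare registry behavior with and without serialization.
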
dduvall changed the status of T392526: Refactor `build-images.py` to use a common code image and `docker buildx` from Open to In Progress.
Apr 24 2025, 3:19 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥)

Apr 23 2025

dduvall updated the task description for T392526: Refactor `build-images.py` to use a common code image and `docker buildx`.
Apr 23 2025, 6:44 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥)
dduvall created T392526: Refactor `build-images.py` to use a common code image and `docker buildx`.
Apr 23 2025, 6:38 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥)

Apr 22 2025

dduvall added a comment to T390251: docker-registry.wikimedia.org keeps serving bad blobs.

serializes only layers per push. Multiple pushes can still happen simultaneously and IIUC scap does do that. Maybe we could have a flag in scap like e.g. --image-push-concurrency=1 to at least verify/rule out this hypothesis? It would slow down deployment for a couple of weeks but would give a strong signal to guide us better into resolving this.

+1 I like this, @dancy @dduvall what do you think about it?

Apr 22 2025, 11:47 PM · Patch-For-Review, serviceops

Apr 17 2025

dduvall updated subscribers of T391869: PHP Warning: Undefined property: Wikimedia\Parsoid\NodeData\DataMw::$caption.

Looks to have been introduced in:

Apr 17 2025, 6:18 PM · Essential-Work, Content-Transform-Team (Work In Progress), Parsoid, Wikimedia-production-error

Apr 16 2025

dduvall removed a subtask for T386220: 1.44.0-wmf.25 deployment blockers: T392086: PHP Warning: Array to string conversion / RuntimeException: PCRE failure on Special:PasswordReset.
Apr 16 2025, 6:23 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Release, Train Deployments
dduvall removed a parent task for T392086: PHP Warning: Array to string conversion / RuntimeException: PCRE failure on Special:PasswordReset: T386220: 1.44.0-wmf.25 deployment blockers.
Apr 16 2025, 6:23 PM · MW-1.44-notes (1.44.0-wmf.25; 2025-04-15), MW-1.43-notes, MediaWiki-Platform-Team, MediaWiki-User-login-and-signup, Wikimedia-production-error
dduvall lowered the priority of T392086: PHP Warning: Array to string conversion / RuntimeException: PCRE failure on Special:PasswordReset from Unbreak Now! to Medium.

Removing this task as a blocker as the errors only occurred during a short-ish window, occurred for wmf.24 as well as wmf.25, and only for internal wikis.

Apr 16 2025, 6:23 PM · MW-1.44-notes (1.44.0-wmf.25; 2025-04-15), MW-1.43-notes, MediaWiki-Platform-Team, MediaWiki-User-login-and-signup, Wikimedia-production-error
dduvall added a project to T392086: PHP Warning: Array to string conversion / RuntimeException: PCRE failure on Special:PasswordReset: MediaWiki-Platform-Team.
Apr 16 2025, 6:03 PM · MW-1.44-notes (1.44.0-wmf.25; 2025-04-15), MW-1.43-notes, MediaWiki-Platform-Team, MediaWiki-User-login-and-signup, Wikimedia-production-error
dduvall added a comment to T391935: scap train-presync failed to push image: blob upload unknown.

Closed as a duplicate that, while ongoing, is not strictly a train blocker.

Apr 16 2025, 4:58 PM · serviceops, Release-Engineering-Team
dduvall merged T391935: scap train-presync failed to push image: blob upload unknown into T390251: docker-registry.wikimedia.org keeps serving bad blobs.
Apr 16 2025, 4:57 PM · Patch-For-Review, serviceops
dduvall merged task T391935: scap train-presync failed to push image: blob upload unknown into T390251: docker-registry.wikimedia.org keeps serving bad blobs.
Apr 16 2025, 4:57 PM · serviceops, Release-Engineering-Team

Apr 15 2025

dduvall added a comment to T390251: docker-registry.wikimedia.org keeps serving bad blobs.

Other possibly relevant discussions around this issue.

Apr 15 2025, 5:24 PM · Patch-For-Review, serviceops
dduvall added a comment to T390251: docker-registry.wikimedia.org keeps serving bad blobs.

Also, I wonder if there's a way we can force monolithic uploads?

Apr 15 2025, 5:22 PM · Patch-For-Review, serviceops
dduvall added a comment to T390251: docker-registry.wikimedia.org keeps serving bad blobs.

Also, I wonder if there's a way we can force monolithic uploads?

Apr 15 2025, 5:05 PM · Patch-For-Review, serviceops
dduvall added a comment to T390251: docker-registry.wikimedia.org keeps serving bad blobs.

In the case of uploads, here is the bad sequence:

  • The client (e.g. dockerd) issues POST /v2/<repo>/blobs/uploads/ to initiate an upload. This returns a new URL for subsequent operations (hereafter called the upload URL).
  • The client issues a PATCH to the upload URL to transmit the data.
  • The client issues a PUT to the upload URL to finalize the upload. This is where a 404 is sometimes returned by the registry (basically saying that it doesn't know about this upload). A 404 is more likely to be seen if a prior upload was large (i.e., if the replicator is busy). Retrying this PUT does eventually succeed.

In this case it's not the content of the upload that hasn't made it to the replica, but the existence of the upload state itself.

Apr 15 2025, 5:04 PM · Patch-For-Review, serviceops
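
The comment above notes that retrying the finalizing PUT eventually succeeds once the upload state becomes visible. A minimal sketch of that retry, assuming a hypothetical `put_upload` callable that issues the PUT to the upload URL and returns the HTTP status code (the function name, attempt limit, and backoff values are illustrative and do not come from dockerd or scap):

```python
import time
from typing import Callable


def finalize_upload(
    put_upload: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Retry the finalizing PUT while the registry answers 404.

    Returns the number of attempts used once a 2xx status is seen.
    """
    for attempt in range(1, max_attempts + 1):
        status = put_upload()
        if status != 404:
            if 200 <= status < 300:
                return attempt
            raise RuntimeError(f"unexpected status {status} finalizing upload")
        if attempt < max_attempts:
            # Exponential backoff: base_delay, then 2x, 4x, and so on.
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"upload still unknown after {max_attempts} attempts")
```

Only a 404 is retried here, since that is the one status the comment ties to the missing upload state; any other non-2xx response is surfaced immediately.
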
dduvall added a comment to T390251: docker-registry.wikimedia.org keeps serving bad blobs.

At this point we have two problems.

  • Large image pushes are now unreliable (this seems new for mediawiki deployments). No workaround proposed yet.
Apr 15 2025, 4:17 PM · Patch-For-Review, serviceops

Apr 14 2025

dduvall added a comment to T391374: Stand up Zuul 11 experiment environment in zuul3 cloud VPS project.

The Zuul dashboard is available at https://zuul-dev.wmcloud.org/tenants

Apr 14 2025, 11:52 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall added a comment to T391374: Stand up Zuul 11 experiment environment in zuul3 cloud VPS project.

@dduvall excellent!

The event stream permission is probably good enough. It does not grant any specific access besides the ability to receive events, and we have multiple bots on WMCS using that same setup. As long as the user is not granted more permissions, it can't do much. Setting up a Gerrit + repos + config might add a bit of a burden, so if you can reuse an existing setup, that gives us a great playground \o/

Apr 14 2025, 11:44 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)

Apr 8 2025

dduvall edited projects for T391374: Stand up Zuul 11 experiment environment in zuul3 cloud VPS project, added: Release-Engineering-Team (Doing 😎); removed Release-Engineering-Team (Priority Backlog 📥).

@hashar FYI I've set up Zuul and friends on zuul-1001.zuul3.eqiad1.wikimedia.cloud using https://opendev.org/zuul/zuul/src/branch/master/doc/source/examples/docker-compose.yaml

Apr 8 2025, 11:30 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)
dduvall changed the status of T391374: Stand up Zuul 11 experiment environment in zuul3 cloud VPS project from Open to In Progress.
Apr 8 2025, 11:15 PM · Essential-Work, Release-Engineering-Team (Doing 😎), Continuous-Integration-Infrastructure (Zuul upgrade)

Mar 26 2025

dduvall added a comment to T389499: Refactor scap's kubernetes DeploymentsConfig to support selection of image kinds.

So, a possibly adequate analogy for the relationship between image "kind" (a new name for an existing concept that did not previously have a name) and image "flavour" (an existing name for an existing concept) would be that between a class definition and the specific set of constructor arguments that produce a concrete instantiation.

Mar 26 2025, 3:35 PM · MW-on-K8s, Release-Engineering-Team, serviceops
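
The analogy in the comment above (a "kind" is like a class definition, a "flavour" is like one set of constructor arguments) can be made concrete with a small sketch. Every name here is hypothetical and chosen only for illustration: `ExampleKind`, its fields, and the flavour names are not taken from scap's DeploymentsConfig.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExampleKind:
    """Plays the role of an image "kind": a definition, not a built image."""

    runtime_version: str  # hypothetical parameter
    debug: bool = False  # hypothetical parameter


# Each "flavour" plays the role of one specific set of constructor arguments.
EXAMPLE_FLAVOURS = {
    "flavour-a": {"runtime_version": "1.0"},
    "flavour-b": {"runtime_version": "1.0", "debug": True},
}

# Applying a flavour's arguments to the kind yields a concrete instance.
images = {name: ExampleKind(**args) for name, args in EXAMPLE_FLAVOURS.items()}
print(images["flavour-b"])
```

In this reading, selecting a kind picks which definition to use, while the flavour determines which concrete variant of it is instantiated.
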

Mar 18 2025

dduvall added a comment to T388769: Add support for Alpine Linux in Blubber.

The apt config and implementation are also specific to Debian-based base images. Alpine base images would need an apk config and implementation. Red Hat-based base images would need a yum (or dnf?) config and implementation. Arch uses pacman. OpenSUSE uses zypper. I am not aware of any unifying abstraction over the various distro-specific package managers that would readily simplify this.

Mar 18 2025, 3:47 PM · Release Pipeline (Blubber)

Mar 17 2025

dduvall added a comment to T388769: Add support for Alpine Linux in Blubber.

Thanks for pointing out that this isn't a bug, @bd808.

Mar 17 2025, 9:10 PM · Release Pipeline (Blubber)

Mar 7 2025

dduvall added a comment to T387927: Improve garbage collection of unused MediaWiki images on deployment host.

I agree with running a daily timer and trying to spread the knowledge about the availability of scap clean-images to quickly recover space in unusual circumstances.

Mar 7 2025, 10:08 PM · Essential-Work, Release-Engineering-Team (Doing 😎)
dduvall added a comment to T387927: Improve garbage collection of unused MediaWiki images on deployment host.

The scap clean-images implementation has been merged. I plan on doing a release early next week. Sample behavior from train-dev:

Mar 7 2025, 7:18 PM · Essential-Work, Release-Engineering-Team (Doing 😎)