
dancy (Ahmon Dancy)
Staff Software Engineer, Release Engineering · Administrator

User Details

User Since
Jun 27 2020, 12:14 AM (198 w, 17 h)
Roles
Administrator
Availability
Available
IRC Nick
dancy
LDAP User
Ahmon Dancy
MediaWiki User
ADancy (WMF)

Recent Activity

Thu, Apr 11

dancy added a comment to T329857: MediaWiki deploy servers should not be mediawiki installation targets.

@Clement_Goubert I noticed the /srv/mediawiki.old.20230424.T329857 directory on deploy1002.eqiad.wmnet today. It's safe to delete.

Thu, Apr 11, 8:37 PM · serviceops, Performance-Team (Radar), Deployments, Release-Engineering-Team
dancy changed the status of T359643: Get rid of the /srv/mediawiki/php symbolic link from Open to In Progress.
Thu, Apr 11, 8:27 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap

Tue, Apr 9

dancy added a comment to T359643: Get rid of the /srv/mediawiki/php symbolic link.

Thanks for the clarification. In that case I will keep experimenting. I may end up having to change static.php to accommodate this special case.

Tue, Apr 9, 8:31 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy added a comment to T359643: Get rid of the /srv/mediawiki/php symbolic link.

@dancy Are you proposing a redirect in general, or only for appservers? (Is this easier than a rewrite?)

I believe we intentionally chose this as canonical URL such that it only exists on mediawiki.org as opposed to the docs directory which is reachable from any wiki. This is similar to e.g. https://www.mediawiki.org/xml/ in that regard, so as to not create ambiguity over from where it should be used.

Tue, Apr 9, 8:24 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy added a comment to T288381: Connect WikiBugs IRC bot to Wikimedia GitLab.

@bd808 I read over your proposal and all of the ideas sound reasonable. The code behind gitlab-webhooks is pretty simple and can easily be modified to have a better design (for example, separate producer and consumer threads, storing unprocessed events in a file, etc). I'm happy to work with you on necessary changes, or to just review.

Tue, Apr 9, 3:20 PM · User-bd808, Release-Engineering-Team (Priority Backlog 📥), GitLab (Integrations), Wikibugs

Mon, Apr 8

dancy added a comment to T361608: RESTBase scap deployment failed.

It failed again but with a different error:

:* restbase1030.eqiad.wmnet

Setting lfs.url of restbase to https://gerrit.wikimedia.org/r/mediawiki/services/restbase/info/lfs
Running ['git', 'config', 'lfs.url', 'https://gerrit.wikimedia.org/r/mediawiki/services/restbase/info/lfs'] with {'cwd': '/srv/deployment/restbase/deploy-cache/cache/restbase', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Unhandled error:
deploy-local failed: <FileNotFoundError> {}
Mon, Apr 8, 3:20 PM · Scap, Release-Engineering-Team, RESTBase
dancy created P59861 T361608-re-run.
Mon, Apr 8, 3:19 PM

Fri, Apr 5

dancy closed T361720: Helm was left in limbo due to interrupted deployment/rollback as Resolved.

Scap 4.75.0 deployed with the fix.

Fri, Apr 5, 2:59 PM · Release-Engineering-Team, Scap, serviceops, Wikimedia-Incident
dancy closed T361720: Helm was left in limbo due to interrupted deployment/rollback , a subtask of T361706: 2024-04-03 calico/typha down, as Resolved.
Fri, Apr 5, 2:57 PM · Patch-For-Review, Prod-Kubernetes, Wikimedia-Incident

Thu, Apr 4

dancy lowered the priority of T361720: Helm was left in limbo due to interrupted deployment/rollback from Unbreak Now! to Medium.
Thu, Apr 4, 8:19 PM · Release-Engineering-Team, Scap, serviceops, Wikimedia-Incident
dancy added a comment to T361724: scap should check if it is running within a tmux/screen.

There is a tmux/screen check for scap stage-train, but nothing else. This could be factored out to cover other scap subcommands.
Suggestions:

  • scap backport
  • scap deploy
  • scap deploy-promote
  • scap sync-*
  • scap train
  • scap stage-train
  • scap lock
Thu, Apr 4, 6:58 PM · Patch-For-Review, Sustainability (Incident Followup), Scap, Release-Engineering-Team, serviceops
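
The factored-out check suggested in the T361724 comment above could look roughly like the following. This is a minimal sketch, not scap's actual implementation: the function names and the `GUARDED_SUBCOMMANDS` set are hypothetical, and it assumes that detecting a multiplexer through the `TMUX` (tmux) and `STY` (GNU screen) environment variables is good enough.

```python
import os

# Hypothetical list for illustration, taken from the suggestions in the comment.
# "sync-*" is handled by the prefix test in needs_multiplexer_warning().
GUARDED_SUBCOMMANDS = {
    "backport",
    "deploy",
    "deploy-promote",
    "train",
    "stage-train",
    "lock",
}


def in_terminal_multiplexer(environ=None):
    """Return True if the environment looks like a tmux or screen session.

    tmux exports TMUX and GNU screen exports STY to the processes they start.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("TMUX") or env.get("STY"))


def needs_multiplexer_warning(subcommand, environ=None):
    """Return True if a guarded subcommand is run outside tmux/screen."""
    guarded = subcommand in GUARDED_SUBCOMMANDS or subcommand.startswith("sync-")
    return guarded and not in_terminal_multiplexer(environ)
```

Passing the environment as a parameter keeps the check easy to test without touching the real `os.environ`.
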
dancy reopened T361720: Helm was left in limbo due to interrupted deployment/rollback as "Open".

Scap already has a check for helm releases in pending-upgrade state but it looks like it needs to handle pending-rollback too.

Thu, Apr 4, 6:29 PM · Release-Engineering-Team, Scap, serviceops, Wikimedia-Incident
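
To illustrate the point about handling pending-rollback as well as pending-upgrade, here is a small sketch of such a check. It is not scap's code: the function name is hypothetical, and it assumes the input is JSON shaped like the output of `helm list -o json`, that is, a list of objects with `name`, `namespace`, and `status` fields.

```python
import json

# Helm release statuses that indicate an operation was interrupted midway.
PENDING_STATES = {"pending-install", "pending-upgrade", "pending-rollback"}


def find_pending_releases(helm_list_json):
    """Return (namespace, name, status) for every release in a pending state."""
    releases = json.loads(helm_list_json)
    return [
        (r["namespace"], r["name"], r["status"])
        for r in releases
        if r.get("status") in PENDING_STATES
    ]
```

A deployment tool could refuse to proceed, or offer a rollback, whenever this list is non-empty.
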
dancy added projects to T361720: Helm was left in limbo due to interrupted deployment/rollback : Scap, Release-Engineering-Team.
Thu, Apr 4, 6:29 PM · Release-Engineering-Team, Scap, serviceops, Wikimedia-Incident
dancy reopened T361720: Helm was left in limbo due to interrupted deployment/rollback , a subtask of T361706: 2024-04-03 calico/typha down, as Open.
Thu, Apr 4, 6:28 PM · Patch-For-Review, Prod-Kubernetes, Wikimedia-Incident
dancy added a comment to T328472: analytics/refinery: Stop using git-fat.

@Sfaci The issue with the analytics-refinery-update-jars-docker job should be resolved now. https://integration.wikimedia.org/ci/job/analytics-refinery-update-jars-docker/108/console has the output of a test run. Note that it created https://gerrit.wikimedia.org/r/c/analytics/refinery/+/1017074.

Thu, Apr 4, 6:08 PM · Patch-For-Review, git-lfs, Release-Engineering-Team (Now this 🫠), Data-Engineering, Data-Platform-SRE, Scap

Tue, Apr 2

dancy added a comment to T357612: Create a special-purpose Trusted Runner with Dockerfile frontend.

The Trusted Dockerfile Runner gitlab-runner2004 is available now. The first project which is allowed to use this runner is buildkit. I merged the change above to build the dockerfile frontend image also in CI, which should be a good test.

@dancy are you okay to push a new tag for buildkit to trigger the image build pipeline?

If this works as expected, the only missing steps are to update docs about the Dockerfile Runner and install a Dockerfile Runner in our test environment as well.

Tue, Apr 2, 2:58 PM · Patch-For-Review, GitLab (CI & Job Runners), collaboration-services
dancy renamed T361585: scap deploy-promote needs a timeout when waiting for CI from deploy promote needs a timeout when waiting for CI to scap deploy-promote needs a timeout when waiting for CI.
Tue, Apr 2, 2:35 PM · Scap, Release-Engineering-Team

Wed, Mar 27

dancy added a project to T360729: Ask for a commit/change summary in scap train if one not provided?: Release-Engineering-Team.
Wed, Mar 27, 2:43 PM · Release-Engineering-Team, Scap

Fri, Mar 22

dancy added a comment to T354441: 1.42.0-wmf.23 deployment blockers.

Hey all, ideally we would have caught this yesterday, but with the train delay we now have a quite serious bug impacting editors. Is anyone able to help me backport this configuration change? https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1013583

Fri, Mar 22, 5:47 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
dancy moved T317795: scap backport check prevents using it to fix an about-to-be-live branch from Done to In progress on the Release-Engineering-Team (Now this 🫠) board.
Fri, Mar 22, 5:15 PM · Patch-For-Review, Release-Engineering-Team (Now this 🫠), Scap
dancy moved T350628: Scap backporting a patch that gets a -2 hangs from Backlog to Done on the Release-Engineering-Team (Now this 🫠) board.
Fri, Mar 22, 5:14 PM · Release-Engineering-Team (Now this 🫠), Patch-For-Review, Scap
dancy moved T317795: scap backport check prevents using it to fix an about-to-be-live branch from In progress to Done on the Release-Engineering-Team (Now this 🫠) board.
Fri, Mar 22, 5:14 PM · Patch-For-Review, Release-Engineering-Team (Now this 🫠), Scap

Thu, Mar 21

dancy added a comment to T354441: 1.42.0-wmf.23 deployment blockers.

I'm rolling the train back to group1 while T360717 is being investigated.

Thu, Mar 21, 9:26 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments

Wed, Mar 20

dancy removed a subtask for T354441: 1.42.0-wmf.23 deployment blockers: T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color'.
Wed, Mar 20, 9:51 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
dancy removed a parent task for T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color': T354441: 1.42.0-wmf.23 deployment blockers.
Wed, Mar 20, 9:51 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy lowered the priority of T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color' from Unbreak Now! to Medium.
Wed, Mar 20, 9:51 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy added a comment to T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color'.

And rolling back the train did not fix the problem. :-(

Wed, Mar 20, 6:54 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy added a comment to T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color'.

And rolling back the train did not fix the problem. :-(

Wed, Mar 20, 6:50 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy updated the task description for T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color'.
Wed, Mar 20, 6:50 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy added a comment to T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color'.

Note that although I rolled 1.42.0-wmf.23 to group1, the errors are being reported against .22.

Wed, Mar 20, 6:50 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy added a subtask for T354441: 1.42.0-wmf.23 deployment blockers: T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color'.
Wed, Mar 20, 6:44 PM · Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
dancy added a parent task for T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color': T354441: 1.42.0-wmf.23 deployment blockers.
Wed, Mar 20, 6:44 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy triaged T360565: MediaWiki\Linter\MissingCategoryException: Cannot find id for 'night-mode-unaware-background-color' as Unbreak Now! priority.
Wed, Mar 20, 6:44 PM · Essential-Work, MW-1.42-notes (1.42.0-wmf.24; 2024-03-26), MediaWiki-extensions-Linter, Content-Transform-Team, Wikimedia-production-error
dancy attached a referenced file: F42861232: gerrtit1.png.
Wed, Mar 20, 5:13 PM · Gerrit (Gerrit 3.7)
dancy updated the task description for T360550: Gerrit 3.7.8: CI has completed checks. Reload the change view? RELOAD button doesn't work.
Wed, Mar 20, 5:13 PM · Gerrit (Gerrit 3.7)
dancy created T360550: Gerrit 3.7.8: CI has completed checks. Reload the change view? RELOAD button doesn't work.
Wed, Mar 20, 5:08 PM · Gerrit (Gerrit 3.7)

Tue, Mar 19

dancy added a comment to T360403: Helm deployment of MediaWiki now takes 6 minutes.

I used scap backport to deploy a mediawiki-config change today. The sync part of the operation took 15 minutes to complete. As an occasional user of scap, this felt "too long". That limits us to 4 backports per hour. I think under 10 minutes is a reasonable target. Anyway, here's a manual time profile of the operation:

sync-prod-k8s:                   06m 17s
php-fpm-restarts:                02m 51s (bm)
sync-canaries-k8s:               00m 54s
php-fpm-restarts (canaries):     00m 48s (bm)
check-testservers:               00m 39s
build-and-push-container-images: 00m 30s
sync-apaches:                    00m 30s (bm)
sync-testservers-k8s:            00m 27s
canary traffic wait:             00m 20s
sync-testservers:                00m 16s (bm)
sync-masters:                    00m 11s
sync-proxies:                    00m 07s (bm)

Items marked (bm) are bare-metal-only operations that should disappear eventually. That's 4m32s of time.

Tue, Mar 19, 7:21 PM · serviceops-radar, Release-Engineering-Team (Radar), MW-on-K8s
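
The arithmetic in the time profile above can be checked with a short script. This is only a sketch for verifying the numbers quoted in the comment; the parsing regex and helper names are assumptions made for illustration.

```python
import re

PROFILE = """\
sync-prod-k8s:                   06m 17s
php-fpm-restarts:                02m 51s (bm)
sync-canaries-k8s:               00m 54s
php-fpm-restarts (canaries):     00m 48s (bm)
check-testservers:               00m 39s
build-and-push-container-images: 00m 30s
sync-apaches:                    00m 30s (bm)
sync-testservers-k8s:            00m 27s
canary traffic wait:             00m 20s
sync-testservers:                00m 16s (bm)
sync-masters:                    00m 11s
sync-proxies:                    00m 07s (bm)
"""

# name, minutes, seconds, and an optional "(bm)" bare-metal marker
LINE_RE = re.compile(r"^(?P<name>.+?):\s+(?P<m>\d+)m (?P<s>\d+)s(?P<bm>\s+\(bm\))?$")


def parse_profile(text):
    """Return a list of (step name, seconds, is_bare_metal) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            seconds = int(match["m"]) * 60 + int(match["s"])
            rows.append((match["name"], seconds, match["bm"] is not None))
    return rows


def fmt(seconds):
    """Format a duration in seconds as e.g. 4m32s."""
    return f"{seconds // 60}m{seconds % 60:02d}s"


rows = parse_profile(PROFILE)
total = sum(seconds for _, seconds, _ in rows)
bare_metal = sum(seconds for _, seconds, bm in rows if bm)

print("listed steps:   ", fmt(total))               # 13m50s
print("bare-metal only:", fmt(bare_metal))          # 4m32s
print("remaining:      ", fmt(total - bare_metal))  # 9m18s
```

The bare-metal total matches the 4m32s quoted in the comment. The listed steps add up to 13m50s, so removing the bare-metal steps would leave about 9m18s of itemized time, just under the suggested 10-minute target.
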
dancy added a project to T360461: Update Integration project puppetmaster: Release-Engineering-Team.
Tue, Mar 19, 6:55 PM · Continuous-Integration-Infrastructure, Release-Engineering-Team, VPS-Projects, Puppet (Puppet 7.0), cloud-services-team
dancy updated subscribers of T360459: Update gitlab-runners project puppetmaster.

@Jelto @eoghan @Arnoldokoth Looks like your department.

Tue, Mar 19, 6:50 PM · collaboration-services, VPS-Projects, Puppet (Puppet 7.0), cloud-services-team
dancy moved T359643: Get rid of the /srv/mediawiki/php symbolic link from Waiting for review to In progress on the Release-Engineering-Team (Now this 🫠) board.
Tue, Mar 19, 3:13 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy added a comment to T352262: Review supporting deployment pipeline documentation.

Looks great!

Tue, Mar 19, 2:49 PM · Patch-For-Review, Release-Engineering-Team, Documentation, Tech-Docs-Team

Mon, Mar 18

dancy moved T359643: Get rid of the /srv/mediawiki/php symbolic link from Backlog to Waiting for review on the Release-Engineering-Team (Now this 🫠) board.
Mon, Mar 18, 7:55 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy edited projects for T359643: Get rid of the /srv/mediawiki/php symbolic link, added: Release-Engineering-Team (Now this 🫠); removed Release-Engineering-Team.
Mon, Mar 18, 7:53 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy updated subscribers of T357066: CirrusSearch\BuildDocument\BuildDocumentException: ParserOutput cannot be obtained..

This is the #1 error in logspam-watch right now. @Gehel would you be able to help with this?

Mon, Mar 18, 6:46 PM · Patch-For-Review, Discovery-Search (Current work), User-brennen, CirrusSearch, Wikimedia-production-error

Fri, Mar 15

dancy added a comment to T359643: Get rid of the /srv/mediawiki/php symbolic link.

For it to serve COPYING it would need to:

  1. hardcode a mime type in static.php, or define in MimeAnalyzer in MediaWiki core, that an extension-less file called COPYING is of type text/plain.
Fri, Mar 15, 3:56 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy closed T354787: gitlab-cloud-runner: Roll back pending helm releases before running terraform apply as Resolved.

I think we can consider this done.

Fri, Mar 15, 3:37 PM · Release-Engineering-Team (Now this 🫠), Patch-For-Review
dancy moved T354787: gitlab-cloud-runner: Roll back pending helm releases before running terraform apply from In progress to Done on the Release-Engineering-Team (Now this 🫠) board.
Fri, Mar 15, 3:36 PM · Release-Engineering-Team (Now this 🫠), Patch-For-Review

Thu, Mar 14

dancy created P58800 logstash_checker stream.
Thu, Mar 14, 7:42 PM
dancy added a comment to T357877: foreachwiki on beta does not include en_rtlwiki.
dancy@deployment-mwmaint02:~$ foreachwiki sql.php </dev/null | grep en_rtlwiki
en_rtlwiki
Thu, Mar 14, 5:34 PM · Release-Engineering-Team (Now this 🫠), Patch-For-Review, Beta-Cluster-Infrastructure
dancy closed T357877: foreachwiki on beta does not include en_rtlwiki as Resolved.
Thu, Mar 14, 5:31 PM · Release-Engineering-Team (Now this 🫠), Patch-For-Review, Beta-Cluster-Infrastructure
dancy moved T357877: foreachwiki on beta does not include en_rtlwiki from Backlog to Done on the Release-Engineering-Team (Now this 🫠) board.
Thu, Mar 14, 5:30 PM · Release-Engineering-Team (Now this 🫠), Patch-For-Review, Beta-Cluster-Infrastructure

Mar 13 2024

dancy moved T354439: 1.42.0-wmf.21 deployment blockers from Backlog to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 13 2024, 7:11 PM · Release-Engineering-Team (Now this 🫠), Release, Train Deployments
dancy moved T359899: CodeReviewBot missed 3 of 5 transactions associated with an MR from Backlog to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 13 2024, 7:11 PM · Release-Engineering-Team (Now this 🫠), GitLab
dancy added a comment to T345319: TypeError: Argument 1 passed to HtmlFormatter\HtmlFormatter::onHtmlReady() must be of the type string, null given, called in /srv/mediawiki/php-1.41.0-wmf.24/vendor/wikimedia/html-formatter/src/HtmlFormatter.php on line 314.

Thanks @hashar !!

Mar 13 2024, 5:18 PM · HtmlFormatter, MediaWiki-Parser, Discovery-Search, CirrusSearch, Wikimedia-production-error
dancy closed T359899: CodeReviewBot missed 3 of 5 transactions associated with an MR as Resolved.

Fix deployed.

Mar 13 2024, 2:44 PM · Release-Engineering-Team (Now this 🫠), GitLab
dancy added a project to T359899: CodeReviewBot missed 3 of 5 transactions associated with an MR: Release-Engineering-Team (Now this 🫠).
Mar 13 2024, 2:44 PM · Release-Engineering-Team (Now this 🫠), GitLab

Mar 12 2024

dancy added a comment to T338317: Python torch fills disk of CI Jenkins instances.

We could configure buildkit gc rules for the Docker daemon: https://docs.docker.com/build/cache/garbage-collection/

Mar 12 2024, 3:31 PM · Machine-Learning-Team, Continuous-Integration-Infrastructure, Release-Engineering-Team
dancy added a comment to T359899: CodeReviewBot missed 3 of 5 transactions associated with an MR.

Taking a look...

Mar 12 2024, 3:15 PM · Release-Engineering-Team (Now this 🫠), GitLab

Mar 11 2024

dancy added a comment to T316877: wikimedia/discovery/analytics: replace git-fat with git-lfs.

I tried to move this forward today but my change https://gerrit.wikimedia.org/r/c/wikimedia/discovery/analytics/+/1010297 was rejected by CI, saying that the repo has been archived. Does that mean this ticket can be closed?

hrm, repo is still active in gerrit, but I see the commit that moved it to archived in integration/config is T346176: Archive wikimedia/discovery/analytics so sounds like we just need to archive this in gerrit, too, so folks can't push.

Mar 11 2024, 7:50 PM · git-lfs, Release-Engineering-Team (Priority Backlog 📥), Discovery-Search, Scap
dancy moved T359661: gitlab-cloud-runners k8s cluster bootstrapping problem from In progress to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 11 2024, 7:35 PM · Release-Engineering-Team (Now this 🫠)
dancy closed T359661: gitlab-cloud-runners k8s cluster bootstrapping problem as Resolved.
Mar 11 2024, 7:35 PM · Release-Engineering-Team (Now this 🫠)
dancy added a comment to T316877: wikimedia/discovery/analytics: replace git-fat with git-lfs.

I tried to move this forward today but my change https://gerrit.wikimedia.org/r/c/wikimedia/discovery/analytics/+/1010297 was rejected by CI, saying that the repo has been archived. Does that mean this ticket can be closed?

Mar 11 2024, 7:18 PM · git-lfs, Release-Engineering-Team (Priority Backlog 📥), Discovery-Search, Scap
dancy added a comment to T316876: wdqs: replace git-fat with git-lfs.

Hello followers of this ticket! Can someone tell me if this ticket is still relevant? If so, what git repository is the one needing migration from git-fat to git-lfs?

Mar 11 2024, 6:47 PM · git-lfs, Data-Platform-SRE, Release-Engineering-Team (Priority Backlog 📥), Wikidata, Wikidata-Query-Service, Scap
dancy moved T359661: gitlab-cloud-runners k8s cluster bootstrapping problem from Backlog to In progress on the Release-Engineering-Team (Now this 🫠) board.
Mar 11 2024, 5:40 PM · Release-Engineering-Team (Now this 🫠)
dancy added a comment to T359809: Patch testing in releases Jenkins: prevent false positives.

Interesting problem. I'm very curious about how to properly deal w/ submodule removal. Ideally scap prep auto should handle this situation.

Mar 11 2024, 2:55 PM · Continuous-Integration-Infrastructure, Jenkins, Release-Engineering-Team

Mar 8 2024

dancy added a comment to T359661: gitlab-cloud-runners k8s cluster bootstrapping problem.

Need to retest staging cluster destroy/rebuild.

Mar 8 2024, 10:38 PM · Release-Engineering-Team (Now this 🫠)
dancy changed the status of T359661: gitlab-cloud-runners k8s cluster bootstrapping problem from Open to In Progress.
Mar 8 2024, 10:38 PM · Release-Engineering-Team (Now this 🫠)
dancy moved T359594: Upgrade production gitlab-cloud-runners to kubernetes 1.29 from Backlog to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 8 2024, 10:37 PM · Release-Engineering-Team (Now this 🫠), GitLab (CI & Job Runners)
dancy added a comment to T359643: Get rid of the /srv/mediawiki/php symbolic link.

@Krinkle Would it be feasible to bring COPYING and CREDITS under the care of /w/static.php (or something)?

Mar 8 2024, 8:42 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy updated the task description for T359661: gitlab-cloud-runners k8s cluster bootstrapping problem.
Mar 8 2024, 7:07 PM · Release-Engineering-Team (Now this 🫠)
dancy added a project to T359661: gitlab-cloud-runners k8s cluster bootstrapping problem: Release-Engineering-Team.
Mar 8 2024, 7:00 PM · Release-Engineering-Team (Now this 🫠)
dancy created T359661: gitlab-cloud-runners k8s cluster bootstrapping problem.
Mar 8 2024, 6:59 PM · Release-Engineering-Team (Now this 🫠)
dancy created P58695 Staging cloud runners k8s cluster failure.
Mar 8 2024, 6:58 PM
dancy updated subscribers of T359643: Get rid of the /srv/mediawiki/php symbolic link.

@Krinkle I would like your input on this please.

Mar 8 2024, 4:16 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy created T359643: Get rid of the /srv/mediawiki/php symbolic link.
Mar 8 2024, 4:15 PM · Patch-For-Review, MW-1.42-notes (1.42.0-wmf.22; 2024-03-12), Release-Engineering-Team (Now this 🫠), MediaWiki-libs-Mime, MediaWiki-Platform-Team (Radar), Scap
dancy added a comment to T350065: Notify MediaWiki security tasks as soon as an uploaded patch fails to apply.

This is brilliant, thank you for this!

Mar 8 2024, 3:47 PM · Release-Engineering-Team (Now this 🫠), SecTeam-Processed, Security-Team

Mar 7 2024

dancy added a comment to T359594: Upgrade production gitlab-cloud-runners to kubernetes 1.29.

Make sure https://gitlab.wikimedia.org/repos/releng/scap/-/pipelines/44180/builds completes successfully first. This is a test of the scap pipeline in the staging cluster.

Mar 7 2024, 10:19 PM · Release-Engineering-Team (Now this 🫠), GitLab (CI & Job Runners)
dancy added a comment to T359594: Upgrade production gitlab-cloud-runners to kubernetes 1.29.

Make sure https://gitlab.wikimedia.org/repos/releng/scap/-/pipelines/44180/builds completes successfully first. This is a test of the scap pipeline in the staging cluster.

Mar 7 2024, 9:52 PM · Release-Engineering-Team (Now this 🫠), GitLab (CI & Job Runners)
dancy triaged T359594: Upgrade production gitlab-cloud-runners to kubernetes 1.29 as High priority.
Mar 7 2024, 9:14 PM · Release-Engineering-Team (Now this 🫠), GitLab (CI & Job Runners)
dancy created T359594: Upgrade production gitlab-cloud-runners to kubernetes 1.29.
Mar 7 2024, 9:14 PM · Release-Engineering-Team (Now this 🫠), GitLab (CI & Job Runners)
dancy placed T357739: Package logstash-logback-encoder for Debian up for grabs.
Mar 7 2024, 9:08 PM · Release-Engineering-Team (Now this 🫠), Cassandra
dancy closed T357739: Package logstash-logback-encoder for Debian as Resolved.
Mar 7 2024, 8:59 PM · Release-Engineering-Team (Now this 🫠), Cassandra
dancy moved T357739: Package logstash-logback-encoder for Debian from Waiting for review to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 7 2024, 8:58 PM · Release-Engineering-Team (Now this 🫠), Cassandra
dancy created P58655 logstash-logback-encoder deploy.
Mar 7 2024, 8:54 PM
dancy closed T358117: Adapt scap's testing strategy to mw-on-k8s as Resolved.

Changes deployed and tested. Resolving this task.

Mar 7 2024, 5:31 PM · Release-Engineering-Team (Now this 🫠), Scap, SRE, serviceops, MW-on-K8s
dancy moved T357572: scap install fails on new Phabricator/Phorge host due to missing user from Backlog to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 7 2024, 5:30 PM · User-brennen, Release-Engineering-Team (Now this 🫠), collaboration-services, Scap
dancy moved T358117: Adapt scap's testing strategy to mw-on-k8s from In progress to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 7 2024, 5:29 PM · Release-Engineering-Team (Now this 🫠), Scap, SRE, serviceops, MW-on-K8s
dancy closed T358117: Adapt scap's testing strategy to mw-on-k8s, a subtask of T357402: Scap should check errors coming from mw-on-k8s canaries during deployments, as Resolved.
Mar 7 2024, 5:29 PM · Release-Engineering-Team (Now this 🫠), Scap, SRE, serviceops, MW-on-K8s
dancy added a comment to T239376: Run Swagger checks in Scap before exposing to prod MW traffic .

Thanks @dancy. Does the below match your understanding?

  • Swagger checks have been removed in favour of httpbb checks.
Mar 7 2024, 5:28 PM · Release-Engineering-Team (Seen), Scap
dancy closed T239376: Run Swagger checks in Scap before exposing to prod MW traffic as Resolved.

@Krinkle I think the work on T358117 satisfies the goals stated here so I'm resolving this ticket.

Mar 7 2024, 4:59 PM · Release-Engineering-Team (Seen), Scap

Mar 6 2024

dancy created P58603 make systemtest failure with https://gerrit.wikimedia.org/r/c/integration/pipelinelib/+/1009348.
Mar 6 2024, 9:33 PM
dancy created P58602 make systemtest hang with https://gerrit.wikimedia.org/r/c/integration/pipelinelib/+/1009347.
Mar 6 2024, 9:27 PM

Mar 5 2024

dancy added a comment to T359114: Slow and failed deployments.

Thanks for the help everyone.

Mar 5 2024, 4:09 PM · serviceops, MW-on-K8s
dancy added a comment to T359114: Slow and failed deployments.

I have mediawiki deployments locked until we know what's going on. You can unlock by killing my scap process on deploy2002. It is pid 3272.

Mar 5 2024, 12:36 AM · serviceops, MW-on-K8s
dancy added a comment to T358117: Adapt scap's testing strategy to mw-on-k8s.

@Clement_Goubert We have some questions:

  1. Does mwdebug.discovery.wmnet resolve to a random bare-metal/k8s target?
  2. Do you anticipate a case where you'd want to check only k8s testservers (e.g., when using scap sync-world --k8s-only), or only bare metal testservers? If so, what's the best way to achieve this?
Mar 5 2024, 12:25 AM · Release-Engineering-Team (Now this 🫠), Scap, SRE, serviceops, MW-on-K8s
dancy raised the priority of T359114: Slow and failed deployments from High to Unbreak Now!.
Mar 5 2024, 12:10 AM · serviceops, MW-on-K8s

Mar 4 2024

dancy triaged T359114: Slow and failed deployments as High priority.
Mar 4 2024, 11:41 PM · serviceops, MW-on-K8s
dancy updated subscribers of T359114: Slow and failed deployments.

@Clement_Goubert Looks like this wasn't just a one-off problem.

Mar 4 2024, 11:36 PM · serviceops, MW-on-K8s
dancy moved T354438: 1.42.0-wmf.20 deployment blockers from In progress to Done on the Release-Engineering-Team (Now this 🫠) board.
Mar 4 2024, 8:04 PM · Release-Engineering-Team (Now this 🫠), Release, Train Deployments
dancy moved T317795: scap backport check prevents using it to fix an about-to-be-live branch from Backlog to In progress on the Release-Engineering-Team (Now this 🫠) board.
Mar 4 2024, 8:03 PM · Patch-For-Review, Release-Engineering-Team (Now this 🫠), Scap