[builds-builder,tekton] upgrade taskruns to v1 from v1beta1 in storage and delete v1beta1 from stored versions
Closed, Resolved (Public)

Description

This removes the need to convert the resources, which should considerably speed up queries for them.

See T376710 ([builds-builder,builds-api] After the upgrade build actions take >10s) for how we partially did this for pipelineruns.

Details

Related Changes in GitLab:
Title: builds-builder: add script to upgrade taskruns
Reference: repos/cloud/toolforge/toolforge-deploy!1010
Author: dcaro
Source Branch: migrate_taskruns
Dest Branch: main

Event Timeline

dcaro triaged this task as High priority.
dcaro changed the task status from Open to In Progress. (Oct 27 2025, 10:21 AM)
dcaro claimed this task.
dcaro moved this task from Next Up to In Progress on the Toolforge (Toolforge iteration 24) board.

Updating the stored version considerably reduced the time it takes to get a taskrun:

### Before the upgrade it took ~14s to get a single taskrun:
root@tools-k8s-control-9:~/toolforge-deploy# time kubectl get taskruns -n image-build -l 'tekton.dev/pipelineRun=ac2wd-buildpacks-pipelinerun-dv9z2'
NAME                                                SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
ac2wd-buildpacks-pipelinerun-dv9z2-build-from-git   True        Succeeded   6d2h        6d2h

real    0m4.471s
user    0m0.111s
sys     0m0.049s


### Now it takes ~4s:
root@tools-k8s-control-9:~/toolforge-deploy# time kubectl get taskruns -n image-build -l 'tekton.dev/pipelineRun=ac2wd-buildpacks-pipelinerun-dv9z2'
NAME                                                SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
ac2wd-buildpacks-pipelinerun-dv9z2-build-from-git   True        Succeeded   6d2h        6d2h

real    0m4.471s
user    0m0.111s
sys     0m0.049s

But that's still not awesome, will investigate how to deactivate the webhook completely.

Mentioned in SAL (#wikimedia-cloud) [2025-10-27T11:10:52Z] <dcaro> removing taskruns/pipelineruns v1beta1 version from the stored list in the crds (T408127)

Patched toolsbeta, no clear difference, running tests now:

root@toolsbeta-test-k8s-control-11:~# kubectl patch customresourcedefinitions pipelineruns.tekton.dev --subresource='status' --type='merge' -p '{"status":{"storedVersions":["v1"]}}'
root@toolsbeta-test-k8s-control-11:~# kubectl patch customresourcedefinitions taskruns.tekton.dev --subresource='status' --type='merge' -p '{"status":{"storedVersions":["v1"]}}'
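For reference, a minimal Python sketch of the check behind those two patches: given a CRD object (as returned by `kubectl get crd ... -o json`), it reports which entries in `status.storedVersions` are no longer the storage version, and builds the same merge-patch body used above. The function names and the example CRD contents (including the `served` flags) are assumptions for illustration, not taken from the cluster. Trimming `storedVersions` is only safe once every existing object has been rewritten in the current storage version.

```python
from __future__ import annotations

import json
from typing import Any


def storage_version(crd: dict[str, Any]) -> str:
    """Return the name of the version marked `storage: true` in the CRD spec."""
    names = [v["name"] for v in crd["spec"]["versions"] if v.get("storage")]
    if len(names) != 1:
        raise ValueError(f"expected exactly one storage version, got {names}")
    return names[0]


def stale_stored_versions(crd: dict[str, Any]) -> list[str]:
    """Return the stored versions that differ from the current storage version."""
    current = storage_version(crd)
    stored = (crd.get("status") or {}).get("storedVersions") or []
    return [version for version in stored if version != current]


def stored_versions_patch(crd: dict[str, Any]) -> str:
    """Build a merge-patch body that keeps only the current storage version."""
    return json.dumps({"status": {"storedVersions": [storage_version(crd)]}})


# Hypothetical, trimmed-down CRD; the real one has many more fields.
example_crd = {
    "metadata": {"name": "taskruns.tekton.dev"},
    "spec": {
        "versions": [
            {"name": "v1beta1", "served": True, "storage": False},
            {"name": "v1", "served": True, "storage": True},
        ]
    },
    "status": {"storedVersions": ["v1beta1", "v1"]},
}

print(stale_stored_versions(example_crd))
print(stored_versions_patch(example_crd))
```

An empty list from `stale_stored_versions` means there is nothing left to trim for that CRD.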

Mentioned in SAL (#wikimedia-cloud) [2025-10-27T11:16:45Z] <dcaro> removing taskruns/pipelineruns v1beta1 version from the stored list in the crds (T408127)

Deployed in tools, no big difference either :/, running tests:

root@tools-k8s-control-9:~# time kubectl get taskruns -n image-build -l 'tekton.dev/pipelineRun=wm-lol-buildpacks-pipelinerun-7hxrc' 
NAME                                                 SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
wm-lol-buildpacks-pipelinerun-7hxrc-build-from-git   True        Succeeded   61d         61d

real    0m4.579s
user    0m0.110s
sys     0m0.047s

Probably the bottleneck is not the webhook anymore

Completely disabling the webhook in the CRD (setting strategy: None) did not change anything in toolsbeta, so I won't do it in tools either, as the webhook configuration is what comes from upstream.

Looking at something else.

Getting a single taskrun by name is pretty fast though (<0.2 seconds in prod, vs 4s when filtering with label), so we can modify https://gitlab.wikimedia.org/repos/cloud/toolforge/builds-api/-/commit/895f85567160638b70b76656bc719bd0e054ac59#0cfd90216a04e25257d891d68ee18c147f7f0d8c_220_220 to extract the taskrun from the pipelinerun status.childReferences:

root@tools-k8s-control-9:~# time kubectl get pipelineruns -n image-build wm-lol-buildpacks-pipelinerun-7hxrc -o yaml | grep -C 10 wm-lol-buildpacks-pipelinerun-7hxrc-build-from-git
  - emptyDir: {}
    name: source-ws
  - emptyDir: {}
    name: cache-ws
  - emptyDir: {}
    name: aptbuildpack-ws
status:
  childReferences:
  - apiVersion: tekton.dev/v1
    kind: TaskRun
    name: wm-lol-buildpacks-pipelinerun-7hxrc-build-from-git
    pipelineTaskName: build-from-git
  completionTime: "2025-08-26T13:45:25Z"
  conditions:
  - lastTransitionTime: "2025-08-26T13:45:25Z"
    message: 'Tasks Completed: 1 (Failed: 0, Cancelled 0), Skipped: 0'
    reason: Succeeded
    status: "True"
    type: Succeeded
  pipelineSpec:
    description: The Buildpacks pipeline builds source from a Git repository into

real    0m0.124s
user    0m0.134s
sys     0m0.025s

Instead of querying the taskruns by label.
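
A minimal sketch of that extraction in Python, assuming the pipelinerun has already been fetched by name and is available as a plain dict. The function name is hypothetical and this is not the actual builds-api code; it only illustrates reading the taskrun names from `status.childReferences` so that each taskrun can then be fetched by name.

```python
from __future__ import annotations

from typing import Any


def taskrun_names_from_pipelinerun(pipelinerun: dict[str, Any]) -> list[str]:
    """Return the names of the TaskRuns listed in status.childReferences."""
    # status or childReferences may be missing on a pipelinerun that has not
    # started yet, so fall back to empty containers instead of raising.
    child_refs = (pipelinerun.get("status") or {}).get("childReferences") or []
    return [
        ref["name"]
        for ref in child_refs
        if ref.get("kind") == "TaskRun" and "name" in ref
    ]


# Trimmed-down version of the pipelinerun output shown above.
example_pipelinerun = {
    "metadata": {"name": "wm-lol-buildpacks-pipelinerun-7hxrc"},
    "status": {
        "childReferences": [
            {
                "apiVersion": "tekton.dev/v1",
                "kind": "TaskRun",
                "name": "wm-lol-buildpacks-pipelinerun-7hxrc-build-from-git",
                "pipelineTaskName": "build-from-git",
            }
        ]
    },
}

print(taskrun_names_from_pipelinerun(example_pipelinerun))
```

Filtering on `kind` keeps the helper from returning other child types if a pipeline ever references something other than a TaskRun.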