Cannot deploy function-orchestrator in staging environment due to insufficient quotas
Closed, Resolved, Public

Description

The name of the appArmor field has apparently changed. Error log:

ERROR:
  exit status 1

EXIT STATUS
  1

STDERR:
  W0813 14:10:13.216701 2851566 warnings.go:70] spec.template.metadata.annotations[container.apparmor.security.beta.kubernetes.io/function-orchestrator-main-orchestrator]: deprecated since v1.30; use the "appArmorProfile" field instead
  W0813 14:20:14.368859 2851566 warnings.go:70] spec.template.metadata.annotations[container.apparmor.security.beta.kubernetes.io/function-orchestrator-main-orchestrator]: deprecated since v1.30; use the "appArmorProfile" field instead
  Error: UPGRADE FAILED: release main-orchestrator failed, and has been rolled back due to atomic being set: context deadline exceeded

COMBINED OUTPUT:
  W0813 14:10:13.216701 2851566 warnings.go:70] spec.template.metadata.annotations[container.apparmor.security.beta.kubernetes.io/function-orchestrator-main-orchestrator]: deprecated since v1.30; use the "appArmorProfile" field instead
  W0813 14:20:14.368859 2851566 warnings.go:70] spec.template.metadata.annotations[container.apparmor.security.beta.kubernetes.io/function-orchestrator-main-orchestrator]: deprecated since v1.30; use the "appArmorProfile" field instead
  Error: UPGRADE FAILED: release main-orchestrator failed, and has been rolled back due to atomic being set: context deadline exceeded

Event Timeline

Those are warnings (note the W prefix). They wouldn't stop the deployment from happening.

The actual reason is this: https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-k8s-1-7.0.0-1-2025.08.13?id=8Q7Uo5gBgiE0yhV9mhEm

Pasting for convenience:

(combined from similar events): Error creating: pods "function-orchestrator-main-orchestrator-df5fdb7c9-dmrg5" is forbidden: exceeded quota: quota-compute-resources, requested: limits.memory=4172Mi, used: limits.memory=7220Mi, limited: limits.memory=10Gi

Simply put, the namespace in staging isn't provisioned for pods that are this large. Resources in staging are scarce. We can probably bump the quotas up a bit, but probably just enough to allow this to run.
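To make the numbers in the quota error concrete, here is a minimal sketch of the arithmetic. The figures are copied from the event above; the helper name `quota_check` is hypothetical, and treating the bumped quota as 20Gi is an assumption.

```python
# Figures copied from the quota error above, all expressed in MiB.
REQUESTED_MIB = 4172      # limits.memory requested by the new pod
USED_MIB = 7220           # limits.memory already counted against the namespace
LIMIT_MIB = 10 * 1024     # quota-compute-resources limit of 10Gi


def quota_check(requested, used, limit):
    """Return (fits, overshoot) where overshoot is how far past the limit
    the namespace would go; a negative overshoot means headroom is left."""
    total = requested + used
    return total <= limit, total - limit


if __name__ == "__main__":
    # 4172 + 7220 = 11392 MiB, which is 1152 MiB over the 10240 MiB limit.
    print(quota_check(REQUESTED_MIB, USED_MIB, LIMIT_MIB))
    # Same request against a 20Gi quota (assumed value for a bumped quota).
    print(quota_check(REQUESTED_MIB, USED_MIB, 20 * 1024))
```

The new pod is rejected because the sum of requested and already-used limits exceeds the quota, not because the pod alone is larger than the quota.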

akosiaris renamed this task from Cannot deploy function-orchestrator due to deprecated appArmor field to Cannot deploy function-orchestrator in staging environment due to insufficient quotas. Aug 13 2025, 2:54 PM

Change #1178566 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] staging: Bump wikifunctions quotas

https://gerrit.wikimedia.org/r/1178566

Aha, we can pull the memory for staging back down. I suppose the issue is the switch-over with the new pods and the old ones, which is why the change to increase the memory itself deployed fine, but the next deploy failed?

Change #1178566 merged by jenkins-bot:

[operations/deployment-charts@master] staging: Bump wikifunctions quotas

https://gerrit.wikimedia.org/r/1178566

Change #1178571 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/deployment-charts@master] wikifunctions: Pull down the memory limits for staging instances

https://gerrit.wikimedia.org/r/1178571

Aha, we can pull the memory for staging back down.

That's probably not needed. I bumped the quotas from 10G to 20G which should provide enough space for now.

I suppose the issue is the switch-over with the new pods and the old ones, which is why the change to increase the memory itself deployed fine, but the next deploy failed?

Yes. The deployment will start the new pods first, make sure they are ready, and only then kill the old ones. In this case that means double the usage (because it's 1+1 pod). Usually the surge is set to 25% (except MediaWiki, which is in the very low single digits due to sheer size).
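A rough sketch of why a single-replica deployment doubles its footprint during a deploy. It assumes a Kubernetes Deployment using the RollingUpdate strategy, where a percentage `maxSurge` is rounded up to a whole number of pods. The function name `peak_pods` is hypothetical, and the chart's actual strategy values are not shown in this task.

```python
import math


def peak_pods(replicas, max_surge_percent):
    """Maximum number of pods that can exist at once during a rolling update.

    Assumes maxSurge is given as a percentage of the desired replica count
    and is rounded up, as Kubernetes does for percentage values.
    """
    surge = math.ceil(replicas * max_surge_percent / 100)
    return replicas + surge


if __name__ == "__main__":
    # One replica with a 25% surge still allows one whole extra pod.
    pods = peak_pods(1, 25)
    print(pods)
    # Memory limit counted against the quota at the peak, using the
    # 4172Mi per-pod figure from the quota error earlier in this task.
    print(pods * 4172)
```

With more replicas the relative overhead shrinks: under the same assumptions, 8 replicas at 25% peak at 10 pods, not 16.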

Patches deployed, you should be good to retry @cmassaro

Thank you!

Change #1178571 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Pull down the memory limits for staging instances

https://gerrit.wikimedia.org/r/1178571

Change #1178817 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] wikifunctions: Bump staging quota to 20G

https://gerrit.wikimedia.org/r/1178817

Change #1178821 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] admin: Brown paper bag fix for wikifunctions

https://gerrit.wikimedia.org/r/1178821

Change #1178821 abandoned by Clément Goubert:

[operations/deployment-charts@master] admin: Brown paper bag fix for wikifunctions

Reason:

Done in https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1178817

https://gerrit.wikimedia.org/r/1178821

Change #1178817 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Bump staging quota to 20G

https://gerrit.wikimedia.org/r/1178817

Thanks @claime. Stupid typo on my side.

Jdforrester-WMF assigned this task to akosiaris.

Confirmed this is now fixed via a deploy, thank you!