`toolforge-jobs run` fails with 403 error
Closed, Duplicate | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

tools.mix-n-match@tools-sgebastion-10:~$ toolforge-jobs run --image tf-bullseye-std --mem 200Mi --continuous --command '/data/project/mix-n-match/mixnmatch_rs/run.sh' rustbot

What happens?:

ERROR: unable to create job: "HTTP 403: likely an internal bug: 403 Client Error: Forbidden for url: https://k8s.tools.eqiad1.wikimedia.cloud:6443/apis/apps/v1/namespaces/tool-mix-n-match/deployments. k8s JSON:

{
  "kind": "Deployment",
  "apiVersion": "apps/v1",
  "metadata": {
    "name": "rustbot",
    "namespace": "tool-mix-n-match",
    "labels": {
      "toolforge": "tool",
      "app.kubernetes.io/version": "1",
      "app.kubernetes.io/managed-by": "toolforge-jobs-framework",
      "app.kubernetes.io/created-by": "mix-n-match",
      "app.kubernetes.io/component": "deployments",
      "app.kubernetes.io/name": "rustbot",
      "jobs.toolforge.org/filelog": "yes",
      "jobs.toolforge.org/emails": "none"
    }
  },
  "spec": {
    "template": {
      "metadata": {
        "labels": {
          "toolforge": "tool",
          "app.kubernetes.io/version": "1",
          "app.kubernetes.io/managed-by": "toolforge-jobs-framework",
          "app.kubernetes.io/created-by": "mix-n-match",
          "app.kubernetes.io/component": "deployments",
          "app.kubernetes.io/name": "rustbot",
          "jobs.toolforge.org/filelog": "yes",
          "jobs.toolforge.org/emails": "none"
        }
      },
      "spec": {
        "restartPolicy": "Always",
        "containers": [
          {
            "name": "rustbot",
            "image": "docker-registry.tools.wmflabs.org/toolforge-bullseye-standalone:latest",
            "workingDir": "/data/project/mix-n-match",
            "command": [
              "/bin/sh",
              "-c",
              "--",
              "/data/project/mix-n-match/mixnmatch_rs/run.sh 1>>rustbot.out 2>>rustbot.err"
            ],
            "resources": {
              "limits": {
                "memory": "200Mi"
              },
              "requests": {
                "memory": "200Mi"
              }
            }
          }
        ]
      }
    },
    "replicas": 1,
    "selector": {
      "matchLabels": {
        "toolforge": "tool",
        "app.kubernetes.io/version": "1",
        "app.kubernetes.io/managed-by": "toolforge-jobs-framework",
        "app.kubernetes.io/created-by": "mix-n-match",
        "app.kubernetes.io/component": "deployments",
        "app.kubernetes.io/name": "rustbot",
        "jobs.toolforge.org/filelog": "yes",
        "jobs.toolforge.org/emails": "none"
      }
    }
  }
}

What should have happened instead?:

Start k8s job

Event Timeline

The command is a thin bash wrapper around a Rust binary, compiled on Toolforge. The binary starts just fine when run manually in a shell.

I have other k8s jobs scheduled/running for this tool, and they all work fine.

--wait instead of --continuous seems to work?

The --continuous option creates a k8s deployment, and there is a quota of 3 deployments per tool. I see that you already have 3 deployments running, which might be the reason for the 403. You can request a quota increase by filing a ticket against Toolforge (Quota-requests).
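
To check whether the quota is the likely cause before retrying, the number of existing deployments can be compared with the limit. The following is only a minimal sketch: it assumes the JSON list output of `kubectl get deployments -o json` is available as a string, it hard-codes the limit of 3 quoted above rather than reading the real quota from the cluster, and the function name `remaining_deployments` is hypothetical.

```python
import json

# Per-tool deployment limit as quoted in this thread (an assumption here;
# the value actually enforced in the namespace may differ).
DEPLOYMENT_QUOTA = 3


def remaining_deployments(kubectl_json: str, quota: int = DEPLOYMENT_QUOTA) -> int:
    """Return how many more deployments fit under the quota (never below 0).

    kubectl_json is the text printed by `kubectl get deployments -o json`,
    which is a list object whose "items" array holds one entry per deployment.
    """
    items = json.loads(kubectl_json).get("items", [])
    return max(quota - len(items), 0)
```

If the result is 0, another `--continuous` job would presumably be rejected until an existing deployment is deleted or the quota is raised.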

Ouch. That's certainly an unhelpful error message that should be fixed.