[components-api] add order to the components deployment
Open, High, Public

Description

There are no new endpoints, but there are some modifications to the existing ones.

Extracting the current state of the deployment

To create a deployment plan, we have to extract the current state of the deployment (from builds/jobs-api) and compare it against the desired one.

This means that we might want to fully model a 'tool status', so this task includes that exercise.
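A minimal sketch of what the comparison could look like, assuming a simplified state model. ComponentState, diff_states and the toolhunt-old component are hypothetical names made up for illustration; the real 'tool status' model and the data returned by builds/jobs-api are still to be defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentState:
    """Hypothetical, simplified view of one component's state."""

    type: str          # "continuous", "scheduled" or "one-off"
    source: str        # repository URL, or the component whose build is reused
    command: str = ""  # run command, if any


def diff_states(current, desired):
    """Group component names by the action needed to reach `desired`."""
    return {
        "create": sorted(desired.keys() - current.keys()),
        "delete": sorted(current.keys() - desired.keys()),
        "update": sorted(
            name
            for name in desired.keys() & current.keys()
            if desired[name] != current[name]
        ),
    }


# Assumed current state: toolhunt-old is an invented leftover component.
current = {
    "toolhunt-api": ComponentState(
        "continuous", "https://github.com/wikimedia/toolhunt"
    ),
    "toolhunt-old": ComponentState("one-off", "toolhunt-api", "legacy"),
}
# Desired state, loosely based on the example config in this task.
desired = {
    "toolhunt-api": ComponentState(
        "continuous", "https://github.com/wikimedia/toolhunt"
    ),
    "toolhunt-flower": ComponentState("continuous", "toolhunt-api", "flower"),
}

changes = diff_states(current, desired)
print(changes)
```

Only the components listed under create, update or delete would need tasks in the deployment plan; unchanged ones (toolhunt-api here) could be skipped.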

DAG pipeline resolution

When we get a new deployment request, the /tool/<toolname>/deploy endpoint has to:

  • Diff the intended state with the current state
  • Get the sorted list of tasks to execute

There is a small POC of that resolution algorithm here https://gitlab.wikimedia.org/sstefanova/toolforge-toolconfig-poc (by Slavina Stefanova)
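For illustration only (this is not the algorithm from the POC above), the ordering step could be sketched with Python's standard graphlib module. The wait_for mapping is copied from the example config in this task; deployment_batches is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# component -> components it waits for (from the example config in this task)
wait_for = {
    "toolhunt-cleanup": ["toolhunt-api"],
    "toolhunt-api": ["toolhunt-migratedb"],
    "toolhunt-migratedb": [],
    "toolhunt-flower": ["toolhunt-api"],
}


def deployment_batches(wait_for):
    """Return batches of components; each batch only waits on earlier ones."""
    sorter = TopologicalSorter(wait_for)
    sorter.prepare()  # raises graphlib.CycleError on a circular wait-for
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything deployable right now
        batches.append(ready)
        sorter.done(*ready)  # unblock whatever was waiting on this batch
    return batches


for number, batch in enumerate(deployment_batches(wait_for), start=1):
    print(number, batch)
```

Components in the same batch do not wait on each other, so they could be deployed in parallel if the pipeline runner allows it.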

Pipeline execution

See the decision request on how to implement the asynchronous pipeline running: T362224: Decision request - What to use for toolforge components api task execution

Showing the expected deployment plan

We could implement an option that shows the deployment plan without acting on it, which would help with debugging.

That could be done with something like toolforge components deploy --only-plan.

If that helps, then we will have to add that option to the deploy endpoint.
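A rough sketch of how a plan-only flag could work on the API side, assuming the plan is already a sorted list of task descriptions. The deploy function, its run_task callback and the task strings are hypothetical; only the --only-plan idea comes from this task.

```python
def deploy(plan, run_task, only_plan=False):
    """Report each step of `plan`; execute it via `run_task` unless only_plan."""
    lines = []
    for step in plan:
        if only_plan:
            # Dry run: describe the step without touching anything.
            lines.append(f"[plan] {step}")
        else:
            run_task(step)
            lines.append(f"[done] {step}")
    return lines


executed = []  # stands in for the real task runner in this sketch
plan = [
    "build toolhunt-api",
    "deploy toolhunt-migratedb",
    "deploy toolhunt-api",
]

preview = deploy(plan, executed.append, only_plan=True)
print("\n".join(preview))
```

Sharing one code path between the real run and the preview keeps the displayed plan from drifting away from what actually gets executed.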

Example config to be implemented

The key new data bits are:

  • type -> specifies the component type: continuous, scheduled or one-off
  • run[*] -> adds all the options for scheduled jobs and continuous jobs
  • wait-for -> to define the order of deployment

This might have changed (depends on how the previous tasks end), but you should be able to do something like:

$ cat mytool.yaml
config-version: 0.1

components:
 toolhunt-cleanup:
   description: cleanup job
   type: scheduled
   build:
     repository: https://github.com/wikimedia/toolhunt-cleanup
   wait-for:
     - toolhunt-api
   run:
     command: cleanup
     schedule: "00 * * * *"

 toolhunt-api:
   description: a python-flask api
   type: continuous
   build:
     repository: https://github.com/wikimedia/toolhunt
   wait-for:
     - toolhunt-migratedb

 toolhunt-migratedb:
   description: run db migrations
   type: one-off
   build:
     reuse-from: toolhunt-api
   run:
     command: migratedb

 toolhunt-flower:
   description: dashboard for monitoring and managing celery
   type: continuous
   build:
     reuse-from: toolhunt-api
   run:
     command: flower
   wait-for:
     - toolhunt-api


$ toolforge component deploy mytool.yaml
Starting deploy for toolhunt v1.0.0...
Building images:
   [✅] toolhunt-cleanup
   [✅] toolhunt-api
Deploying components:
   [✅] toolhunt-migratedb
   [✅] toolhunt-api (waiting for toolhunt-migratedb)
   [✅] toolhunt-cleanup (waiting for toolhunt-api)
   [✅] toolhunt-flower (waiting for toolhunt-api)
All components deployed

Event Timeline

dcaro renamed this task from [component-api] add one-off, scheduled and continuous jobs support to the yaml + api (unrefined) to [component-api] add one-off, scheduled and continuous jobs support to the yaml + api (to refine). Apr 10 2024, 12:52 PM
dcaro updated the task description.
dcaro renamed this task from [component-api] add one-off, scheduled and continuous jobs support to the yaml + api (to refine) to [component-api] add one-off, scheduled and continuous jobs support to the yaml + api. Apr 10 2024, 12:56 PM
dcaro triaged this task as High priority.
dcaro updated the task description.
dcaro updated the task description.
dcaro renamed this task from [component-api] add one-off, scheduled and continuous jobs support to the yaml + api to [components-api] add one-off, scheduled and continuous jobs support to the yaml + api. Apr 16 2024, 12:28 PM

The "wait-for" idea is really nice, but I don't think it's a requirement for "one-off, scheduled and continuous jobs support".

Maybe we could start implementing everything else, and let users handle the order of deployments? They could even start two deployments in parallel (e.g. toolhunt-api and toolhunt-flower), if they want to do so.

Having a "deployment plan" is only needed to implement an endpoint that deploys all components of a tool (the all endpoint discussed in T362066). But I think that could be implemented in a second step, so that we can have some early adopters of the component API and see what their needs and use cases are.

Just my 2c :)

I would have started with that too, but the DAG resolution is already done. In my opinion, the complicated part of this task is the state extraction from k8s, so the DAG resolution needs no extra work.

You would still need the pipeline-like feature for build+deploy of a single component, so we don't avoid the need for a pipeline either.

So although I agree that it's not a requirement, it's related enough and simple enough to be done at the same time.

I have a thing against reuse-from. It is not immediately clear what it means by just looking at it. depends-on is a more descriptive name, if I understand the supposed meaning of reuse-from correctly. In contrast, reuse-from sounds like we are somehow reusing the configuration of a particular component in another. Obviously this is not yet set in stone, but it's important to point it out.

An earlier version of this yaml used depends-on, but it confusingly meant two distinct things depending on context:

  1. a build-time dependency on another component's image (i.e. use of another component's image as opposed to building from source)
  2. a run-time dependency on another component (e.g. don't run a db-dependent job before having run the db-migration)

To disambiguate this and make the terms more descriptive, depends-on was split into reuse-from to indicate case 1) and wait-for to indicate case 2). reuse-from mirrors the FROM instruction in a Dockerfile. I would not go back to depends-on because I think it's actually less descriptive, but maybe we can consider other alternatives such as use-image, image-from or inherit-image, if you think that would be more descriptive.
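To make the two cases concrete, here is a small sketch that reads them from the example config in the task description. split_dependencies and has_cycle are hypothetical helpers, and treating both kinds of dependency as ordering constraints in one merged graph is an assumption made only to show what could go wrong with a single depends-on key.

```python
from graphlib import CycleError, TopologicalSorter

# Trimmed copy of the example config from the task description.
components = {
    "toolhunt-cleanup": {
        "build": {"repository": "https://github.com/wikimedia/toolhunt-cleanup"},
        "wait-for": ["toolhunt-api"],
    },
    "toolhunt-api": {
        "build": {"repository": "https://github.com/wikimedia/toolhunt"},
        "wait-for": ["toolhunt-migratedb"],
    },
    "toolhunt-migratedb": {"build": {"reuse-from": "toolhunt-api"}},
    "toolhunt-flower": {
        "build": {"reuse-from": "toolhunt-api"},
        "wait-for": ["toolhunt-api"],
    },
}


def split_dependencies(components):
    """Separate build-time (reuse-from) from run-time (wait-for) dependencies."""
    to_build = [n for n, c in components.items() if "repository" in c["build"]]
    reuse_from = {
        n: c["build"]["reuse-from"]
        for n, c in components.items()
        if "reuse-from" in c["build"]
    }
    wait_for = {n: list(c.get("wait-for", [])) for n, c in components.items()}
    return to_build, reuse_from, wait_for


def has_cycle(graph):
    """True if the node -> predecessors mapping contains a cycle."""
    try:
        TopologicalSorter(graph).prepare()
    except CycleError:
        return True
    return False


to_build, reuse_from, wait_for = split_dependencies(components)

# Hypothetical single "depends-on" graph mixing both meanings.
merged = {
    n: sorted(set(wait_for[n]) | ({reuse_from[n]} if n in reuse_from else set()))
    for n in components
}
print(to_build, has_cycle(wait_for), has_cycle(merged))
```

With this config, toolhunt-api waits for toolhunt-migratedb at run time while toolhunt-migratedb reuses the toolhunt-api build, so the merged graph is circular even though each graph on its own is not.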

Maybe just use: <other-component>?

I like use-image, image-from or reuse-image. Basically, anything with the word image in it to clarify that the image is "reused", not the config.

I suggest using build instead of image, as it's not really an image you are configuring, just the build. The fact that it builds a container image depends on the runtime, and that might change eventually (for example to microVMs or whatever 👾).

Hmm, we might not be the best folks to decide which term is more descriptive/unambiguous because we're too deep in the soup. 🙈 Could we somehow run a quick poll?

To me an image is a "thing" while a build is a "process", which feels more fuzzy/unclear in terms of what goes into it.

Having tinkered a bit with microVMs, one of the more convenient ways to create one is by extracting the rootfs from a container image and combining it with a kernel... 😈

I like where this is going :)

Thanks! :) that's very nice to hear!

dcaro renamed this task from [components-api] add one-off, scheduled and continuous jobs support to the yaml + api to [components-api] add order to the components deployment. May 13 2025, 7:00 AM