
[Hypothesis] WE6.3.10 start a beta for the push-to-deploy features
Closed, Resolved (Public)

Description

Original document (Comment-only): https://docs.google.com/document/d/1uYjqGpfvb8Q27Nc0-xRZ-WrHNCnBGeA-c8_CKKpd5zU/edit?tab=t.0#heading=h.uhbe07clzg9n

DONE: Beta start email: https://lists.wikimedia.org/hyperkitty/list/cloud-announce@lists.wikimedia.org/thread/5D7NK7Z7KMWQPWQC23453YB7FV555Q5R/

Toolforge push-to-deploy beta plan

Other docs:

We want to gather feedback from users on the current direction and implementation for the push-to-deploy feature for toolforge.

Timeline (1 FTE; preferably 2 FTEs, part-time, for reviews/pairing) - ~10-month plan

[now - 27th June 2025] Announce the upcoming beta
  • Prepare/define the feedback flows
[30th June - 31st October 2025] Start the beta
  • Announce to the community including
    • Documentation
      • How to use the new features
      • How to give feedback and how we are going to handle it
  • Gather feedback from users
  • Fix bugs that might arise
  • [every 2 weeks] Check-point to review direction and summarize feedback to date (first one 2 weeks after the start, then every 2 weeks; at least 7 cycles - 3.75 months)
    • Decide and implement a round of new features
    • Iterate if needed
[1st November - 31st December 2025] Start stabilization phase (duration 4 months)
  • Fix bugs + stabilization features
[5th Jan 2026] Release as stable
  • Regular development cycle
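
As a quick sanity check of the check-point cadence, the following minimal sketch lists the check-point dates for the beta window, assuming a strict 14-day cadence starting two weeks after 30th June 2025 and ending by 31st October 2025. The function name and the strict cadence are assumptions made for illustration; actual check-points may shift around holidays or team availability.

```python
from datetime import date, timedelta

# Beta window taken from the timeline above.
BETA_START = date(2025, 6, 30)
BETA_END = date(2025, 10, 31)
CADENCE = timedelta(weeks=2)


def checkpoints(start: date, end: date, cadence: timedelta) -> list[date]:
    """Return check-point dates: the first one `cadence` after `start`,
    then one every `cadence`, as long as they fall on or before `end`."""
    result = []
    current = start + cadence
    while current <= end:
        result.append(current)
        current += cadence
    return result


dates = checkpoints(BETA_START, BETA_END, CADENCE)
for checkpoint in dates:
    print(checkpoint.isoformat())
print(f"{len(dates)} check-points")
```

Under these assumptions the window holds eight check-points, from 2025-07-14 to 2025-10-20, which is consistent with the "at least 7 cycles" estimate.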

Beta Scope and Limitations

For a detailed list of features, see Components API MVP; the following is a summary of the main features.

The minimal scope will include all of:

  • Support for continuous components only
  • Support for buildservice-based components only (built from source code)
  • Triggering deployments with a deploy token and with the CLI
  • Single deployment at a time (no queues)

An extended scope: some of these features might be included if there is time:

Out of scope:

  • Rollback support
  • Use an external URL for the tool config yaml (instead of having to manually configure the first time)
  • Component deployment ordering/dependency definition

Features notes

Minimal example of the supported tool config:

config-version: v0.1
components:
  api:
    description: A python-flask api
    type: continuous
    build:
      repository: https://gitlab.wikimedia.org/toolforge-repos/mytool
      ref: main
    run:
      port: 5000
      command: api

  celery-worker:
    description: Celery worker for long-running tasks
    type: continuous
    build:
      repository: https://gitlab.wikimedia.org/toolforge-repos/mytool
      ref: main
    run:
      command: celery-worker
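
To illustrate how the beta scope constrains a config like the one above, here is a minimal sketch that checks a config against two of the scope rules (continuous components only, build-from-source only). The config is written as a plain Python dict so the sketch needs no YAML parser. The function name and the exact rules are assumptions for illustration, not the validation the components API actually performs.

```python
# The minimal example config, expressed as a dict.
TOOL_CONFIG = {
    "config-version": "v0.1",
    "components": {
        "api": {
            "description": "A python-flask api",
            "type": "continuous",
            "build": {
                "repository": "https://gitlab.wikimedia.org/toolforge-repos/mytool",
                "ref": "main",
            },
            "run": {"port": 5000, "command": "api"},
        },
        "celery-worker": {
            "description": "Celery worker for long-running tasks",
            "type": "continuous",
            "build": {
                "repository": "https://gitlab.wikimedia.org/toolforge-repos/mytool",
                "ref": "main",
            },
            "run": {"command": "celery-worker"},
        },
    },
}


def check_beta_scope(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the
    config fits the (assumed) beta scope rules checked here."""
    problems = []
    components = config.get("components") or {}
    if not components:
        problems.append("config defines no components")
    for name, component in components.items():
        # Beta scope: only continuous components are supported.
        if component.get("type") != "continuous":
            problems.append(f"{name}: only 'continuous' components are supported")
        # Beta scope: only components built from source are supported,
        # so a source repository is expected (assumed rule).
        if not (component.get("build") or {}).get("repository"):
            problems.append(f"{name}: a build.repository is expected")
    return problems


print(check_beta_scope(TOOL_CONFIG))
```

A config with a component of any other type, or without a source repository, would come back with one problem string per violation.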

Deployment Workflow

For the beta, the deployment process will look something like this:

Basic CLI flow:

  1. User uploads their tool configuration (toolforge components config create)
    1. We should have clear documentation on the structure of a config object
    2. Document how to import it from a file or a similar external source (git URL, etc.)
  2. User triggers a deployment for their tool (toolforge components deployment create)
    1. System generates a unique deploy_id (e.g., datestamp-randint)
    2. System creates a ToolDeployment CRD instance with:
      1. The generated deploy_id
      2. The current timestamp as creation_time
      3. Initial status set to "PENDING"
      4. Empty builds object
      5. Empty runs object
  3. For each component in the ToolConfig:
    1. The system initiates a build process
    2. Adds an entry to the builds object in the ToolDeployment CRD with:
      1. A generated build_id
      2. Initial status set to "PENDING"
    3. System updates the ToolDeployment CRD status to "IN_PROGRESS"
  4. As each component build progresses:
    1. The system updates the corresponding build status in the ToolDeployment CRD
    2. When all the builds are complete, the system runs each component (e.g., creates a continuous job)
      1. The system updates the corresponding run status in the ToolDeployment CRD
    3. When all runs are complete:
      1. The system sets the overall ToolDeployment status to "FINISHED" if all builds and runs succeeded, or "FAILED" if any build or run failed.
  5. User can check the deployment status (per-build, per-run, overall) and details with toolforge components deployment show <deploy_id>
  6. User can see all the latest deployments, and whether they failed, with toolforge components deployment list
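
The status transitions in the steps above can be sketched in memory as follows. This is only an illustration under stated assumptions: the real system stores this state in a ToolDeployment CRD, and the helper names, the exact deploy_id and build_id formats, and the per-build/per-run "SUCCESSFUL" state are hypothetical (the plan only names the overall PENDING, IN_PROGRESS, FINISHED and FAILED states).

```python
import random
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Overall deployment states named in the plan.
PENDING, IN_PROGRESS, FINISHED, FAILED = "PENDING", "IN_PROGRESS", "FINISHED", "FAILED"
# Hypothetical terminal state for a single build or run that succeeded.
SUCCESSFUL = "SUCCESSFUL"


@dataclass
class ToolDeployment:
    deploy_id: str
    creation_time: datetime
    status: str = PENDING
    builds: dict = field(default_factory=dict)  # component -> {"build_id", "status"}
    runs: dict = field(default_factory=dict)    # component -> {"status"}


def create_deployment() -> ToolDeployment:
    """Step 2: new deployment with a datestamp-randint id, PENDING, empty builds/runs."""
    now = datetime.now(timezone.utc)
    deploy_id = f"{now:%Y%m%d-%H%M%S}-{random.randint(0, 99999):05d}"
    return ToolDeployment(deploy_id=deploy_id, creation_time=now)


def start_builds(deployment: ToolDeployment, component_names: list[str]) -> None:
    """Step 3: register one PENDING build per component, then mark IN_PROGRESS."""
    for name in component_names:
        build_id = f"{name}-{random.randint(0, 99999):05d}"  # hypothetical format
        deployment.builds[name] = {"build_id": build_id, "status": PENDING}
    deployment.status = IN_PROGRESS


def finalize(deployment: ToolDeployment) -> None:
    """Step 4.3: call once every build and run is done; FAILED if any failed."""
    statuses = [build["status"] for build in deployment.builds.values()]
    statuses += [run["status"] for run in deployment.runs.values()]
    deployment.status = FAILED if FAILED in statuses else FINISHED


# Walk through a deployment where every build and run succeeds.
deployment = create_deployment()
start_builds(deployment, ["api", "celery-worker"])
for name in deployment.builds:
    deployment.builds[name]["status"] = SUCCESSFUL
    deployment.runs[name] = {"status": SUCCESSFUL}
finalize(deployment)
print(deployment.deploy_id, deployment.status)
```

Keeping the per-build and per-run entries next to the overall status is what lets a "show" command report each level separately.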

Basic Automated/webhook flow:

  1. User uploads their tool configuration (toolforge components config create)
  2. User creates a deploy token (toolforge components deploy-token create)
  3. User configures their CI scripts/system to call https://api.svc.beta.toolforge.org (see this for a full working example)
  4. When the user triggers the CI action (e.g., on push to the repository's main branch), a deployment is created that follows the same process as the basic CLI flow.
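
As a rough idea of what the CI step could do, the sketch below builds (but does not send) an HTTP request to the API host mentioned above. Only the base URL comes from this plan; the endpoint path, the use of POST, and passing the deploy token as a bearer header are all assumptions made for illustration, so check the official documentation for the real call.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.svc.beta.toolforge.org"  # host taken from the plan


def build_deploy_request(tool_name: str, deploy_token: str) -> urllib.request.Request:
    """Build a request that would trigger a deployment for `tool_name`.

    HYPOTHETICAL: the path below and the Authorization header are
    placeholders, not the documented components API.
    """
    path = f"/hypothetical/components/tool/{urllib.parse.quote(tool_name)}/deployment"
    return urllib.request.Request(
        API_BASE + path,
        data=b"",  # empty body; the config was uploaded beforehand
        method="POST",
        headers={"Authorization": f"Bearer {deploy_token}"},
    )


request = build_deploy_request("mytool", "example-token")
print(request.get_method(), request.full_url)
# In a CI job the request would then be sent with:
#     urllib.request.urlopen(request)
```

In a real CI setup the token would come from a protected CI variable rather than being written into the script.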

Related Objects

Status | Subtype | Assigned | Task
In Progress | | komla |
Resolved | | dcaro |
Open | | None |
Resolved | | LucasWerkmeister |
Resolved | | matmarex |
Resolved | | Legoktm |
Resolved | | Legoktm |
In Progress | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Invalid | | dcaro |
Resolved | | dcaro |
Resolved | | Slst2020 |
Resolved | | dcaro |
Resolved | | Slst2020 |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Open | | dcaro |
Resolved | | dcaro |
Resolved | | None |
Open | | dcaro |
In Progress | Feature | Raymond_Ndibe |
Resolved | Feature | Raymond_Ndibe |
In Progress | | Raymond_Ndibe |
Open | | None |
Resolved | Feature | Raymond_Ndibe |
Resolved | Feature | Raymond_Ndibe |
Invalid | | Raymond_Ndibe |
Invalid | | Raymond_Ndibe |
Duplicate | | Raymond_Ndibe |
In Progress | | Raymond_Ndibe |
Invalid | | Raymond_Ndibe |
Open | | Raymond_Ndibe |
Open | | Raymond_Ndibe |
Open | | Raymond_Ndibe |
In Progress | | Raymond_Ndibe |
Duplicate | | Raymond_Ndibe |
In Progress | | Raymond_Ndibe |
Resolved | | Raymond_Ndibe |
Resolved | | Raymond_Ndibe |
Resolved | | dcaro |
Open | | dcaro |
Resolved | | Slst2020 |
Resolved | | dcaro |
Resolved | | Slst2020 |
Resolved | | Slst2020 |
Resolved | | dcaro |
Resolved | | Slst2020 |
Open | | None |
Resolved | | aborrero |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | dcaro |
Open | | None |
Resolved | | Raymond_Ndibe |
Resolved | | Raymond_Ndibe |
Resolved | Feature | Raymond_Ndibe |
Open | | Raymond_Ndibe |
Resolved | | dcaro |
Resolved | | Raymond_Ndibe |
In Progress | | Raymond_Ndibe |
Open | | Raymond_Ndibe |
Resolved | | Raymond_Ndibe |
Declined | | None |
Resolved | | Raymond_Ndibe |
Resolved | | Raymond_Ndibe |
Resolved | | Raymond_Ndibe |
Open | | Raymond_Ndibe |
Resolved | | taavi |
Resolved | | Raymond_Ndibe |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | | Raymond_Ndibe |
Resolved | | Raymond_Ndibe |
Resolved | | Chuckonwumelu |
Duplicate | | Chuckonwumelu |
Resolved | | taavi |
Resolved | | dcaro |
Resolved | | dcaro |
Resolved | Feature | dcaro |
Resolved | | dcaro |
Open | | None |
Open | | None |
Resolved | | DamianZaremba |
Open | | None |
Stalled | | Raymond_Ndibe |
Open | BUG REPORT | None |
Open | | dcaro |

Event Timeline

dcaro changed the task status from Open to In Progress.
dcaro triaged this task as High priority.
dcaro updated the task description.
dcaro updated the task description.
dcaro removed subscribers: Harej, zhuyifei1999, Bstorm, nskaggs.
dcaro updated the task description.