
Execution of the deployment pipeline should be configurable via .pipeline/config.yaml
Open, Normal, Public

Description

ORES repo has two services, one uwsgi and one celery. Right now, blubber/the pipeline only makes one Dockerfile per repo. After talking to @akosiaris, it seems we need to discuss how we can fix this.

  • Maybe we can just have two production stages (like 'production-celery' and 'production-uwsgi') My knowledge about docker/blubber/helm is pretty basic. Would this work?

Event Timeline

jijiki triaged this task as Normal priority. Dec 3 2018, 1:22 PM
Ladsgroup raised the priority of this task from Normal to Needs Triage.
Ladsgroup triaged this task as Normal priority.

I didn't change the priority.

thcipriani renamed this task from Blubber should be able to make multi docker files per repo to The continuous release pipeline should support more than one service per repo. Jan 7 2019, 6:49 PM
thcipriani updated the task description.

I'll need multiple service deployments for the same repo for T211247, but I don't need different Docker images for them. So I think this won't affect my use case, but I'd like to note that it would make sense to not couple specific Docker images with a repo.

Perhaps the blubber configs should live elsewhere somehow? Could blubber be configured to build docker images with the app repo as a dependency, rather than have the config expect to live in the repo itself?

Q: would blubber's variants be enough to support the wsgi vs celery use case?

Q: would blubber's variants be enough to support the wsgi vs celery use case?

I thought about it too, and it's not a super bad idea, but we would probably need to define all stages at two levels, like uwsgi-build, uwsgi-dev, uwsgi-test, uwsgi-prep, uwsgi-prod, celery-build, celery-dev, celery-test, celery-prep, celery-prod (maybe not test and dev, but everything else).
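As a sketch, that two-level naming could live in a single Blubberfile with one variant per service role. This is hypothetical: the base image, entrypoints, and exact keys depend on the Blubber config version in use, so treat it as an illustration of the idea rather than a working config.

```yaml
# .pipeline/blubber.yaml -- hypothetical sketch of one repo, two services.
version: v3
base: docker-registry.wikimedia.org/wikimedia-buster
variants:
  build:
    python:
      version: python3
      requirements: [requirements.txt]
  production-uwsgi:
    includes: [build]
    entrypoint: [uwsgi, --ini, uwsgi.ini]
  production-celery:
    includes: [build]
    entrypoint: [celery, worker]
```

The two production variants share everything via `includes: [build]` and differ only in their entrypoint, which matches the ORES situation described above.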

I think there are a couple of problems with the current Continuous Delivery pipeline implementation:

  1. Implicit assumption that every repo is one service
  2. Implicit assumption that there is one test entrypoint per repo

There are workarounds for problem 2, but there are no workarounds for problem 1 just yet. Discussion here will probably inform the solution.

Perhaps the blubber configs should live elsewhere somehow? Could blubber be configured to build docker images with the app repo as a dependency, rather than have the config expect to live in the repo itself?

That's possible. There are a couple of ways we could do this. (1) Something akin to the deploy repos for scap3 that we use now. That is, a top repo that contains a .pipeline/blubber.yaml and the code itself as a submodule. Or (2) Something akin to the deployment-charts repo where the top level is a list of repositories that we can map to Blubberfiles or Dockerfiles. I think I'd like to avoid the latter since that would mean either (a) a single team becomes the bottleneck for making changes (as RelEng is now for integration/config) or (b) everyone is able to make modifications to all charts.

Q: would blubber's variants be enough to support the wsgi vs celery use case?

I thought about it too, and it's not a super bad idea, but we would probably need to define all stages at two levels, like uwsgi-build, uwsgi-dev, uwsgi-test, uwsgi-prep, uwsgi-prod, celery-build, celery-dev, celery-test, celery-prep, celery-prod (maybe not test and dev, but everything else).

That'd work for building images manually, but currently the Continuous Delivery pipeline on Jenkins (which builds images automatically post-merge or when you push a tag) only looks for a single test variant and a single production variant. We could do something similar to .gitlab-ci.yml, which is close to this solution: look for variants matching the expressions /.*test/ and /.*production/.
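To make the pattern-matching idea concrete, here is a small hypothetical helper (the function name and exact regexes are illustrative, not pipelinelib's actual implementation) that splits a repo's variant names into test and production sets instead of expecting exactly one of each:

```python
import re

# Match any variant whose name ends in "test" or "production",
# per the /.*test/ and /.*production/ idea above.
TEST_RE = re.compile(r".*test$")
PROD_RE = re.compile(r".*production$")

def classify_variants(variants):
    """Split Blubber variant names into (test, production) lists."""
    tests = [v for v in variants if TEST_RE.match(v)]
    prods = [v for v in variants if PROD_RE.match(v)]
    return tests, prods
```

With the uwsgi/celery naming from earlier in the thread, `classify_variants(["uwsgi-test", "celery-test", "uwsgi-production", "celery-production", "build"])` would yield the two test variants and the two production variants, leaving `build` for neither stage.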


What we've been talking about internally

For whatever reason it's become common to keep top-level dotfiles within a project itself (which I think looks cluttered, but given the practice's proliferation that's likely not a valid concern). We've been talking about how to keep the .pipeline directory and still solve our implicit-assumption problems. Our discussion (among folks in Release-Engineering-Team, so far) has focused on another file (groundbreaking, I know :)).

The file would look something like:

.pipeline/config.yaml
---
# Tests for serviceOne: both run in parallel during the "test" stage of the Pipeline
- name: serviceOne-phpunit
  blubberfile: blubber-serviceOne.yaml
  stage: test
  variant: phpunit
  directory: .
- name: serviceOne-mocha
  blubberfile: blubber-serviceOne.yaml
  stage: test
  variant: mocha
  directory: .

# Tests for serviceTwo: run in parallel with the serviceOne tests, also during the "test" stage of the Pipeline
- name: serviceTwo-junit
  blubberfile: blubber-serviceTwo.yaml
  stage: test
  variant: junit
  directory: src/serviceTwo

# Production service one. Image is built in the "production" stage of the Pipeline
- name: serviceOne
  blubberfile: blubber-serviceOne.yaml
  stage: production
  directory: .

# Production service two. Image is built in the "production" stage of the Pipeline (in parallel with the serviceOne image)
- name: serviceTwo
  blubberfile: blubber-serviceTwo.yaml
  stage: production
  directory: src/serviceTwo
...
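To show how that list-form config drives scheduling, here is a hypothetical sketch (not pipelinelib code) that takes the entries as already parsed from the YAML above and groups them by stage; everything within a stage would then run in parallel:

```python
from collections import defaultdict

def group_by_stage(entries):
    """Group pipeline entries from .pipeline/config.yaml by their stage.

    Each entry is a dict with at least "name" and "stage" keys, as in
    the list-form config proposed above.
    """
    stages = defaultdict(list)
    for entry in entries:
        stages[entry["stage"]].append(entry["name"])
    return dict(stages)

# The example config above, reduced to the fields that matter here.
entries = [
    {"name": "serviceOne-phpunit", "stage": "test"},
    {"name": "serviceOne-mocha", "stage": "test"},
    {"name": "serviceTwo-junit", "stage": "test"},
    {"name": "serviceOne", "stage": "production"},
    {"name": "serviceTwo", "stage": "production"},
]
```

Running `group_by_stage(entries)` collects the three test jobs under "test" and the two image builds under "production", matching the parallelism described in the config comments.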

Would that solve the problems in this task?

What new problems does this create?

  • Harder to link an image to a repo
  • Potential that one merge can use a lot of CI executors (also currently the case, but RelEng has some ability to mitigate currently)
  • Others?

I'd be interested in what folks think about the above.

Seems like it would work, but it doesn't look like this provides much beyond the different variants in the blubber config files. Could the stage and directory keys just be built into the variant config? Or does that couple the blubber format to our CI pipeline in a way we don't want?

What new problems does this create?

  • Potential that one merge can use a lot of CI executors (also currently the case, but RelEng has some ability to mitigate currently)
  • Others?

Thanks for the write-up, @thcipriani! A couple of concerns I had today after thinking more about the problem and reading the proposal:

  • Fragmentation of job scheduling. Introducing this piece into the pipeline (especially if it supports running separate services through the pipeline in parallel) might result in an overall job scheduling system that's difficult to grok and troubleshoot. We'd have Zuul on the one side scheduling jobs for the repo but then we'd have a single job from that scheduler forking off multiple CD pipeline runs. This sort of speaks to shortcomings in Zuul v2—if it supported repo-authoritative pipeline-job mappings we could probably implement this logic there—and maybe we're willing to accept the cost of this complexity for now since plans to replace Zuul v2 are still very much undefined.
  • Lack of visibility/reporting into the multiple CD pipeline invocations. Since it's downstream from Zuul, I don't see a way to report results back to Gerrit for each distinct pipeline run. Upon failure of any of the runs, we'd only see a single failure reported in Gerrit and a single scheduled job in Zuul. We'd also only ever see one job on the CI/Zuul dashboard—a user would have to go digging for the progress and/or results of individual runs.
  • The example config format could be more constrained IMO. The format you have would allow for some really odd configurations such as having different Blubber files for test and production stages of the same service, or defining a production stage for a service without a test stage. What about something simpler like the following?
.pipeline/config.yaml
pipelines:
  serviceOne:
    blubberfile: serviceOne/blubber.yaml # could be the default based on service name for the dir
    helmConfig: serviceOne/helm.yaml # ditto
    directory: src/serviceOne
    variants:
      test: [phpunit, mocha] # defaults to ["test"]
      production: foo # defaults to "production", also supports false for test-only runs
  serviceTwo:
    directory: src/serviceTwo

# room for future parameters, e.g.
# concurrency: 2
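The defaults in that format's comments can be spelled out with a small sketch. This is a hypothetical helper (names are mine, not from any real tool) that fills in the documented fallbacks for a pipeline entry:

```python
def apply_defaults(name, pipeline):
    """Fill in the defaults from the proposed keyed config format:

    - blubberfile defaults to "<name>/blubber.yaml"
    - variants.test defaults to ["test"]
    - variants.production defaults to "production"
    """
    out = dict(pipeline)
    out.setdefault("blubberfile", f"{name}/blubber.yaml")
    variants = dict(out.get("variants", {}))
    variants.setdefault("test", ["test"])
    variants.setdefault("production", "production")
    out["variants"] = variants
    return out
```

Under this scheme the minimal `serviceTwo` entry above (just a `directory`) expands to a fully specified pipeline, which is what makes the more constrained format attractive: odd configurations like a production stage built from a different Blubberfile than its tests simply can't be expressed.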

I think there are a couple of problems with the current Continuous Delivery pipeline implementation:

  1. Implicit assumption that every repo is one service

For the use case of ORES, the only difference between ServiceOne and ServiceTwo is the entrypoint. Everything else is the same.

  2. Implicit assumption that there is one test entrypoint per repo

Regarding tests, the Python libraries run flake8 (linting) and pytest (unit/integration tests). I could wrap them in a tox.ini, which I'd rather not do, but if it's decided that only one entrypoint is allowed, I can fix that.

I went a little crazy with a new config proposal in anticipation of us implementing T216272: The pipeline should provide a way to save artifacts from a stage. It's more loosely coupled, like what @thcipriani proposed earlier, with some extra fields for clearly defining the way in which stages should be executed and different methods for publishing artifacts. We'd likely want some basic policy/validation that enforces sanity (e.g. if it's publishing an image in a stage, it must also specify testDeploy, etc.). Useful defaults would also be important to cut down on configuration duplication.

Overall, something like this would decouple our service-pipeline scripts from project needs, and render the former "just" an implementation that could be swapped out down the road depending on which way we go with CI technologies (Zuul/Jenkins) in coming quarters.

.pipeline/config.yaml
pipelines:
  serviceOne:
    blubberfile: serviceOne/blubber.yaml # could be the default based on service name for the dir
    directory: src/serviceOne
    execution:                           # an "execution plan" (a directional graph of stages to run)
      - [unittests, mocha]               # set of stages to run in parallel
      - production                       # next stage to run if the previous ran successfully
    stages:                              # stage definitions
      - name: unittests
        variant: phpunit                 # defaults to the stage name but can be different
        publish:
          - type: files                  # publish select artifact files from the built/run image
            paths: ["foo/*", "bar"]      # copy files {foo/*,bar} from the image fs to ./artifacts/{foo/*,bar}
      - name: mocha                      # default (build/run "mocha" variant, no artifacts, etc.)
      - name: production
        testDeploy:                      # deploy to the "ci" k8s cluster, run `helm test`, etc.
          - chart: http://helm/chart     # use this chart (don't need the helmConfig field anymore)
        publish:
          - type: image                  # publish built image to our docker registry
            tags: [candidate]            # additional tags
        deploy: true                     # finally, trigger production deployment (however that's done)
  serviceTwo:
    directory: src/serviceTwo
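The "execution plan" above is a directed sequence of steps, where each step is either one stage or a set of stages to run in parallel. A hypothetical sketch of how a runner could walk such a plan (this is illustrative Python, not the Groovy implementation in pipelinelib):

```python
from concurrent.futures import ThreadPoolExecutor

def run_plan(plan, run_stage):
    """Walk an execution plan like [["unittests", "mocha"], "production"].

    Each step is a stage name or a list of stage names to run in
    parallel; run_stage(name) returns True on success. Stop and report
    failure as soon as any step fails, so later stages never run.
    """
    for step in plan:
        stages = step if isinstance(step, list) else [step]
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(run_stage, stages))
        if not all(results):
            return False
    return True
```

With the serviceOne plan above, "unittests" and "mocha" run concurrently, and "production" only starts once both have succeeded.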
awight removed a subscriber: awight. Mar 21 2019, 4:04 PM

Change 502917 had a related patch set uploaded (by Dduvall; owner: Dduvall):
[integration/pipelinelib@master] pipeline: Execution graph and contexts

https://gerrit.wikimedia.org/r/502917

Change 502918 had a related patch set uploaded (by Dduvall; owner: Dduvall):
[integration/pipelinelib@master] pipeline: Builder and stage implementation

https://gerrit.wikimedia.org/r/502918

thcipriani renamed this task from The continuous release pipeline should support more than one service per repo to Execution of the deployment pipeline should be configurable via .pipeline/config.yaml. Apr 30 2019, 4:16 PM

Change 502917 merged by jenkins-bot:
[integration/pipelinelib@master] pipeline: Directed graph execution model

https://gerrit.wikimedia.org/r/502917

Change 502918 merged by jenkins-bot:
[integration/pipelinelib@master] pipeline: Builder and stage implementation

https://gerrit.wikimedia.org/r/502918