Unconference: [Workshop] CI counselling session
Closed, Resolved; Public

Description

Etherpad: https://etherpad.wikimedia.org/p/WMTC19-T238275


Introduction to current CI (new CI will be covered by another session)

Task: https://phabricator.wikimedia.org/T238275

Presentation of Zuul status page and the various pipelines

  • Zuul overview: https://integration.wikimedia.org/zuul/
  • Each pipeline group does different tests (very quick overview of each)
  • The test and gate-and-submit pipelines are each split between a generic one and some MediaWiki-specific ones (one per minor release)

The status page code was written by Timo and upstreamed; it refreshes its content lazily.

A different view is in Jenkins (Zuul controls Jenkins): https://integration.wikimedia.org/ci/

CI is configured via the integration/config repository, which has three parts:

  • Docker containers
  • JJB (Jenkins Job Builder) for the Jenkins job definitions
  • The Zuul config file zuul/layout.yaml, which maps repos to Jenkins jobs and pipelines

Example with operations-dns-lint-docker

We use docker-pkg to convert Dockerfile templates into Dockerfiles.

  • docker-pkg is written and maintained by SRE and is used to build WMF production Docker images
  • https://doc.wikimedia.org/docker-pkg/
  • Templating plus some Debian-packaging-style metadata
  • A run.sh entry point does the git fetch/checkout and then invokes an entry point in the dev repo
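
As a rough illustration of the templating idea above, this Python sketch renders a Dockerfile from a template and derives the image tag from a Debian-style changelog. It is not docker-pkg's actual implementation. Python's string.Template is only a stand-in, and the template contents, image names, and the functions latest_version and render are all hypothetical. Taking the version from a changelog is also an assumption about what the "Debian-packaging-style metadata" covers.

```python
from string import Template

# Stand-in template: docker-pkg's real template syntax and file layout may differ.
DOCKERFILE_TEMPLATE = Template(
    "FROM $base_image\n"
    "COPY run.sh /run.sh\n"
    'ENTRYPOINT ["/run.sh"]\n'
)


def latest_version(changelog: str) -> str:
    """Extract the version from the first line of a Debian-style changelog.

    The first line is expected to look like:
    "name (1.2.3) distribution; urgency=medium"
    """
    first_line = changelog.strip().splitlines()[0]
    return first_line.split("(", 1)[1].split(")", 1)[0]


def render(name: str, changelog: str, base_image: str):
    """Return (image tag, Dockerfile text) for one hypothetical image."""
    dockerfile = DOCKERFILE_TEMPLATE.substitute(base_image=base_image)
    return f"{name}:{latest_version(changelog)}", dockerfile


if __name__ == "__main__":
    # Hypothetical image name, changelog entry, and base image.
    example_changelog = (
        "example-image (0.2.0) wikimedia; urgency=medium\n\n  * Example change.\n"
    )
    tag, text = render("example-image", example_changelog, "example-base:0.1.0")
    print(tag)
    print(text)
```

Deriving the tag from versioned metadata, rather than from the build time, means the same inputs always produce the same image name.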

zuul/layout.yaml is the entry point that maps repositories to pipelines and ultimately to jobs in Jenkins. One can see it as a workflow system.
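
The mapping described above can be pictured with a small Python model. This is only a sketch of the idea, not the real layout.yaml schema. The repository names, the LAYOUT structure, and the helper functions are hypothetical. The only name taken from the session is the job operations-dns-lint-docker, and attaching it to a repository called operations/dns is an assumption.

```python
from __future__ import annotations

# Hypothetical, simplified model: repository -> pipeline -> Jenkins job names.
LAYOUT: dict[str, dict[str, list[str]]] = {
    "operations/dns": {  # assumed repository name
        "test": ["operations-dns-lint-docker"],
        "gate-and-submit": ["operations-dns-lint-docker"],
    },
    "example/project": {  # invented for illustration
        "test": ["example-project-lint"],
    },
}


def jobs_for(repo: str, pipeline: str) -> list[str]:
    """Return the Jenkins jobs configured for a repository in one pipeline."""
    return list(LAYOUT.get(repo, {}).get(pipeline, []))


def repos_running(job: str) -> list[str]:
    """Return the sorted repositories that run a given job in any pipeline."""
    return sorted(
        repo
        for repo, pipelines in LAYOUT.items()
        if any(job in jobs for jobs in pipelines.values())
    )


if __name__ == "__main__":
    print(jobs_for("operations/dns", "test"))
    print(repos_running("operations-dns-lint-docker"))
```

Looking the mapping up in both directions (jobs for a repo, repos for a job) is what makes the "workflow system" view useful: a change to one job definition can be traced to every repository it affects.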

Build history is kept for all jobs, but we clean it up after a while.

What does not work:

  • the experimental puppet compiler job
  • Moritz: if you start a new project what do you recommend? Travis?
    • Would be nice to run wikidata tests daily
      • 1) The new CI would make it self-serve, so there would be no more need to talk to us to set things up or fix them. Similar to Travis. It is going to take time.
      • 2) We don't have jobs that run without being triggered by a change right now, but we could add them. In our experience, they send an email that gets ignored, which just makes noise.
    • Generally, file a task in Phabricator against Continuous-Integration-Config
  • Extension dependencies and testing them together. There might be issues.
  • wmf-config* jobs test ~30 extensions together with MediaWiki, so a breakage there can break CI for many projects, including many critical ones.
  • We are not adding more right now because it is complex and currently slow (we run all the tests of all the participating repositories).
  • Some extensions can't be put in the gate because their "unit" tests break each other (they are really integration tests), and that's bad.
  • Daily test runs could be used to catch failures.
  • Master should always be buildable. There are ~1000 MW extensions (and ~50 skins) configured in CI and ~200 of them are in production.
    • We have a task to run tests for every single MW repo on a weekly basis - but CI would be unusable for most of the weekend.
    • In practice, unbreak-nows get really unbreak-now-y on the weekends
  • Math tests keep breaking
  • KH: "Naive questions":
    • QUESTION: Is it possible to add more hardware?
      • GG: The answer there is unfortunately we kinda messed up the timeline - the plan was to get a few $10k to get us over to Cloud Services - we have a few machines ??? and I felt bad...
      • GG: Hey Brooke - if we ''had'' money, what is the timeline?
      • Brooke: It isn't hard as far as standing them up...
      • ACTION: GG: I can try and get a request in for ~$50k for a couple
      • JF / AM: We probably use more than 10% of Cloud Services; we use a lot for Beta Cluster which maybe should be re-considered.
      • Brooke: We have a need to duplicate our ??? soon
        • Lots of scary hardware, different kinds of hardware, etc.
  • QUESTION: Are we considering running Quibble on master when making changes, maybe an experimental job?
    • There is a similar task about it: https://phabricator.wikimedia.org/T235118
    • AM: I wanted to try to do proper release management for the software: tag a version, make sure that every user has the same version, keep a changelog, etc.
      • Also wanted to make sure that when we build CI containers, we use the same version across all containers
    • AM: When we update the Quibble test runner, we have to rebuild all the CI containers... Don't know how to solve this.
    • We always pin to specific versions of Docker images, not LATEST, to make sure everything works and to avoid random upgrades of the underlying software
    • ACTION: make it easier to upgrade Quibble on our CI infrastructure
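
The practice noted above of pinning Docker images to specific versions rather than LATEST can be checked mechanically. This is a minimal Python sketch, assuming image references follow the usual name:tag or name@digest form. The functions is_pinned and unpinned and the registry and image names are hypothetical, and this is not a check that the CI is known to run.

```python
def is_pinned(image_ref: str) -> bool:
    """Return True if the reference names an explicit tag other than 'latest'."""
    # A digest reference (name@sha256:...) is always an exact pin.
    if "@" in image_ref:
        return True
    # Only look for a tag after the last '/', so that a registry port
    # such as "registry.example:5000/img" is not mistaken for a tag.
    last_component = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return False  # no tag means the implicit "latest"
    tag = last_component.rsplit(":", 1)[1]
    return tag != "" and tag.lower() != "latest"


def unpinned(image_refs):
    """Return the references that would float to whatever is newest."""
    return [ref for ref in image_refs if not is_pinned(ref)]


if __name__ == "__main__":
    # Hypothetical image references.
    refs = [
        "registry.example/ci/quibble:1.2.3",
        "registry.example/ci/quibble:latest",
        "registry.example/ci/tox",
    ]
    for ref in unpinned(refs):
        print("not pinned:", ref)
```

Pinning trades convenience for reproducibility, which is also why a Quibble upgrade means rebuilding and re-pinning every container that embeds it.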

Lars:

  • With new CI this big file will go away, yes?
    • Yes, the big layout.yaml file will go away when we move to self-service CI, but the same complexity for MediaWiki will remain, just moved elsewhere, probably directly inside the MediaWiki repos/branches.
    • However, it'll be simpler because the config file will exist per branch, rather than having different pieces for each of MediaWiki's supported branches.
    • (This is where you can see that we have dropped PHP 7.0, 7.1, and HHVM!)

Brooke: I have a task about Go testing. How do we take it forward?

  • ACTION: Brooke to ping James on the Phabricator task.