Migrate Proton to k8s
Closed, Resolved · Public

Description

  • Chart created, reviewed and merged
  • proton namespaces on k8s created
  • proton tokens for k8s created
  • calico rules created, if applicable
  • helmfile.d stanzas created
  • deploy
  • Switchover traffic from proton hosts to kubernetes

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change 516710 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[mediawiki/services/chromium-render@master] build: Create initial pipeline configuration

https://gerrit.wikimedia.org/r/516710

mobrovac renamed this task from "Migrate chromium-render to k8s" to "Migrate Proton to k8s". Jun 13 2019, 7:23 AM
LGoto triaged this task as Medium priority. Jun 20 2019, 3:47 PM
LGoto moved this task from Needs triage to Upcoming on the Product-Infrastructure-Team-Backlog board.
LGoto added subscribers: Mholloway, LGoto.

@Mholloway Can you update this ticket with a description?

Change 516710 merged by jenkins-bot:
[mediawiki/services/chromium-render@master] build: Create initial pipeline configuration

https://gerrit.wikimedia.org/r/516710

Change 520317 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/config@master] layout: [chromium-render] Enable pipeline tests

https://gerrit.wikimedia.org/r/520317

Change 520317 abandoned by Jforrester:
layout: [chromium-render] Enable pipeline tests

Reason:
Done elsewise.

https://gerrit.wikimedia.org/r/520317

MSantos raised the priority of this task from Medium to High. Nov 21 2019, 2:17 PM
MSantos added a parent task: Restricted Task. Dec 12 2019, 5:39 PM
Mholloway assigned this task to MSantos. Feb 5 2020, 3:52 PM
This comment was removed by akosiaris.
akosiaris updated the task description.
akosiaris added a subscriber: MSantos.

Change 580294 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[labs/private@master] Add k8s dummy tokens for 3 new services.

https://gerrit.wikimedia.org/r/580294

Change 580295 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Kubernetes: Create token stanzas for some new services

https://gerrit.wikimedia.org/r/580295

Change 599812 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] Create namespaces/calico rules for new services

https://gerrit.wikimedia.org/r/599812

Change 580294 merged by Alexandros Kosiaris:
[labs/private@master] Add k8s dummy tokens for 3 new services.

https://gerrit.wikimedia.org/r/580294

Change 580295 merged by Alexandros Kosiaris:
[operations/puppet@production] Kubernetes: Create token stanzas for some new services

https://gerrit.wikimedia.org/r/580295

Change 599812 merged by jenkins-bot:
[operations/deployment-charts@master] Create namespaces/calico rules for new services

https://gerrit.wikimedia.org/r/599812

akosiaris updated the task description. Jun 2 2020, 12:06 PM

@Mholloway, namespaces, rules, and tokens have been created. The chart has been merged and published. You are free to deploy. You will need a change like 968132909b4d24192b2f69a657c14bb30acd7a42 in order to instantiate the first deploy; feel free to add me as a reviewer.

Change 602164 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[operations/deployment-charts@master] Chromium-render: Add initial helmfile stanzas

https://gerrit.wikimedia.org/r/602164

Change 602164 merged by jenkins-bot:
[operations/deployment-charts@master] Chromium-render: Add initial helmfile stanzas

https://gerrit.wikimedia.org/r/602164

Mholloway added a comment. Edited Jun 8 2020, 9:08 PM

Hmm, I've merged https://gerrit.wikimedia.org/r/602164 in preparation to deploy the service, but it doesn't look like Puppet is going to create the .hfenv files and private/ subdirectories in the per-cluster service directories as I would expect. Something must be wrong.
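
The check described in this comment can be sketched in Python. This is a minimal illustration, not the Puppet logic itself: the function name `missing_deploy_files` is hypothetical, and the `<services root>/<cluster>/<service>` layout is an assumption inferred from the shell prompt that appears later in this ticket.

```python
from pathlib import Path


def missing_deploy_files(services_root, clusters, service):
    """Return the expected-but-absent paths for each per-cluster service directory.

    Hypothetical helper: it only looks for the two items named in the comment,
    a `.hfenv` file and a `private/` subdirectory.
    """
    missing = []
    for cluster in clusters:
        service_dir = Path(services_root) / cluster / service
        # .hfenv is expected to be a regular file.
        if not (service_dir / ".hfenv").is_file():
            missing.append(service_dir / ".hfenv")
        # private/ is expected to be a directory.
        if not (service_dir / "private").is_dir():
            missing.append(service_dir / "private")
    return missing
```

An empty list means every listed cluster directory has both items; anything else names exactly what is still absent.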

Mholloway updated the task description. Jun 9 2020, 2:02 PM

Change 604046 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[operations/deployment-charts@master] Proton: Fix helmfile paths and namespace refs

https://gerrit.wikimedia.org/r/604046

Change 604046 merged by jenkins-bot:
[operations/deployment-charts@master] Proton: Fix helmfile paths and namespace refs

https://gerrit.wikimedia.org/r/604046

@akosiaris I'm still running into a problem deploying this. It seems that private/secrets.yaml is still missing:

mholloway-shell@deploy1001:/srv/deployment-charts/helmfile.d/services/staging/proton$ source .hfenv; helmfile diff
Adding repo stable https://releases.wikimedia.org/charts/
"stable" has been added to your repositories

Updating repo
Hang tight while we grab the latest from your chart repositories...
...Skip local chart repository
...Successfully got an update from the "stable" chart repository
Update Complete.

skipping missing values file matching "private/secrets.yaml"
Comparing release=production, chart=stable/chromium-render
in ./helmfile.yaml: helm exited with status 1:
  Error: Get http://localhost:8080/api/v1/namespaces/proton/pods?labelSelector=app%3Dhelm%2Cname%3Dtiller: dial tcp [::1]:8080: connect: connection refused

The same thing happens for mobileapps.

> @akosiaris I'm still running into a problem deploying this. It seems that private/secrets.yaml is still missing:

That's actually a warning and not a failure. We can silence it of course, but it's not the reason for the issue below.

> mholloway-shell@deploy1001:/srv/deployment-charts/helmfile.d/services/staging/proton$ source .hfenv; helmfile diff
> Adding repo stable https://releases.wikimedia.org/charts/
> "stable" has been added to your repositories
>
> Updating repo
> Hang tight while we grab the latest from your chart repositories...
> ...Skip local chart repository
> ...Successfully got an update from the "stable" chart repository
> Update Complete.
>
> skipping missing values file matching "private/secrets.yaml"
> Comparing release=production, chart=stable/chromium-render
> in ./helmfile.yaml: helm exited with status 1:
>   Error: Get http://localhost:8080/api/v1/namespaces/proton/pods?labelSelector=app%3Dhelm%2Cname%3Dtiller: dial tcp [::1]:8080: connect: connection refused

Now that's my mistake. Fixed in the puppet/private repo.

> The same thing happens for mobileapps.

Same puppet/private repo fixed that as well.
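
The exchange above separates a harmless warning from the real failure. Below is a minimal Python sketch of that triage, with match strings taken from the transcript in this ticket. The function names are hypothetical and not part of helmfile or helm. A refused connection to localhost:8080 usually suggests the client had no cluster configuration and fell back to a default address; that reading is an assumption here, since the ticket only states that the fix landed in the puppet/private repo.

```python
def classify_helmfile_output(output):
    """Split helmfile output into (warnings, errors) lists of stripped lines."""
    warnings, errors = [], []
    for line in output.splitlines():
        stripped = line.strip()
        # Non-fatal: an optional values file is absent and is skipped.
        if stripped.startswith("skipping missing values file"):
            warnings.append(stripped)
        # Fatal: helm exited non-zero, or printed an "Error:" line.
        elif "helm exited with status" in stripped or stripped.startswith("Error:"):
            errors.append(stripped)
    return warnings, errors


def points_at_default_api_server(error_line):
    """True if the line reports a refused connection to localhost:8080."""
    return "localhost:8080" in error_line and "connection refused" in error_line
```

Run against the transcript above, this would put the `private/secrets.yaml` line in the warnings list and the two helm failure lines in the errors list.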

Mholloway updated the task description. Jun 11 2020, 6:44 PM

Change 607531 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] lvs: Add new proton TLS service

https://gerrit.wikimedia.org/r/607531

Change 607532 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] lvs: Switch proton to lvs_setup

https://gerrit.wikimedia.org/r/607532

Change 607533 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] lvs: Switch proton to monitoring_setup

https://gerrit.wikimedia.org/r/607533

Change 607534 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] lvs: Switch proton to production

https://gerrit.wikimedia.org/r/607534

Change 607535 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] proton: Switch dev restbase to talk to TLS proton

https://gerrit.wikimedia.org/r/607535

Change 607536 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] proton: Switch restbase production to TLS

https://gerrit.wikimedia.org/r/607536

Change 607531 merged by Alexandros Kosiaris:
[operations/puppet@production] lvs: Add new proton TLS service

https://gerrit.wikimedia.org/r/c/operations/puppet/+/607531

Change 607532 merged by Alexandros Kosiaris:
[operations/puppet@production] lvs: Switch proton to lvs_setup

https://gerrit.wikimedia.org/r/c/operations/puppet/+/607532

Change 607533 merged by Alexandros Kosiaris:
[operations/puppet@production] lvs: Switch proton to monitoring_setup

https://gerrit.wikimedia.org/r/c/operations/puppet/+/607533

Change 607534 merged by Alexandros Kosiaris:
[operations/puppet@production] lvs: Switch proton to production

https://gerrit.wikimedia.org/r/c/operations/puppet/+/607534
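
The LVS patches above move proton through three named states, one patch per step. Below is a minimal Python sketch of that ordering, assuming the states are strictly sequential as the patch titles suggest. `LVS_STATES` and `next_state` are hypothetical names for illustration, not Puppet identifiers.

```python
# State names as they appear in the patch titles in this ticket, in merge order.
LVS_STATES = ["lvs_setup", "monitoring_setup", "production"]


def next_state(current):
    """Return the state that follows `current`, or None if it is the last one.

    Raises ValueError if `current` is not one of the known states.
    """
    index = LVS_STATES.index(current)
    if index + 1 < len(LVS_STATES):
        return LVS_STATES[index + 1]
    return None
```

Advancing one state per patch keeps each change small enough to review and revert on its own.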

Change 607535 merged by Alexandros Kosiaris:
[operations/puppet@production] proton: Switch dev restbase to talk to TLS proton

https://gerrit.wikimedia.org/r/c/operations/puppet/+/607535

Change 607536 abandoned by Alexandros Kosiaris:
[operations/puppet@production] proton: Switch restbase production to TLS

Reason:
Done differently in https://gerrit.wikimedia.org/r/c/operations/puppet/+/610720

https://gerrit.wikimedia.org/r/607536

Change 610789 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] proton: Set LVS level OpenAPI checks on TLS

https://gerrit.wikimedia.org/r/610789

Change 610789 merged by Alexandros Kosiaris:
[operations/puppet@production] proton: Set LVS level OpenAPI checks on TLS

https://gerrit.wikimedia.org/r/610789

Change 610855 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] proton: Amend prometheus-statsd config

https://gerrit.wikimedia.org/r/610855

Change 610855 merged by jenkins-bot:
[operations/deployment-charts@master] proton: Amend prometheus-statsd config

https://gerrit.wikimedia.org/r/610855

akosiaris closed this task as Resolved. Jul 10 2020, 7:25 AM
akosiaris updated the task description.

Resolving. This has been completed quite successfully. The new dashboard is at https://grafana.wikimedia.org/d/llIEd7MMz/proton. I'll archive the old one. \o/