Add TLS termination to services running on kubernetes
Open, Medium · Public

Description

In order to add TLS termination to services running on k8s we need the following:

  • Finish adding the envoy TLS sidecar everywhere in k8s
  • Add Envoy service-proxy capabilities to services outside of k8s (@Joe)
    • Add profile::services_proxy::envoy to all roles for applications
    • Modify service::configuration and whatever else is needed to make services only go via envoy
  • Add Envoy service-proxy capabilities to services on k8s
  • For each service (see subtasks):
    • Add TLS LVS pool
    • Switch the services proxies to use it
    • Remove the non-TLS pool from pybal

We do things in this order, and specifically add the LVS pools last, to avoid putting too much pressure on pybal by making it check too many things at once. We could nonetheless get away with just switching everything to use TLS for MediaWiki, and that should be enough.
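
The ordering above can be sketched as a small Python helper. This is only an illustration of the plan, not tooling that exists anywhere: the function names (rollout_plan, max_extra_pools) are hypothetical, and it assumes services are migrated one at a time, which the description does not state explicitly.

```python
# Hypothetical sketch of the rollout order described in the task.
# Step texts paraphrase the task description; nothing here talks to pybal.

# Prerequisites that apply to the whole infrastructure, done first.
GLOBAL_STEPS = [
    "finish adding the envoy TLS sidecar everywhere in k8s",
    "add envoy service-proxy capabilities to services outside of k8s",
    "add envoy service-proxy capabilities to services on k8s",
]

# Per-service steps; the non-TLS pool is removed only after the switch.
PER_SERVICE_STEPS = [
    "add TLS LVS pool",
    "switch the services proxies to use it",
    "remove the non-TLS pool from pybal",
]


def rollout_plan(services):
    """Return the ordered steps: global work first, LVS changes last."""
    plan = list(GLOBAL_STEPS)
    # Assumption: one service is fully migrated before the next one starts.
    for service in services:
        for step in PER_SERVICE_STEPS:
            plan.append(f"{service}: {step}")
    return plan


def max_extra_pools(services):
    """Peak number of additional pools pybal has to check during the plan."""
    extra = 0
    peak = 0
    for step in rollout_plan(services):
        if step.endswith("add TLS LVS pool"):
            extra += 1
        elif step.endswith("remove the non-TLS pool from pybal"):
            extra -= 1
        peak = max(peak, extra)
    return peak


if __name__ == "__main__":
    for line in rollout_plan(["cxserver", "citoid", "mathoid"]):
        print(line)
```

Under the one-service-at-a-time assumption, pybal never checks more than one extra pool at once, which is the kind of load limit the ordering is meant to achieve.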

Details

Project                       Branch      Lines +/-
operations/deployment-charts  master      +1 -1
operations/deployment-charts  master      +6 -0
operations/deployment-charts  master      +6 -0
operations/deployment-charts  master      +4 -4
operations/deployment-charts  master      +269 -250
operations/deployment-charts  master      +263 -244
operations/deployment-charts  master      +6 -0
operations/deployment-charts  master      +264 -245
operations/deployment-charts  master      +450 -646
operations/deployment-charts  master      +4 -4
operations/deployment-charts  master      +301 -240
operations/deployment-charts  master      +296 -251
operations/deployment-charts  master      +305 -235
operations/deployment-charts  master      +287 -233
operations/deployment-charts  master      +6 -0
operations/deployment-charts  master      +12 -47
operations/deployment-charts  master      +7 -1
operations/deployment-charts  master      +288 -229
operations/deployment-charts  master      +175 -119
operations/puppet             production  +1 -1
operations/puppet             production  +21 -0
operations/deployment-charts  master      +18 -0
operations/deployment-charts  master      +169 -112
operations/deployment-charts  master      +130 -111
operations/deployment-charts  master      +45 -74
operations/deployment-charts  master      +133 -88

Event Timeline

There are a very large number of changes, so older changes are hidden.
Restricted Application added a subscriber: Aklapper. · Oct 14 2019, 8:08 AM
jijiki triaged this task as Medium priority. · Oct 14 2019, 3:49 PM
Joe claimed this task. · Oct 21 2019, 5:50 AM
Joe updated the task description. · Nov 4 2019, 10:13 AM
Dzahn added a subscriber: Dzahn. · Nov 7 2019, 3:47 PM
Joe updated the task description. · Dec 6 2019, 9:21 AM

Change 554832 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] blubberoid: break TLS functionality into a helper

https://gerrit.wikimedia.org/r/554832

Change 554833 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] scaffold: import the blubberoid tls helpers in scaffold

https://gerrit.wikimedia.org/r/554833

Change 554834 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] eventgate: convert to use the common tls templates

https://gerrit.wikimedia.org/r/554834

Change 554832 merged by jenkins-bot:
[operations/deployment-charts@master] blubberoid: break TLS functionality into a helper

https://gerrit.wikimedia.org/r/554832

Change 556137 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] blubberoid: release new chart version using the common templates directory

https://gerrit.wikimedia.org/r/556137

Change 557846 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] cxserver: add TLS termination

https://gerrit.wikimedia.org/r/557846

Change 557847 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] cxserver: enable TLS in production

https://gerrit.wikimedia.org/r/557847

Change 554833 merged by Giuseppe Lavagetto:
[operations/deployment-charts@master] Create common template helpers directory

https://gerrit.wikimedia.org/r/554833

Change 556137 merged by Giuseppe Lavagetto:
[operations/deployment-charts@master] blubberoid: release new chart version using the common templates directory

https://gerrit.wikimedia.org/r/556137

Change 558091 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] citoid: add TLS termination

https://gerrit.wikimedia.org/r/558091

Change 558092 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] mathoid: add TLS termination

https://gerrit.wikimedia.org/r/558092

Change 558093 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] termbox: add TLS termination

https://gerrit.wikimedia.org/r/558093

Change 557846 merged by jenkins-bot:
[operations/deployment-charts@master] cxserver: add TLS termination

https://gerrit.wikimedia.org/r/557846

Change 557847 merged by jenkins-bot:
[operations/deployment-charts@master] cxserver: enable TLS in production

https://gerrit.wikimedia.org/r/557847

Joe updated the task description. · Dec 19 2019, 2:12 PM
Joe added a comment. · Dec 19 2019, 2:15 PM

Port reservations are for now indicated here: https://wikitech.wikimedia.org/wiki/Service_ports

Change 559489 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] lvs::configuration: add cxserver-https

https://gerrit.wikimedia.org/r/559489

Change 559495 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] trafficserver::backend: switch to https for cxserver

https://gerrit.wikimedia.org/r/559495

Change 559489 merged by Giuseppe Lavagetto:
[operations/puppet@production] lvs::configuration: add cxserver-https

https://gerrit.wikimedia.org/r/559489

Change 559495 merged by Giuseppe Lavagetto:
[operations/puppet@production] trafficserver::backend: switch to https for cxserver

https://gerrit.wikimedia.org/r/559495

Change 558091 merged by jenkins-bot:
[operations/deployment-charts@master] citoid: add TLS termination

https://gerrit.wikimedia.org/r/558091

Change 594924 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] wikifeeds: Add TLS termination support

https://gerrit.wikimedia.org/r/594924

JMeybohm claimed this task. · May 7 2020, 10:38 AM

Change 594924 merged by JMeybohm:
[operations/deployment-charts@master] wikifeeds: Add TLS termination support

https://gerrit.wikimedia.org/r/594924

Change 595144 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] wikifeeds: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/595144

Change 594922 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] _tls_helpers: Use defaults provided in docker image

https://gerrit.wikimedia.org/r/594922

Change 594922 merged by jenkins-bot:
[operations/deployment-charts@master] _tls_helpers: Use defaults provided in docker image

https://gerrit.wikimedia.org/r/594922

Change 595505 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] parsoid: Add TLS termination support

https://gerrit.wikimedia.org/r/595505

Change 554834 abandoned by JMeybohm:
eventgate: convert to use the common tls templates

Reason:
superseded by I5d9ab4069f1087fa41e0d8a5290789cee1d434d8

https://gerrit.wikimedia.org/r/554834

Change 595144 merged by jenkins-bot:
[operations/deployment-charts@master] wikifeeds: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/595144

Change 595505 abandoned by JMeybohm:
parsoid: Add TLS termination support

Reason:
Chart is not deployed but only used in dev environments

https://gerrit.wikimedia.org/r/595505

Change 595900 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] zotero: Add TLS termination

https://gerrit.wikimedia.org/r/595900

Change 558092 merged by jenkins-bot:
[operations/deployment-charts@master] mathoid: add TLS termination

https://gerrit.wikimedia.org/r/558092

Change 558093 merged by JMeybohm:
[operations/deployment-charts@master] termbox: add TLS termination

https://gerrit.wikimedia.org/r/558093

Change 596227 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] termbox: deploy up to date chart

https://gerrit.wikimedia.org/r/596227

Change 595900 merged by jenkins-bot:
[operations/deployment-charts@master] zotero: Add TLS termination

https://gerrit.wikimedia.org/r/595900

Change 596227 merged by JMeybohm:
[operations/deployment-charts@master] termbox: deploy up to date chart

https://gerrit.wikimedia.org/r/596227

Change 597032 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] mathoid: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/597032

Change 597034 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] termbox: fix wrong TLS port

https://gerrit.wikimedia.org/r/597034

Change 597035 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] termbox: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/597035

Change 597036 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] zotero: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/597036

Change 597034 merged by JMeybohm:
[operations/deployment-charts@master] termbox: fix wrong TLS port

https://gerrit.wikimedia.org/r/597034

Change 597230 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] mathoid: switch to common_templates v0.2

https://gerrit.wikimedia.org/r/597230

Change 597230 merged by JMeybohm:
[operations/deployment-charts@master] mathoid: switch to common_templates v0.2

https://gerrit.wikimedia.org/r/597230

Change 597032 merged by JMeybohm:
[operations/deployment-charts@master] mathoid: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/597032

Change 597275 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] mathoid: bump version for fixed template

https://gerrit.wikimedia.org/r/597275

Change 597275 merged by JMeybohm:
[operations/deployment-charts@master] mathoid: bump version for fixed template

https://gerrit.wikimedia.org/r/597275

Change 597303 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] tls_helper: fix the envoy config configmap

https://gerrit.wikimedia.org/r/597303

Change 597303 merged by jenkins-bot:
[operations/deployment-charts@master] tls_helper: fix the envoy config configmap

https://gerrit.wikimedia.org/r/597303

TLS-enabled mathoid is currently deployed in the staging and codfw k8s clusters but not in eqiad. CPU throttling has increased a lot (due to the added envoy throttling multiplied by 30 replicas) and I see a significant increase in p99 compared to the last two days (where it was lower than usual, it seems).

p99 looks normal long term, so I've deployed to eqiad as well now.
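
The comparison made in this comment (a recent p99 judged against a longer-term baseline rather than against an unusually quiet two-day window) could be sketched as follows. This is a minimal illustration only: it assumes p99 refers to request latency, the function names and the 10% tolerance are hypothetical, and it uses the nearest-rank percentile definition, which may differ from what the actual dashboards compute.

```python
# Hypothetical sketch: compare a recent p99 against a longer baseline.
# Samples are plain numbers (for example latencies in milliseconds).

def p99(samples):
    """Nearest-rank 99th percentile of a non-empty sequence of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based nearest rank: ceil(0.99 * n), computed with integers
    # to avoid floating-point rounding surprises.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def p99_regressed(baseline, current, tolerance=0.10):
    """True if the current p99 exceeds the baseline p99 by more than tolerance."""
    return p99(current) > p99(baseline) * (1 + tolerance)


if __name__ == "__main__":
    long_term = list(range(1, 101))          # stand-in for long-term samples
    after_deploy = [x * 2 for x in long_term]  # stand-in for post-deploy samples
    print(p99(long_term), p99(after_deploy))
    print(p99_regressed(long_term, after_deploy))
```

Choosing a long enough baseline matters here: measured against a window where p99 was lower than usual, a normal value can look like a regression.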

Change 598037 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] termbox: switch to common_templates v0.2

https://gerrit.wikimedia.org/r/598037

Change 598037 merged by jenkins-bot:
[operations/deployment-charts@master] termbox: switch to common_templates v0.2

https://gerrit.wikimedia.org/r/598037

Change 597035 merged by jenkins-bot:
[operations/deployment-charts@master] termbox: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/597035

Change 597036 merged by jenkins-bot:
[operations/deployment-charts@master] zotero: enable TLS with chart defaults

https://gerrit.wikimedia.org/r/597036

Change 598435 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] admin: Increase maximum Pod memory to 3Gi

https://gerrit.wikimedia.org/r/598435

Change 598435 merged by jenkins-bot:
[operations/deployment-charts@master] admin: Increase maximum Pod memory to 3Gi

https://gerrit.wikimedia.org/r/598435

JMeybohm updated the task description.
JMeybohm updated the task description. · Jun 22 2020, 7:59 AM
JMeybohm updated the task description. · Jun 22 2020, 8:05 AM
jijiki moved this task from Incoming 🐫 to Unsorted on the serviceops board. · Aug 17 2020, 11:47 PM
JMeybohm moved this task from Unsorted to Next up 🥌 on the serviceops board. · Aug 18 2020, 10:01 AM