JMeybohm
User

User Details

User Since
Apr 2 2020, 9:01 AM (13 w, 6 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
JMeybohm (WMF) [ Global Accounts ]

Recent Activity

Yesterday

JMeybohm created T257408: Write a script to prune old chart versions/charts from chartmuseum.
Wed, Jul 8, 8:26 AM · serviceops, Prod-Kubernetes, Kubernetes

Tue, Jul 7

JMeybohm added a comment to T255835: Make pipelinelib able to update deployment chart image tags.

That sounds nice!
I would suggest updating the image version in the helmfile.d values (e.g. https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/staging/blubberoid/values.yaml#33) instead of the chart itself, though. In general, a new chart release is (or at least should be) needed only when substantial changes have been made to the containerized application (changes that alter how the container is deployed or run, not changes to what runs inside the container).

Tue, Jul 7, 8:57 PM · Release-Engineering-Team-TODO (Release-Engineering-Team-TODO (2020-07-01 to 2020-09-30 (Q1))), Release Pipeline, Release-Engineering-Team (Pipeline)
JMeybohm created T257333: CI pipeline/job to build and release helm chart artifacts.
Tue, Jul 7, 3:55 PM · Release-Engineering-Team, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm moved T253843: Move helm chart repository out of git from Next up to Doing on the serviceops board.
Tue, Jul 7, 2:50 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm added a subtask for T256877: Handle sunset of stretch-backports: T257327: Rebuild wikimedia-stretch for repository updates.
Tue, Jul 7, 2:48 PM · Patch-For-Review, Operations
JMeybohm added a parent task for T257327: Rebuild wikimedia-stretch for repository updates: T256877: Handle sunset of stretch-backports.
Tue, Jul 7, 2:48 PM · Patch-For-Review, serviceops
JMeybohm created T257327: Rebuild wikimedia-stretch for repository updates.
Tue, Jul 7, 2:47 PM · Patch-For-Review, serviceops
JMeybohm updated subscribers of T255877: Move proton to use TLS only.

Proton already has a TLS LVS via fbee4a768b10e7e405e6ecf64ace2062004c5c36 / T225680

Tue, Jul 7, 7:36 AM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Operations

Mon, Jul 6

JMeybohm updated the task description for T253843: Move helm chart repository out of git.
Mon, Jul 6, 4:18 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm committed rLPRIde198ee862a0: Fix chartmuseum hiera path (common/profile -> role/common) (authored by JMeybohm).
Fix chartmuseum hiera path (common/profile -> role/common)
Mon, Jul 6, 1:47 PM
JMeybohm committed rLPRI2bd6a5247ec0: Add fake secrets for chartmuseum (authored by JMeybohm).
Add fake secrets for chartmuseum
Mon, Jul 6, 12:57 PM
JMeybohm closed T256970: Site: eqiad/codwf each 1 VM for helm-charts.wikimedia.org (chartmuseum), a subtask of T253843: Move helm chart repository out of git, as Resolved.
Mon, Jul 6, 10:28 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm closed T256970: Site: eqiad/codwf each 1 VM for helm-charts.wikimedia.org (chartmuseum) as Resolved.

VMs created and installed; puppet ran successfully with the insetup role. Both came up fine after reboot.

Mon, Jul 6, 10:28 AM · Patch-For-Review, vm-requests, Operations

Fri, Jul 3

JMeybohm committed rLPRId3a1da9e320b: secret: add dummy key for helm-charts (chartmuseum) (authored by JMeybohm).
secret: add dummy key for helm-charts (chartmuseum)
Fri, Jul 3, 7:54 AM

Thu, Jul 2

JMeybohm created T256970: Site: eqiad/codwf each 1 VM for helm-charts.wikimedia.org (chartmuseum).
Thu, Jul 2, 12:32 PM · Patch-For-Review, vm-requests, Operations
JMeybohm closed T253396: Upgrade all TLS enabled charts to v0.2 tls_helper, a subtask of T235411: Add TLS termination to services running on kubernetes, as Resolved.
Thu, Jul 2, 8:18 AM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm closed T253396: Upgrade all TLS enabled charts to v0.2 tls_helper as Resolved.

This is resolved now. For anyone passing along, please see: https://wikitech.wikimedia.org/wiki/Docker#Deleting_an_image_(from_registry) and T242604

Thu, Jul 2, 8:18 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

Wed, Jul 1

JMeybohm updated the task description for T242604: Remove obsoleted docker images.
Wed, Jul 1, 2:35 PM · Patch-For-Review, User-brennen, Operations, Release Pipeline, Release-Engineering-Team, serviceops
JMeybohm assigned T256726: redis for docker-registry should have maxmemory-policy set to allkeys-lru to akosiaris.
Wed, Jul 1, 2:20 PM · serviceops, Prod-Kubernetes, Kubernetes, Operations
JMeybohm added a comment to T242604: Remove obsoleted docker images.

Unfortunately, removing all tags of an image (i.e. of a repository) does not remove the repository itself from the registry[1][2]. This means the "image" will still be listed in the catalog (GET /v2/_catalog).
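
A minimal sketch of how such leftover repositories could be spotted, assuming the response shapes of the Docker Registry HTTP API V2: `{"repositories": [...]}` for the catalog and `{"name": ..., "tags": [...]}` for a tag list, where `tags` may be `null` once every tag is gone. The function name and the sample data are hypothetical, and no registry is contacted.

```python
from typing import Dict, List


def repositories_without_tags(
    catalog: Dict[str, List[str]],
    tag_lists: Dict[str, dict],
) -> List[str]:
    """Return repositories listed in the catalog that have no tags left.

    catalog   -- parsed body of GET /v2/_catalog
    tag_lists -- parsed bodies of GET /v2/<name>/tags/list, keyed by repository
    """
    empty = []
    for name in catalog.get("repositories", []):
        # A missing entry, "tags": null and "tags": [] all count as "no tags".
        tags = (tag_lists.get(name) or {}).get("tags")
        if not tags:
            empty.append(name)
    return empty


# Hypothetical sample data, shaped like the registry responses.
sample_catalog = {"repositories": ["envoy-tls-local-proxy", "wikimedia/blubberoid"]}
sample_tag_lists = {
    "envoy-tls-local-proxy": {"name": "envoy-tls-local-proxy", "tags": None},
    "wikimedia/blubberoid": {"name": "wikimedia/blubberoid", "tags": ["v1"]},
}
print(repositories_without_tags(sample_catalog, sample_tag_lists))
```

Working on already-parsed responses keeps the check independent of how the registry is queried or authenticated.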

Wed, Jul 1, 2:06 PM · Patch-For-Review, User-brennen, Operations, Release Pipeline, Release-Engineering-Team, serviceops
JMeybohm closed T256786: kubernetes unable to pull images from registry as Resolved.

Did a rolling restart on all affected nodes; we should be fine now. Sorry for the inconvenience and thanks a lot for the reports, @jeena and @Mholloway!

Wed, Jul 1, 11:11 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm added a comment to T256786: kubernetes unable to pull images from registry.

This is the old Puppet CA that some docker daemons still have loaded.
Unfortunately a docker reload does not reload the CA, so we need to do a docker restart on: kubernetes[2001-2004].codfw.wmnet,kubernetes[1001-1004].eqiad.wmnet. Newer Kubernetes nodes already started with the updated CA and are fine.

Wed, Jul 1, 9:41 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm raised the priority of T256786: kubernetes unable to pull images from registry from High to Unbreak Now!.

Raising prio as we do have the same situation on prod clusters.

Wed, Jul 1, 8:44 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm added a comment to T256786: kubernetes unable to pull images from registry.

It's only docker that is totally sure that the certificate is not valid, so I guess it does not reload ca-certificates (even on SIGHUP).

Wed, Jul 1, 8:41 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm added a comment to T256786: kubernetes unable to pull images from registry.

Still getting ErrImagePull in kubectl get events:

73s         Normal    Pulling             Pod          pulling image "docker-registry.discovery.wmnet/wikimedia/mediawiki-services-mobileapps:2020-06-29-163540-production"
73s         Warning   Failed              Pod          Failed to pull image "docker-registry.discovery.wmnet/wikimedia/mediawiki-services-mobileapps:2020-06-29-163540-production": rpc error: code = Unknown desc = Error response from daemon: Get https://docker-registry.discovery.wmnet/v1/_ping: x509: certificate has expired or is not yet valid
73s         Warning   Failed              Pod          Error: ErrImagePull
Wed, Jul 1, 8:22 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm renamed T256786: kubernetes unable to pull images from registry from mobileapps kubernetes deployment is timing out to kubernetes unable to pull images from registry.
Wed, Jul 1, 8:17 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm triaged T256786: kubernetes unable to pull images from registry as High priority.
Wed, Jul 1, 8:12 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm merged task T256796: Deploying blubberoid to staging fails to pull docker image into T256786: kubernetes unable to pull images from registry.
Wed, Jul 1, 8:12 AM · Prod-Kubernetes, Kubernetes
JMeybohm merged T256796: Deploying blubberoid to staging fails to pull docker image into T256786: kubernetes unable to pull images from registry.
Wed, Jul 1, 8:12 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog
JMeybohm added a project to T256786: kubernetes unable to pull images from registry: Kubernetes.
Wed, Jul 1, 8:09 AM · Prod-Kubernetes, Kubernetes, serviceops, Page Content Service, Product-Infrastructure-Team-Backlog

Tue, Jun 30

JMeybohm created T256762: Fix nginx config and caching for docker registry.
Tue, Jun 30, 2:47 PM · serviceops, Kubernetes, Operations
JMeybohm reopened T253396: Upgrade all TLS enabled charts to v0.2 tls_helper, a subtask of T235411: Add TLS termination to services running on kubernetes, as Open.
Tue, Jun 30, 11:33 AM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm reopened T253396: Upgrade all TLS enabled charts to v0.2 tls_helper as "Open".

This led to failing docker-reporter-base-images.service on deneb. I'm definitely missing something here...

Tue, Jun 30, 11:33 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm triaged T256726: redis for docker-registry should have maxmemory-policy set to allkeys-lru as Low priority.
Tue, Jun 30, 10:01 AM · serviceops, Prod-Kubernetes, Kubernetes, Operations
JMeybohm closed T253396: Upgrade all TLS enabled charts to v0.2 tls_helper, a subtask of T235411: Add TLS termination to services running on kubernetes, as Resolved.
Tue, Jun 30, 9:42 AM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm closed T253396: Upgrade all TLS enabled charts to v0.2 tls_helper as Resolved.

It seems to be required to fetch the tag list once while bypassing the caches to have the remaining references removed:
curl https://docker-registry.wikimedia.org/v2/envoy-tls-local-proxy/tags/list?x=y
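
The `?x=y` suffix presumably works because the caching layer keys on the full URL including the query string; that is an assumption, as the comment does not say. Under that assumption, a small sketch of building such a URL (the helper name `with_cache_buster` is hypothetical):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(url: str, key: str = "x", value: str = "y") -> str:
    """Append a throwaway query parameter so a URL-keyed cache sees a new URL."""
    parts = urlsplit(url)
    # Keep any query parameters that are already present.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_cache_buster(
    "https://docker-registry.wikimedia.org/v2/envoy-tls-local-proxy/tags/list"
))
```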

Tue, Jun 30, 9:42 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm created T256726: redis for docker-registry should have maxmemory-policy set to allkeys-lru.
Tue, Jun 30, 9:04 AM · serviceops, Prod-Kubernetes, Kubernetes, Operations

Mon, Jun 29

JMeybohm added a comment to T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.

I tried to delete the tags/image with the process described here[1], but unfortunately the tags can still be pulled after a successful DELETE (another DELETE even returns HTTP 404).
I guess a garbage-collection[2] run is needed to actually remove the tags from the registry. I tried that (--dry-run) on registry2001, where it has been running for 5 hours and is still going. According to the output (which seems to go over all images in alphabetical order) it has reached the last image, but it's still doing a lot of swift requests, so it's probably not stuck...

Mon, Jun 29, 5:20 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

Fri, Jun 26

JMeybohm added a comment to T253843: Move helm chart repository out of git.

Turns out our swift cluster only supports Swift V1 Auth, which ChartMuseum does not. I've tried the S3 API as well, but that only supports "v2 signatures", which ChartMuseum ... does not (because the official aws-sdk-go only supports v4 signatures).

Fri, Jun 26, 3:31 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm closed T256020: Access to the thanos-swift cluster for ChartMuseum, a subtask of T253843: Move helm chart repository out of git, as Resolved.
Fri, Jun 26, 3:25 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm closed T256020: Access to the thanos-swift cluster for ChartMuseum as Resolved.

This is done and the account is working, thanks @fgiunchedi !

Fri, Jun 26, 3:25 PM · SRE-swift-storage, Operations, serviceops
JMeybohm created T256459: Increased "Allowed memory size exhausted" exceptions from MediaWiki since 2020-06-25 ~16:00.
Fri, Jun 26, 9:38 AM · Patch-For-Review, Parsing-Team, Core Platform Team, Performance-Team, serviceops, Operations
JMeybohm created T256449: cp5006 multiple alerts (and SSH flapping).
Fri, Jun 26, 7:43 AM · Traffic, Operations, ops-eqsin

Wed, Jun 24

JMeybohm committed rLPRI3f7f7e1f91f0: thanos::swift add chartmuseum account key (authored by JMeybohm).
thanos::swift add chartmuseum account key
Wed, Jun 24, 2:53 PM
JMeybohm added a comment to T256256: Raise an alarm on container restarts/OOMs in kubernetes.

That could help but the alert should always be actionable. For that to happen the owner needs to acknowledge the need for it, which might not happen at the same time for all services.

Wed, Jun 24, 2:25 PM · Sustainability (Incident Prevention), serviceops, Kubernetes, ChangeProp
JMeybohm added a comment to T256256: Raise an alarm on container restarts/OOMs in kubernetes.

With kube-state-metrics (sorry for me repeating this over and over 😂 ) there is kube_pod_container_status_restarts_total and kube_pod_container_status_last_terminated_reason which can be used to detect OOM on containers.
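
A rough sketch of how the two metrics could be combined, assuming each sample carries the `namespace`, `pod` and `container` labels, that `kube_pod_container_status_last_terminated_reason` additionally carries a `reason` label, and that a value of 1 marks the applicable reason. The function name and the sample values are hypothetical; the samples are assumed to be already fetched and parsed.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, str], float]  # (labels, value) of one metric sample


def oom_killed_containers(
    last_terminated_reason: List[Sample],
    restarts_total: List[Sample],
) -> List[Tuple[str, str, str, int]]:
    """Return (namespace, pod, container, restarts) for OOM-killed containers."""
    # Index restart counters by container identity for quick lookup.
    restarts = {
        (labels["namespace"], labels["pod"], labels["container"]): int(value)
        for labels, value in restarts_total
    }
    result = []
    for labels, value in last_terminated_reason:
        if labels.get("reason") == "OOMKilled" and value == 1:
            key = (labels["namespace"], labels["pod"], labels["container"])
            result.append((*key, restarts.get(key, 0)))
    return sorted(result)


# Hypothetical samples for illustration only.
reasons = [
    ({"namespace": "changeprop", "pod": "changeprop-abc",
      "container": "changeprop", "reason": "OOMKilled"}, 1.0),
    ({"namespace": "mathoid", "pod": "mathoid-xyz",
      "container": "mathoid", "reason": "Completed"}, 1.0),
]
restart_counts = [
    ({"namespace": "changeprop", "pod": "changeprop-abc",
      "container": "changeprop"}, 3.0),
]
print(oom_killed_containers(reasons, restart_counts))
```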

Wed, Jun 24, 1:50 PM · Sustainability (Incident Prevention), serviceops, Kubernetes, ChangeProp
JMeybohm edited P11638 smaller_gerritbot_comments.js.
Wed, Jun 24, 12:40 PM · JavaScript, Phabricator
JMeybohm added a comment to T256020: Access to the thanos-swift cluster for ChartMuseum.

Commit in private is e427c266f2d6ac0a937bf5d972b759933a9f9a18

Wed, Jun 24, 11:16 AM · SRE-swift-storage, Operations, serviceops
JMeybohm added a comment to P11638 smaller_gerritbot_comments.js.

I seem unable to screenshot the tooltip, but it contains the repo name and the commit message.

Wed, Jun 24, 6:23 AM · JavaScript, Phabricator
Majavah awarded P11638 smaller_gerritbot_comments.js a Barnstar token.
Wed, Jun 24, 6:22 AM · JavaScript, Phabricator

Tue, Jun 23

JMeybohm added a comment to T256020: Access to the thanos-swift cluster for ChartMuseum.

We don't expect private data in the charts at all.
In addition, they are already publicly accessible via https://releases.wikimedia.org/charts/ and https://gerrit.wikimedia.org/g/operations/deployment-charts ofc.

Tue, Jun 23, 8:01 AM · SRE-swift-storage, Operations, serviceops
JMeybohm edited P11638 smaller_gerritbot_comments.js.
Tue, Jun 23, 7:55 AM · JavaScript, Phabricator
JMeybohm created P11638 smaller_gerritbot_comments.js.
Tue, Jun 23, 7:55 AM · JavaScript, Phabricator
JMeybohm added a comment to T255975: Investigate the iowait issues plaguing kubernetes nodes since 2020-05-29.

Thanks for writing this up @akosiaris! I think it would be nice to have the follow-up tasks linked here, like the removal of the service-runner and splitting up changeprop into multiple deployments (one per topic?).
Maybe we should also add a follow-up to alert/warn on OOM kills / container restarts?

Tue, Jun 23, 7:39 AM · Sustainability (Incident Prevention), serviceops, Kubernetes, ChangeProp

Mon, Jun 22

JMeybohm created T256020: Access to the thanos-swift cluster for ChartMuseum.
Mon, Jun 22, 4:11 PM · SRE-swift-storage, Operations, serviceops
JMeybohm updated subscribers of T253843: Move helm chart repository out of git.

I need to make decisions regarding TLS and storage:

Mon, Jun 22, 1:31 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm updated the task description for T235411: Add TLS termination to services running on kubernetes.
Mon, Jun 22, 8:06 AM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm updated the task description for T235411: Add TLS termination to services running on kubernetes.
Mon, Jun 22, 7:59 AM · Prod-Kubernetes, Kubernetes, serviceops, Operations

Fri, Jun 19

JMeybohm placed T255879: Move cxserver to use TLS only up for grabs.
Fri, Jun 19, 3:50 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm updated subscribers of T255879: Move cxserver to use TLS only.

@Joe I think cxserver is missing the last two steps as well, correct?

Fri, Jun 19, 3:47 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255879: Move cxserver to use TLS only.
Fri, Jun 19, 3:46 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255878: Move wikifeeds to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255877: Move proton to use TLS only.
Fri, Jun 19, 3:42 PM · Patch-For-Review, Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255876: Move mobileapps to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255875: Move mathoid to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255874: Move eventstreams to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255873: Move eventgate-main to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255872: Move eventgate-logging-external to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255871: Move eventgate-analytics-external to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255870: Move eventgate-analytics to use TLS only.
Fri, Jun 19, 3:42 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255869: Move zotero to use TLS only.
Fri, Jun 19, 3:35 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations
JMeybohm created T255868: Move citoid to use TLS only.
Fri, Jun 19, 3:34 PM · Prod-Kubernetes, Kubernetes, serviceops, Operations

Wed, Jun 17

Ladsgroup awarded T251305: Migrate to helm v3 a Like token.
Wed, Jun 17, 10:55 AM · Patch-For-Review, Kubernetes, serviceops, Operations

Tue, Jun 16

JMeybohm added a comment to T255410: Termbox SSR connection terminated very often.

@Michael thanks for writing this up!

Tue, Jun 16, 7:56 AM · wikidata-tech-focus, Wikidata, Wikidata-Termbox

Thu, Jun 11

JMeybohm added a comment to T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.

All clusters clean from envoy-tls-local-proxy image!

Thu, Jun 11, 3:23 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm updated the task description for T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.
Thu, Jun 11, 1:10 PM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm claimed T253843: Move helm chart repository out of git.
Thu, Jun 11, 7:59 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

Wed, Jun 10

JMeybohm updated the task description for T254581: Move termbox to use TLS only.
Wed, Jun 10, 10:59 AM · Prod-Kubernetes, Kubernetes, serviceops
JMeybohm updated the task description for T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.
Wed, Jun 10, 9:02 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

Tue, Jun 9

JMeybohm updated the task description for T254581: Move termbox to use TLS only.
Tue, Jun 9, 4:08 PM · Prod-Kubernetes, Kubernetes, serviceops
JMeybohm moved T254581: Move termbox to use TLS only from Backlog to Doing on the serviceops board.
Tue, Jun 9, 3:37 PM · Prod-Kubernetes, Kubernetes, serviceops

Jun 5 2020

JMeybohm created T254581: Move termbox to use TLS only.
Jun 5 2020, 1:51 PM · Prod-Kubernetes, Kubernetes, serviceops
JMeybohm closed T254479: (Re) add wmf.chartid as label to all kubernetes objects as Resolved.

Added everywhere except eventstream and eventgate.

Jun 5 2020, 1:41 PM · serviceops, Prod-Kubernetes, Kubernetes
JMeybohm added a comment to T218733: Migrate mobileapps to k8s and node 10.

Oh, my bad. Then we'll create them for you ofc.
Unfortunately starting with TLS right away would not permit the gradual traffic shift Alex was suggesting so it's probably better to start without and migrate to TLS in a second step. :-/

Jun 5 2020, 8:23 AM · serviceops, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban), Page Content Service, Mobile-Content-Service
JMeybohm moved T254479: (Re) add wmf.chartid as label to all kubernetes objects from Backlog to Doing on the serviceops board.
Jun 5 2020, 6:35 AM · serviceops, Prod-Kubernetes, Kubernetes
JMeybohm moved T253396: Upgrade all TLS enabled charts to v0.2 tls_helper from Backlog to Doing on the serviceops board.
Jun 5 2020, 6:35 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

Jun 4 2020

JMeybohm added a comment to T218733: Migrate mobileapps to k8s and node 10.

If you want to start with TLS (via envoy) right away (which would be great!), you need to go through the extra steps of generating certificates (current document draft at https://wikitech.wikimedia.org/wiki/User:Giuseppe_Lavagetto/Add_Tls_On_Kubernetes) and "register" a TCP port at https://wikitech.wikimedia.org/wiki/Service_ports

Jun 4 2020, 3:29 PM · serviceops, Patch-For-Review, Product-Infrastructure-Team-Backlog (Kanban), Page Content Service, Mobile-Content-Service
JMeybohm created T254479: (Re) add wmf.chartid as label to all kubernetes objects.
Jun 4 2020, 2:15 PM · serviceops, Prod-Kubernetes, Kubernetes
JMeybohm updated the task description for T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.
Jun 4 2020, 9:45 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm updated the task description for T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.
Jun 4 2020, 9:22 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

Jun 3 2020

JMeybohm added a comment to T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.

And I now see T242861, so please ignore what I said (or at least what I was suggesting).
I'll evaluate the route of merging the common_templates v0.2 changes into eventgate/eventstream forks instead to not have this blocked.

Jun 3 2020, 10:08 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm added a comment to T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.

Oh. I see that the current canary setup will not work with my suggestions and, as I see it, there is currently no way to do it with the default scaffold/templates.

Jun 3 2020, 8:58 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm updated subscribers of T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.

So eventgate and eventstream use forked tls_helpers (currently even the forks slightly differ).

Jun 3 2020, 7:37 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

May 29 2020

JMeybohm updated the task description for T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.
May 29 2020, 11:23 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

May 28 2020

JMeybohm added a comment to T247388: Create a grafana dashboard to monitor services proxied via envoy.

Just to have the reference here. I guess it's: https://grafana.wikimedia.org/d/VTCkm29Wz/envoy-telemetry

May 28 2020, 10:41 AM · MediaWiki-General, Operations, serviceops, Service-Architecture
JMeybohm created T253843: Move helm chart repository out of git.
May 28 2020, 10:27 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes

May 27 2020

JMeybohm closed T252428: Make helm upgrades atomic as Resolved.

Tiller has been updated in all clusters and namespaces, so this is resolved now.

May 27 2020, 5:57 PM · Prod-Kubernetes, Patch-For-Review, serviceops, Kubernetes
JMeybohm updated the task description for T252428: Make helm upgrades atomic.
May 27 2020, 5:57 PM · Prod-Kubernetes, Patch-For-Review, serviceops, Kubernetes
JMeybohm added a parent task for T253396: Upgrade all TLS enabled charts to v0.2 tls_helper: T235411: Add TLS termination to services running on kubernetes.
May 27 2020, 8:04 AM · Patch-For-Review, serviceops, Prod-Kubernetes, Kubernetes
JMeybohm added a subtask for T235411: Add TLS termination to services running on kubernetes: T253396: Upgrade all TLS enabled charts to v0.2 tls_helper.
May 27 2020, 8:04 AM · Prod-Kubernetes, Kubernetes, serviceops, Operations