jaeger is configured to receive traces from production
Closed, Resolved (Public)

Description

Before we can complete T343302 ("otel collector is configured to send traces to jaeger"), we need to expose the jaeger collector TCP ports (4317 for gRPC and 4318 for HTTP) on the production network.

To this end, the following needs to happen:

  • Test ingress on aux is working as expected: T325178
  • Define jaeger-collector-(http|grpc) (or similar name) service in service::catalog, pointing to k8s-ingress-aux
  • Add DNS for jaeger-collector-(http|grpc)
  • service::catalog and DNS for query?
  • Add certs with proper SANs for jaeger-collector to use on 4317/4318
  • Configure jaeger-collector to serve/use said certs
  • Instruct istio to forward said ports to jaeger-collector
  • Test that traces can be received from the production network and wikikube
# Tests from deploy1002
# For gRPC
./otel-cli exec --verbose --service test-grpc --name "curl wikipedia" --endpoint "jaeger-collector-grpc.svc.eqiad.wmnet:30443" curl https://wikipedia.org
# For HTTP
./otel-cli exec --service test-http --name "curl wikipedia" --endpoint "https://jaeger-collector-http.svc.eqiad.wmnet:30443" curl https://wikipedia.org
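
The two test commands above differ only in the protocol name and the endpoint form, so they can be driven from one small helper. The following is a minimal sketch, not part of any deployed tooling: the function names and the JAEGER_SITE and JAEGER_PORT variables are hypothetical, while the hostnames, the port, and the otel-cli flags are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical helpers; hostnames, port and otel-cli flags mirror the test
# commands in the task description.
JAEGER_SITE="${JAEGER_SITE:-eqiad}"
JAEGER_PORT="${JAEGER_PORT:-30443}"

# Print the otel-cli endpoint for a protocol (grpc or http), using the same
# forms as the manual tests: bare host:port for grpc, https:// URL for http.
collector_endpoint() {
    case "$1" in
        grpc) printf '%s\n' "jaeger-collector-grpc.svc.${JAEGER_SITE}.wmnet:${JAEGER_PORT}" ;;
        http) printf '%s\n' "https://jaeger-collector-http.svc.${JAEGER_SITE}.wmnet:${JAEGER_PORT}" ;;
        *)    echo "unknown protocol: $1" >&2; return 1 ;;
    esac
}

# Run the test command once per protocol; return non-zero as soon as one
# invocation exits non-zero.
send_test_traces() {
    for proto in grpc http; do
        endpoint=$(collector_endpoint "$proto") || return 1
        ./otel-cli exec --service "test-${proto}" --name "curl wikipedia" \
            --endpoint "$endpoint" curl https://wikipedia.org || return 1
    done
}
```

Calling `send_test_traces` from a host on the production network would run both checks in sequence. A zero exit status alone may not prove delivery, so looking for the test-grpc and test-http services in the Jaeger UI remains the real confirmation.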

Details

Related Changes in Gerrit:
Repo                          Branch      Lines +/-
operations/puppet             production  +3 -3
operations/puppet             production  +36 -0
operations/deployment-charts  master      +8 -0
operations/dns                master      +10 -4
operations/deployment-charts  master      +26 -26
operations/deployment-charts  master      +165 -17
operations/puppet             production  +0 -2
labs/private                  master      +0 -3
operations/deployment-charts  master      +1 -1
operations/puppet             production  +4 -2
operations/deployment-charts  master      +0 -4
labs/private                  master      +3 -0
operations/deployment-charts  master      +4 -0
operations/deployment-charts  master      +1 -1
operations/deployment-charts  master      +10 -6
operations/deployment-charts  master      +2 -2
operations/deployment-charts  master      +3 -2
operations/deployment-charts  master      +2 -2
operations/deployment-charts  master      +17 -0
operations/deployment-charts  master      +12 -1
operations/deployment-charts  master      +1 -1
operations/deployment-charts  master      +115 -105
operations/puppet             production  +6 -1
operations/deployment-charts  master      +33 -3
operations/deployment-charts  master      +2 -2
operations/deployment-charts  master      +35 -2
operations/deployment-charts  master      +3 -3
operations/deployment-charts  master      +0 -2
operations/deployment-charts  master      +4 -8
operations/deployment-charts  master      +4 -4

Event Timeline

There are a very large number of changes, so older changes are hidden.

Change 949501 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/deployment-charts@master] istio: clarify instructions to get the istio version

https://gerrit.wikimedia.org/r/949501

Change 949504 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/deployment-charts@master] aux: add tlsHostnames for jaeger collector and query

https://gerrit.wikimedia.org/r/949504

Change 949501 merged by Filippo Giunchedi:

[operations/deployment-charts@master] istio: clarify instructions to get the istio version

https://gerrit.wikimedia.org/r/949501

Change 949504 merged by Filippo Giunchedi:

[operations/deployment-charts@master] aux: add tlsHostnames for jaeger collector and query

https://gerrit.wikimedia.org/r/949504

Change 951438 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Don't skip host verification for connections to ES

https://gerrit.wikimedia.org/r/951438

Change 951438 merged by JMeybohm:

[operations/deployment-charts@master] jaeger: Don't skip host verification for connections to ES

https://gerrit.wikimedia.org/r/951438

Change 951443 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Don't pull images via CDN

https://gerrit.wikimedia.org/r/951443

Change 951450 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Enable TLS for query (UI)

https://gerrit.wikimedia.org/r/951450

Change 951443 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Don't pull images via CDN

https://gerrit.wikimedia.org/r/951443

Change 951450 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Enable TLS for query (UI)

https://gerrit.wikimedia.org/r/951450

Change 951471 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jager: Fix typo in tls.cert name

https://gerrit.wikimedia.org/r/951471

Change 951471 merged by JMeybohm:

[operations/deployment-charts@master] jager: Fix typo in tls.cert name

https://gerrit.wikimedia.org/r/951471

Change 951477 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Enable TLS for the collector as well

https://gerrit.wikimedia.org/r/951477

Change 951477 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Enable TLS for the collector as well

https://gerrit.wikimedia.org/r/951477

Change 951533 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] deployment_server: Add jaeger user to aux-k8s

https://gerrit.wikimedia.org/r/951533

Change 951533 merged by JMeybohm:

[operations/puppet@production] deployment_server: Add jaeger user to aux-k8s

https://gerrit.wikimedia.org/r/951533

Change 952052 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Move jaeger from admin_ng to aux services

https://gerrit.wikimedia.org/r/952052

Change 952052 merged by jenkins-bot:

[operations/deployment-charts@master] Move jaeger from admin_ng to aux services

https://gerrit.wikimedia.org/r/952052

Change 952111 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Fix path to helmfile-defaults secret

https://gerrit.wikimedia.org/r/952111

Change 952111 merged by JMeybohm:

[operations/deployment-charts@master] jaeger: Fix path to helmfile-defaults secret

https://gerrit.wikimedia.org/r/952111

Change 952151 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/puppet@production] hieradata: add jaeger collector to service catalog

https://gerrit.wikimedia.org/r/952151

Change 952152 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jeager: Add networkpolicy support to es-index-cleaner

https://gerrit.wikimedia.org/r/952152

Change 952153 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jeager: Disable creation of service accounts, add networkPolicy

https://gerrit.wikimedia.org/r/952153

Change 952152 merged by jenkins-bot:

[operations/deployment-charts@master] jeager: Add networkpolicy support to es-index-cleaner

https://gerrit.wikimedia.org/r/952152

Change 952153 merged by jenkins-bot:

[operations/deployment-charts@master] jeager: Disable creation of service accounts, add networkPolicy

https://gerrit.wikimedia.org/r/952153

Change 952205 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Fix label selector for es-index-cleaner job

https://gerrit.wikimedia.org/r/952205

Change 952205 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Fix label selector for es-index-cleaner job

https://gerrit.wikimedia.org/r/952205

Change 952211 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Run index cleaner daily not hourly

https://gerrit.wikimedia.org/r/952211

Change 952212 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jeager: Rename default release from jaeger to main

https://gerrit.wikimedia.org/r/952212

Change 952211 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Run index cleaner daily not hourly

https://gerrit.wikimedia.org/r/952211

Change 952212 merged by jenkins-bot:

[operations/deployment-charts@master] jeager: Rename default release from jaeger to main

https://gerrit.wikimedia.org/r/952212

Change 952214 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jeager: Fix secret name (generated by Certificate objects)

https://gerrit.wikimedia.org/r/952214

Change 952214 merged by jenkins-bot:

[operations/deployment-charts@master] jeager: Fix secret name (generated by Certificate objects)

https://gerrit.wikimedia.org/r/952214

Change 952220 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jeager: Temporarily lower the lifetime of TLS certs to 2 days

https://gerrit.wikimedia.org/r/952220

Change 952220 merged by jenkins-bot:

[operations/deployment-charts@master] jeager: Temporarily lower the lifetime of TLS certs to 2 days

https://gerrit.wikimedia.org/r/952220

Change 952231 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Fix typo in secretName

https://gerrit.wikimedia.org/r/952231

Change 952231 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Fix typo in secretName

https://gerrit.wikimedia.org/r/952231

Change 952242 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[labs/private@master] PKI: Rename aux key to match the naming scheme of everything else

https://gerrit.wikimedia.org/r/952242

Change 952243 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] PKI: Rename the aux profile to match the naming scheme

https://gerrit.wikimedia.org/r/952243

Change 952131 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Revert "jeager: Temporarily lower the lifetime of TLS certs to 2 days"

https://gerrit.wikimedia.org/r/952131

Change 952242 merged by JMeybohm:

[labs/private@master] PKI: Rename aux key to match the naming scheme of everything else

https://gerrit.wikimedia.org/r/952242

Change 952246 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] aux: Rename the aux profile to match the naming scheme

https://gerrit.wikimedia.org/r/952246

Change 952131 merged by jenkins-bot:

[operations/deployment-charts@master] Revert "jeager: Temporarily lower the lifetime of TLS certs to 2 days"

https://gerrit.wikimedia.org/r/952131

Change 952243 merged by JMeybohm:

[operations/puppet@production] PKI: Rename the aux profile to match the naming scheme

https://gerrit.wikimedia.org/r/952243

Change 952246 merged by jenkins-bot:

[operations/deployment-charts@master] aux: Rename the aux profile to match the naming scheme

https://gerrit.wikimedia.org/r/952246

Change 952309 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[labs/private@master] PKI: Rename aux key to match the naming scheme of everything else

https://gerrit.wikimedia.org/r/952309

Change 952309 merged by JMeybohm:

[labs/private@master] PKI: Rename aux key to match the naming scheme of everything else

https://gerrit.wikimedia.org/r/952309

Change 953250 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/puppet@production] pki: restore aux default expiration

https://gerrit.wikimedia.org/r/953250

Change 953250 merged by Filippo Giunchedi:

[operations/puppet@production] pki: restore aux default expiration

https://gerrit.wikimedia.org/r/953250

Before we can complete T343302 ("otel collector is configured to send traces to jaeger"), we need to expose the jaeger collector TCP ports (4317 for gRPC and 4318 for HTTP) on the production network.

With the current setup we have only one LVS service, on port tcp/30443, in front of the istio-ingressgateway. Without a second LVS port we can't expose multiple ports per hostname (obviously). If we want to stick with one LVS service, I suggest we differentiate the http and grpc collector endpoints by hostname:

  • jaeger-collector-http.discovery.wmnet
  • jaeger-collector-grpc.discovery.wmnet
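
The suggestion amounts to letting the hostname, rather than the port, select the backend behind the single LVS port. A minimal shell sketch of that mapping follows, for illustration only: the function names are hypothetical, and the actual routing would live in the istio ingress configuration rather than in a script. Ports 4317, 4318 and 30443 come from the task itself. The second helper assumes network access and OpenSSL 1.1.1 or newer.

```shell
#!/bin/sh
# Illustration only: hypothetical helpers, not part of any deployed tooling.

INGRESS_PORT=30443   # the single LVS port in front of istio-ingressgateway

# Print the collector port that a given ingress hostname should be routed to.
collector_port_for_host() {
    case "$1" in
        jaeger-collector-grpc.*) echo 4317 ;;  # grpc collector port
        jaeger-collector-http.*) echo 4318 ;;  # http collector port
        *) echo "no collector route for host: $1" >&2; return 1 ;;
    esac
}

# Print the SANs of the certificate served for a hostname on the shared port.
# The hostname is sent as SNI so the ingress can pick the matching certificate.
show_cert_sans() {
    openssl s_client -connect "$1:${INGRESS_PORT}" -servername "$1" </dev/null 2>/dev/null \
        | openssl x509 -noout -ext subjectAltName
}
```

For example, `collector_port_for_host jaeger-collector-grpc.discovery.wmnet` prints 4317, and `show_cert_sans` with the same hostname could be used to check that the served certificate lists it as a SAN.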

Change 953675 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Configure ingress using istio CRD

https://gerrit.wikimedia.org/r/953675

Change 953675 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Configure ingress using istio CRD

https://gerrit.wikimedia.org/r/953675

Change 953974 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jaeger: Fix networkpolicy (indentation)

https://gerrit.wikimedia.org/r/953974

Change 953974 merged by jenkins-bot:

[operations/deployment-charts@master] jaeger: Fix networkpolicy (indentation)

https://gerrit.wikimedia.org/r/953974

JMeybohm updated the task description.

Change 954218 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/dns@master] wmnet: add jaeger records for ingress

https://gerrit.wikimedia.org/r/954218

Change 954218 merged by Filippo Giunchedi:

[operations/dns@master] wmnet: add jaeger records for ingress

https://gerrit.wikimedia.org/r/954218

Change 954301 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] jeager: Fix GRPC traffic to the collector

https://gerrit.wikimedia.org/r/954301

Change 954301 merged by jenkins-bot:

[operations/deployment-charts@master] jeager: Fix GRPC traffic to the collector

https://gerrit.wikimedia.org/r/954301

JMeybohm updated the task description.

Change 952151 merged by Filippo Giunchedi:

[operations/puppet@production] hieradata: add jaeger collector to service catalog

https://gerrit.wikimedia.org/r/952151

Change 954705 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/puppet@production] hieradata: set jaeger components services to production

https://gerrit.wikimedia.org/r/954705

Change 954705 merged by Filippo Giunchedi:

[operations/puppet@production] hieradata: set jaeger components services to production

https://gerrit.wikimedia.org/r/954705