Alter changeprop chart to use the service mesh
Closed, Declined (Public)

Description

The changeprop chart bundles the service mesh, but is not using it to reach outside services.
This has the following consequences:

  • No Envoy telemetry for either the changeprop or the changeprop-jobqueue service
  • No progressive migration path to mw-api-int for changeprop

Direct discovery domains are declared in the changeprop configuration:

https://api-rw.discovery.wmnet -> mwapi-async -> http://localhost:6500
https://eventgate-main.discovery.wmnet:4492 -> eventgate-main -> http://localhost:6005
https://inference.discovery.wmnet:30443 -> inference -> http://localhost:6031
https://restbase-async.discovery.wmnet -> restbase-for-services -> http://localhost:6503
http://staging.svc.eqiad.wmnet:34192 -> eventgate-main-staging, doesn't have a service definition
https://inference-staging.svc.codfw.wmnet:30443 -> inference-staging, to be created -> http://localhost:6031
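
As a rough illustration of what the change would amount to for changeprop, the mapping above can be written as a lookup from each direct origin to its local listener. This is a minimal sketch, not changeprop code: the `LISTENERS` table and the `to_mesh_uri` helper are hypothetical names, and only the URIs and ports come from the list above. eventgate-main-staging is left out because it has no service definition.

```python
from urllib.parse import urlsplit, urlunsplit

# Direct origin -> local service mesh listener, copied from the list above.
# eventgate-main-staging is omitted: it has no service definition.
LISTENERS = {
    "https://api-rw.discovery.wmnet": "http://localhost:6500",
    "https://eventgate-main.discovery.wmnet:4492": "http://localhost:6005",
    "https://inference.discovery.wmnet:30443": "http://localhost:6031",
    "https://restbase-async.discovery.wmnet": "http://localhost:6503",
    "https://inference-staging.svc.codfw.wmnet:30443": "http://localhost:6031",
}


def to_mesh_uri(uri: str) -> str:
    """Rewrite a direct URI to its local listener, keeping path and query.

    URIs whose origin has no listener are returned unchanged.
    """
    parts = urlsplit(uri)
    origin = f"{parts.scheme}://{parts.netloc}"
    local = LISTENERS.get(origin)
    if local is None:
        return uri
    local_parts = urlsplit(local)
    return urlunsplit(
        (local_parts.scheme, local_parts.netloc, parts.path, parts.query, parts.fragment)
    )
```

For example, `to_mesh_uri("https://api-rw.discovery.wmnet/w/api.php")` would give `http://localhost:6500/w/api.php`, while a URI on an unmapped host passes through untouched.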

Two direct discovery domains are declared in the changeprop-jobqueue configuration:

https://mw-jobrunner.discovery.wmnet:4448 -> no listener yet, create one
https://videoscaler.discovery.wmnet -> no listener yet, create one

Proposed steps:

  1. Create the missing listeners
  2. Check that the chart includes all modules needed for the service mesh
  3. Change the defined URIs to use the service mesh listeners
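
Step 1 can be sketched as a simple check over the declared URIs: anything without a listener port still needs one. This is an illustrative sketch only. `DECLARED` and `missing_listeners` are hypothetical names, the ports are taken from the lists above, and `None` marks entries described there as having no listener or no service definition.

```python
from typing import Dict, List, Optional

# Declared URI -> local listener port, or None where the lists above say
# there is no listener (or no service definition) yet.
DECLARED: Dict[str, Optional[int]] = {
    "https://api-rw.discovery.wmnet": 6500,
    "https://eventgate-main.discovery.wmnet:4492": 6005,
    "https://inference.discovery.wmnet:30443": 6031,
    "https://restbase-async.discovery.wmnet": 6503,
    "http://staging.svc.eqiad.wmnet:34192": None,
    "https://inference-staging.svc.codfw.wmnet:30443": 6031,
    "https://mw-jobrunner.discovery.wmnet:4448": None,
    "https://videoscaler.discovery.wmnet": None,
}


def missing_listeners(declared: Dict[str, Optional[int]]) -> List[str]:
    """Return the URIs that have no service mesh listener, sorted."""
    return sorted(uri for uri, port in declared.items() if port is None)


if __name__ == "__main__":
    for uri in missing_listeners(DECLARED):
        print(f"needs a listener: {uri}")
```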

Event Timeline

Clement_Goubert created this task.

Change 1013300 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/puppet@production] envoy: Add mw-jobrunner and videoscaler listeners

https://gerrit.wikimedia.org/r/1013300

There are a few reasons why we didn't migrate changeprop to use the service mesh, first of all the fact that we don't want to define timeouts outside of it.

In general, changeprop manages connection pools and concurrency itself, so adding a layer that could interfere with that always looked like a bad idea.

That makes sense. I don't necessarily have a problem with it not using the service mesh, apart from the lack of telemetry and the fact that it means migrating it to mw-api-int as a backend in one go.

Abandoned because the internals of changeprop make it inadvisable to add another layer. I'll create another task for its migration to mw-api-int.

Change #1013300 merged by Clément Goubert:

[operations/puppet@production] envoy: Add missing service mesh listeners

https://gerrit.wikimedia.org/r/1013300