
Investigate collecting request metrics using service mesh
Closed, Resolved (Public)

Description

What we are specifically looking for is "observability" provided by some tooling; in particular, we would like metrics on the number and latency of requests. A common solution appears to be a service mesh, since this functionality usually comes bundled with one.

Some examples of service meshes to investigate are:

  • Istio
  • Linkerd

A broader tool to look at might be Cilium.

For an overview of the available tools, it may be worth looking at the CNCF landscape, particularly the Service Mesh and Cloud Native Network sections.

You can also look at how the WMF has rolled its own observability using an Envoy sidecar pattern:

Event Timeline

Collecting the requirements we came up with in today's meeting:

  • It’s possible to install with our current stack (i.e. Helm + Terraform)
  • Should be able to uninstall it easily
  • We don’t need to change our infrastructure for it
  • Gives us Prometheus metrics out of the box
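
To make the last requirement more concrete, here is a minimal sketch of what request count and latency metrics look like in the Prometheus text exposition format. It only illustrates the shape of the data; a mesh or sidecar proxy would export something similar without any application code. The metric name `request_duration_seconds`, the `service` and `code` labels, and the bucket bounds are all hypothetical and not taken from any of the tools listed above.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical bucket upper bounds in seconds; real proxies ship their own defaults.
BUCKETS = (0.05, 0.1, 0.5, 1.0)


class RequestMetrics:
    """Tracks request count and latency per (service, status code)."""

    def __init__(self):
        # One slot per finite bucket plus a final slot for +Inf.
        self.bucket_counts = defaultdict(lambda: [0] * (len(BUCKETS) + 1))
        self.latency_sum = defaultdict(float)

    def observe(self, service, code, seconds):
        key = (service, code)
        # bisect_left finds the first bucket whose upper bound is >= seconds.
        self.bucket_counts[key][bisect_left(BUCKETS, seconds)] += 1
        self.latency_sum[key] += seconds

    def render(self):
        """Return the metrics in Prometheus text exposition format."""
        lines = ["# TYPE request_duration_seconds histogram"]
        for (service, code), counts in sorted(self.bucket_counts.items()):
            labels = f'service="{service}",code="{code}"'
            cumulative = 0
            for bound, count in zip(BUCKETS + (float("inf"),), counts):
                cumulative += count  # histogram buckets are cumulative
                le = "+Inf" if bound == float("inf") else str(bound)
                lines.append(
                    f'request_duration_seconds_bucket{{{labels},le="{le}"}} {cumulative}'
                )
            lines.append(
                f"request_duration_seconds_sum{{{labels}}} {self.latency_sum[(service, code)]}"
            )
            lines.append(f"request_duration_seconds_count{{{labels}}} {cumulative}")
        return "\n".join(lines) + "\n"
```

Calling `observe("api", 200, 0.03)` for each request and serving `render()` on a `/metrics` endpoint would give Prometheus both the request count (the `_count` series) and the latency distribution (the `_bucket` series) per service and status code.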

There's a working POC here: https://github.com/wmde/wbaas-deploy/tree/fr/envoy-es-sidecar. I will draft an ADR based on it next, so I am closing this ticket.

Fring removed Fring as the assignee of this task. Mar 1 2023, 4:43 PM