Not all services we run here propagate tracing headers from incoming requests to outgoing ones. We've manually identified and fixed a few of these, as in T371129.
As a deliverable for the MVP it'd be nice to have an idea of the extent of this, and to be able to file tasks against the services where adding propagation is easiest and/or most important.
In the long run it'd be good to keep tabs on this so we can work towards increasing tracing "coverage".
One likely strong signal for failure to propagate headers, using data we're already collecting:
- Find traces where the root span attribute upstream_cluster.name does not match either LOCAL_.* or local_service
- This means that the Envoy service mesh sidecar of some other service received a request from its application, destined for another service, with no tracing context attached. In theory, this should be either healthchecks (which we should filter out) or uninstrumented user traffic
- The application sending the traffic can be found in the process-level information of that span -- k8s.namespace.name, k8s.pod.name etc.
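
A rough sketch of the filter described above, in Python. The span shape here (plain dicts with `parent_id`, `attributes` and `process` keys) is an assumption made for illustration and would need adapting to whatever the trace backend actually exports. The `is_healthcheck` hook is likewise hypothetical, since how to recognise healthchecks isn't settled yet. Only the attribute names and the `LOCAL_.*` / `local_service` patterns come from the list above.

```python
import re
from collections import Counter

# Cluster names that mean the sidecar was handling local/inbound traffic.
LOCAL_CLUSTER = re.compile(r"LOCAL_.*|local_service")


def suspect_root_spans(spans, is_healthcheck=lambda span: False):
    """Yield root spans whose upstream_cluster.name is not a local cluster.

    `spans` is an iterable of dicts (assumed shape, see lead-in).
    `is_healthcheck` is a hypothetical predicate used to drop healthchecks.
    """
    for span in spans:
        if span.get("parent_id"):
            continue  # has a parent, so not a root span
        cluster = span.get("attributes", {}).get("upstream_cluster.name")
        if cluster is None or LOCAL_CLUSTER.fullmatch(cluster):
            continue  # no cluster recorded, or local traffic
        if is_healthcheck(span):
            continue
        yield span


def count_by_namespace(spans):
    """Count suspect spans per k8s.namespace.name of the sending application."""
    return Counter(
        span.get("process", {}).get("k8s.namespace.name", "unknown")
        for span in spans
    )
```

Feeding the output of `suspect_root_spans` into `count_by_namespace` would give a per-namespace tally, which could serve as a first ranking of which services to file tasks against.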