
Validate end-to-end OpenTelemetry (custom) span delivery in AW k8s services
Closed, Resolved (Public)

Description

What/Why:
Custom spans were implemented sometime last year (2025) to emit granular spans at specific function calls within a trace. They are gated behind the config flag createCustomSpans for both the Evaluator and the Orchestrator.
We turned this feature off temporarily while investigating service slowness and memory exhaustion. Now that SRE is looking into further support on service-utils (with AW as the first team to guinea-pig it), we want to turn it back on and verify that these spans are coming through correctly.

How:
Locate all of the functions that call the custom span creation logic, possibly remove it from a couple of heavy-weight functions(?), and set createCustomSpans to true in our deploy config.

Current areas that trigger custom span creation
in Orchestrator:

  • evaluate.js
  • execute.js
  • fetchObject.js
  • implementation.js
  • validation.js

in Evaluator:

  • Executor.js
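
As a rough illustration of the gating described under "How", here is a minimal sketch of a wrapper that records a span around a function only when createCustomSpans is true. Everything except the flag name is an assumption made for illustration: withCustomSpan and makeRecordingTracer are hypothetical helpers, and the tracer is a small in-memory stand-in rather than the actual service-utils or OpenTelemetry API used by the Orchestrator and Evaluator.

```javascript
'use strict';

// Hypothetical in-memory tracer, used only to keep this sketch self-contained.
// A real service would get its tracer from its tracing library instead.
function makeRecordingTracer() {
  const finished = [];
  return {
    finished,
    startSpan(name) {
      const start = Date.now();
      return {
        // Record the span when it ends.
        end() {
          finished.push({ name, durationMs: Date.now() - start });
        }
      };
    }
  };
}

// Hypothetical helper: wraps fn so that each call is surrounded by a span,
// but only when config.createCustomSpans is true. With the flag off, the
// original function is returned unchanged, so no per-call overhead is added.
function withCustomSpan(config, tracer, name, fn) {
  if (!config.createCustomSpans) {
    return fn;
  }
  return function (...args) {
    const span = tracer.startSpan(name);
    let result;
    try {
      result = fn.apply(this, args);
    } catch (err) {
      // Synchronous failure: close the span before rethrowing.
      span.end();
      throw err;
    }
    if (result && typeof result.then === 'function') {
      // Async function: end the span once the promise settles.
      return Promise.resolve(result).finally(() => span.end());
    }
    span.end();
    return result;
  };
}

// Usage: a heavy-weight function can be left unwrapped to opt out entirely.
const tracer = makeRecordingTracer();
const validate = withCustomSpan(
  { createCustomSpans: true }, tracer, 'validation', (x) => x !== null
);
validate({});
console.log(tracer.finished.map((s) => s.name));
```

With a wrapper of this shape, removing custom spans from a heavy-weight function amounts to not wrapping it, while the flag remains a single switch for everything else.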

Event Timeline

ecarg updated the task description.

Change #1243695 had a related patch set uploaded (by Ecarg; author: Ecarg):

[operations/deployment-charts@master] turn on custom oTel spans for Wikifunctions

https://gerrit.wikimedia.org/r/1243695

Change #1243695 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Turn on custom oTel spans

https://gerrit.wikimedia.org/r/1243695

@ecarg: Can you check whether this is (a) working and (b) not slowing down the services, and if so mark as Resolved?

These were looking great after deploy last week but checking today, there is zero data on wikifunctions traces. Will update again in the next couple days

Today there is data again 😵. I added some screenshots in case they disappear again; resolving this task now!

Screenshot 2026-03-05 at 10.16.29 AM.png (222 KB)

Screenshot 2026-03-05 at 10.14.20 AM.png (155 KB)