@JAllemandou noticed that some jobs only appeared in the Spark History UI about a day after they had run (see Slack thread). We need to figure out why the server is lagging, in order to restore near-real-time indexing.
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
spark-history: expand the an-worker subnets the SHS can egress to | operations/deployment-charts | master | +14 -0
Event Timeline
Change 1005727 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] spark-history: expand the an-worker subnets the SHS can egress to
Change 1005727 merged by Brouberol:
[operations/deployment-charts@master] spark-history: expand the an-worker subnets the SHS can egress to
Mentioned in SAL (#wikimedia-analytics) [2024-02-22T11:52:51Z] <brouberol> redeploying the spark-history server with expanded egress rules for hadoop workers - T358206
The Spark History Server is now catching up on its lag after the redeploy. No more tracebacks from failed connections are observed.