https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia
Done is
All eventgate deployments are upgraded to Node 20:
- all of the below in beta
- eventgate-analytics
- eventgate-logging-external
- eventgate-analytics-external
- eventgate-main
| Ottomata | |
| Jan 15 2025, 7:56 PM |
| F58952647: Screenshot 2025-03-31 at 13.19.36.png | |
| Mar 31 2025, 5:22 PM |
| F58952643: Screenshot 2025-03-31 at 13.19.23.png | |
| Mar 31 2025, 5:22 PM |
| F58952644: Screenshot 2025-03-31 at 13.19.20.png | |
| Mar 31 2025, 5:22 PM |
| F58952641: Screenshot 2025-03-31 at 13.19.12.png | |
| Mar 31 2025, 5:22 PM |
| Title | Reference | Author | Source Branch | Dest Branch | |
|---|---|---|---|---|---|
| blubber.yaml - Add libssl-dev to image | repos/data-engineering/eventgate-wikimedia!12 | otto | T383814_node20_1 | master | |
| Upgrade to node 20 | repos/data-engineering/eventgate-wikimedia!11 | otto | T383814_node20 | master | |
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Open | | None | T368927 [Epic] Migrate Data Platform Engineering maintained git repos to GitLab |
| Open | | None | T384364 Create an instance-level npm package registry in Gitlab |
| Open | | None | T366614 [Epic] Migrate Data Engineering maintained NodeJS repositories to GitLab |
| Resolved | | Snwachukwu | T366611 Migrate Data Engineering NodeJS library repos to GitLab |
| Resolved | | tchin | T366537 Create gitlab ci npm publish pipeline and job in workflow_utils gitlab_ci_templates |
| Resolved | | tchin | T366612 Publish Data Engineering maintained NodeJS packages to GitLab and use them in depender code |
| Open | | None | T364779 Migrate node-based services in production to node20 |
| Resolved | | Ottomata | T383814 Upgrade eventgate-wikimedia to node20 |
otto opened https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia/-/merge_requests/6
Upgrade to node 20
@tchin I'm trying to do like you did in https://gitlab.wikimedia.org/repos/data-engineering/eventgate/-/merge_requests/4, but am failing. Please help if you can!
I've been working on this since my head is in eventgate code as part of T382173: Enable Event Platform streams to opt out of collecting User-Agent data.
I hope to deploy both at the same time.
otto opened https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia/-/merge_requests/11
Upgrade to node 20
Change #1114795 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate - templatize module name, default to @eventgate/wikimedia
Change #1114798 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate-analytics - upgrade to v1.10.0 and NodeJS 20
Change #1114795 merged by jenkins-bot:
[operations/deployment-charts@master] eventgate - templatize module name, default to @eventgate/wikimedia
v1.10.0 has been released with node20. It is running in beta.
I'd like to deploy to some production eventgates, but I'm worried about the timing. I have one full work day left before next week's DPE offsite. Node upgrades might work right away, but any performance regressions could take days to manifest.
Let's put off deployment until we are back from the offsite.
otto merged https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia/-/merge_requests/11
Upgrade to node 20
Change #1114798 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics - upgrade to v1.10.0 and NodeJS 20
Attempted to deploy eventgate-analytics to staging today, but it looks like the staging k8s cluster is out of IP addresses?
0s Warning FailedCreatePodSandBox pod/eventgate-production-6f5b4bbb64-rhrf4 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "986549f25d453eb7cccba18c87774f564d4a2c2ffbffe4a0bab97d512ddfb4cf": plugin type="calico" failed (add): failed to request IPv4 addresses: Assigned 0 out of 1 requested IPv4 addresses; No more free affine blocks and strict affinity enabled
...Asking in IRC.
@tchin has discovered that nodejs20-slim does not contain the SSL package needed for TLS connections to Kafka. eventgate probably needs the same dependency.
https://gitlab.wikimedia.org/repos/data-engineering/eventstreams/-/merge_requests/16
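A quick way to confirm this kind of gap from inside an image is to ask the dynamic linker whether the OpenSSL shared libraries can be found. The following is only a minimal sketch: it assumes a Python interpreter is available in the image being inspected (which may not be true for a slim Node image), and the helper name `missing_shared_libraries` is hypothetical. It uses the standard library's `ctypes.util.find_library`, which returns `None` when a library cannot be located.

```python
import ctypes.util


def missing_shared_libraries(names):
    """Return the library names that the dynamic linker cannot locate.

    Names are given without the 'lib' prefix, e.g. 'ssl' for libssl.
    """
    return [name for name in names if ctypes.util.find_library(name) is None]


if __name__ == "__main__":
    # 'ssl' and 'crypto' correspond to OpenSSL's libssl and libcrypto.
    missing = missing_shared_libraries(["ssl", "crypto"])
    if missing:
        print("missing shared libraries:", ", ".join(missing))
    else:
        print("libssl and libcrypto are discoverable")
```

This only checks that the libraries are discoverable; it does not prove that a Kafka client can actually negotiate TLS.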
Change #1120576 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] charts/eventgate - fix name of default node module to load
Change #1120576 merged by Ottomata:
[operations/deployment-charts@master] charts/eventgate - fix name of default node module to load
otto opened https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia/-/merge_requests/12
blubber.yaml - Add libssl-dev to image
otto merged https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia/-/merge_requests/12
blubber.yaml - Add libssl-dev to image
Change #1120604 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate-analytics - bump to v1.11.0 for node20
Change #1120604 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics - bump to v1.11.0 for node20
Mentioned in SAL (#wikimedia-operations) [2025-02-18T17:30:09Z] <ottomata> upgrading eventgate-analytics in codfw to node20 (will let this simmer for a day before proceeding to eqiad) - T383814
eventgate-analytics in codfw is upgraded to node20.
I'll let this simmer for at least a day before proceeding in eqiad and with other eventgate deployments.
Mentioned in SAL (#wikimedia-operations) [2025-02-19T15:04:32Z] <ottomata> upgrading eventgate-analytics in eqiad to node20 - T383814
eventgate-analytics in eqiad is upgraded to node20.
Will proceed with the other eventgate instances next week.
Change #1122159 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate-logging-external - upgrade to node20
Mentioned in SAL (#wikimedia-operations) [2025-03-03T15:36:38Z] <ottomata> deploying eventgate-logging-external to bump to node20 - T383814
Change #1124144 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate-analytics-external - upgrade to node20
Change #1122159 merged by Ottomata:
[operations/deployment-charts@master] eventgate-logging-external - upgrade to node20
Uh, oops. I deployed eventgate-logging-external this morning, but hadn't merged the patch that bumped the image version, so it was basically a no-op! Merging and proceeding.
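A pre-deploy check that compares the running image tag with the one the chart is about to apply would flag this kind of no-op. The sketch below is illustrative only: the function names and the example image references are hypothetical, it is not part of any existing deployment tooling, and it does not handle digest references such as `name@sha256:...`.

```python
def image_tag(image_ref):
    """Return the tag of a 'repository:tag' reference, or None if untagged."""
    # Look only at the last path segment so that a registry port
    # (e.g. 'registry.example:5000/app') is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return None
    return last_segment.rsplit(":", 1)[1]


def is_noop_deploy(running_image, chart_image):
    """True if deploying chart_image would not change the running image tag."""
    return image_tag(running_image) == image_tag(chart_image)


if __name__ == "__main__":
    # Hypothetical image references, for illustration only.
    running = "registry.example:5000/eventgate:old-tag"
    planned = "registry.example:5000/eventgate:old-tag"
    if is_noop_deploy(running, planned):
        print("image tag unchanged: was the version bump merged?")
```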
Mentioned in SAL (#wikimedia-operations) [2025-03-03T16:38:04Z] <ottomata> deploying eventgate-logging-external to ACTUALLY bump to node20 - T383814
I believe the latest deploy also removed x-geoip info/lookups from NEL logs, see also T387850: NEL logs are missing geoip information
Mentioned in SAL (#wikimedia-operations) [2025-03-04T16:01:15Z] <ottomata> eventgate-logging-external: rolling back to pre node 20 due to bug likely caused by T382173. -- T387850 , T383814
Change #1131415 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate-logging-external - upgrade to node20
Change #1131415 merged by jenkins-bot:
[operations/deployment-charts@master] eventgate-logging-external - upgrade to node20
Mentioned in SAL (#wikimedia-operations) [2025-03-27T15:28:10Z] <ottomata> upgrading eventgate-logging-external to node20 (using new per stream header enrich setting), first testing in staging. - T383814, T387908
Change #1124144 merged by jenkins-bot:
[operations/deployment-charts@master] eventgate-analytics-external - upgrade to node20
Mentioned in SAL (#wikimedia-operations) [2025-03-27T17:34:37Z] <ottomata> upgrading eventgate-analytics-external to node20 - T383814
Change #1131792 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate-main - upgrade to NodeJS 20
Status update:
All eventgate deployments except eventgate-main are upgraded.
Let's do eventgate-main on Monday.
Mentioned in SAL (#wikimedia-operations) [2025-03-31T16:28:13Z] <ottomata> beginning eventgate-main upgrade to NodeJS 20 - T383814
Change #1131792 merged by jenkins-bot:
[operations/deployment-charts@master] eventgate-main - upgrade to NodeJS 20
I deployed eventgate-analytics-external last Thursday, March 27.
Resource usage seems lower since then? TBD whether it will last; perhaps memory usage will climb back to the Node 18 status quo. Since the 27th:
- Average latency has dropped.
- Average and max CPU usage have dropped.
- Average memory usage has dropped, from about 227MB to 160MB working set for the eventgate-analytics-external container.
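For scale, the memory figures quoted above work out to roughly a 30% reduction. A small sketch of the arithmetic, using only the two approximate numbers from this comment (the function name is just for illustration):

```python
def percent_reduction(before, after):
    """Relative drop from 'before' to 'after', as a percentage of 'before'."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # Approximate working-set sizes in MB, as reported above.
    print(f"{percent_reduction(227, 160):.1f}%")  # prints "29.5%"
```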
Too soon to tell for eventgate-main, but at least so far nothing looks worse. eventgate-main is a little different, in that most requests use the hasty=false producer, which means latency should generally be higher than for eventgate-analytics-external.
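To illustrate why hasty=false requests would be expected to show higher latency, here is a toy simulation, not eventgate code. It assumes that a non-hasty request waits for the Kafka produce to be acknowledged before responding, while a hasty request responds immediately. The function names, the 50 ms ack delay, and the status codes used here are illustrative assumptions.

```python
import asyncio
import time


async def fake_kafka_ack(delay_s):
    """Stand-in for waiting on a broker acknowledgement."""
    await asyncio.sleep(delay_s)


async def handle_event(hasty, ack_delay_s=0.05):
    """Return (status, seconds until the response would be sent)."""
    start = time.monotonic()
    produce = asyncio.ensure_future(fake_kafka_ack(ack_delay_s))
    if hasty:
        status = 202  # respond without waiting for the ack
    else:
        await produce  # respond only after the produce is acknowledged
        status = 201
    elapsed = time.monotonic() - start
    await produce  # let a background produce finish before returning
    return status, elapsed


if __name__ == "__main__":
    for hasty in (True, False):
        status, elapsed = asyncio.run(handle_event(hasty))
        print(f"hasty={hasty}: status={status}, response after {elapsed * 1000:.1f} ms")
```

In this model the non-hasty response time is bounded below by the ack delay, so a deployment dominated by hasty=false traffic would show higher average latency even if nothing else differed.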
Change #1119206 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/deployment-charts@master] eventgate-analytics remove canary release from staging
Change #1119206 merged by jenkins-bot:
[operations/deployment-charts@master] eventgate-analytics remove canary release from staging