
Fix recurring error: `FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory`
Closed, Resolved | Public | Bug Report

Description

What/Why:
We routinely see this error on our dashboard. We have also seen it occur during P0 incidents. We can't be sure how closely it is related to the last incident, but we need better insight into why it happens.

How:

Event Timeline

Jdforrester-WMF changed the subtype of this task from "Task" to "Bug Report".

This is a Node.js error indicating that memory usage is too high: the JavaScript heap has reached its limit...

  • perhaps we need to increase the heap limit?

...but we should make sure that is absolutely necessary:

  1. monitor garbage collection
  2. monitor for memory leaks
  3. monitor heap usage
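
As a starting point for the three monitoring items above, here is a minimal sketch using only Node.js built-ins (`node:v8` and `node:perf_hooks`). The helper names (`heapSummary`, `watchGc`, `looksLikeLeak`) and the growth threshold are hypothetical and not taken from the orchestrator code. The leak check is a crude heuristic, not a diagnosis.

```javascript
// Sketch only: helper names and thresholds are illustrative assumptions.
const v8 = require('node:v8');
const { PerformanceObserver } = require('node:perf_hooks');

const MB = 1024 * 1024;

// Heap usage: report the used heap against the V8 heap size limit.
function heapSummary() {
  const stats = v8.getHeapStatistics();
  return {
    usedMb: Math.round(stats.used_heap_size / MB),
    limitMb: Math.round(stats.heap_size_limit / MB),
    usedRatio: stats.used_heap_size / stats.heap_size_limit,
  };
}

// Garbage collection: call onGc with the duration of every GC pass.
// Returns the observer so the caller can disconnect() it later.
function watchGc(onGc) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      // Newer Node.js versions expose the GC kind under entry.detail;
      // older ones expose it directly on the entry.
      const kind = entry.detail ? entry.detail.kind : entry.kind;
      onGc({ durationMs: entry.duration, kind });
    }
  });
  observer.observe({ entryTypes: ['gc'] });
  return observer;
}

// Memory leaks (crude heuristic): true when every heap sample is larger
// than the previous one and the total growth is at least minGrowthMb.
function looksLikeLeak(samplesMb, minGrowthMb) {
  if (samplesMb.length < 2) {
    return false;
  }
  for (let i = 1; i < samplesMb.length; i++) {
    if (samplesMb[i] <= samplesMb[i - 1]) {
      return false;
    }
  }
  return samplesMb[samplesMb.length - 1] - samplesMb[0] >= minGrowthMb;
}
```

One possible use is to call `heapSummary()` on a timer, log the result next to the existing metrics, and feed the sampled `usedMb` values into `looksLikeLeak`. If the limit itself turns out to be too low, Node.js accepts the `--max-old-space-size` option (in MiB) to raise it.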

Observation/notes:

  • This correlates with the spike in CPU usage shown in Grafana
  • Around the same time, there is a spike in the frequency of implementation errors
  • This Node.js built-in error is often preceded by other built-in log markers, such as "<--- JS stacktrace --->" (which marks the beginning of a JS stack trace, usually in error logs for critical issues like memory problems or uncaught exceptions), and/or followed by "<--- Last few GCs --->" logs.
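
To make the log observation above easier to check, here is a small sketch that counts how often each of these markers appears in a batch of log lines. The marker strings come from this task; the function name `countOomMarkers` and the plain array-of-strings input are assumptions, since the real logs may be structured differently.

```javascript
// Sketch only: assumes logs are available as an array of plain strings.
const OOM_MARKERS = {
  fatal: 'JavaScript heap out of memory',
  jsTrace: '<--- JS stacktrace --->',
  gcTrace: '<--- Last few GCs --->',
};

// Count how many log lines contain each marker.
function countOomMarkers(lines) {
  const counts = { fatal: 0, jsTrace: 0, gcTrace: 0 };
  for (const line of lines) {
    for (const [name, marker] of Object.entries(OOM_MARKERS)) {
      if (line.includes(marker)) {
        counts[name] += 1;
      }
    }
  }
  return counts;
}
```

Comparing these counts per time window with the CPU and error-count spikes would show whether the markers really travel together with the fatal error.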

Change #1105362 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2024-11-27-074306 to 2024-12-17-184905

https://gerrit.wikimedia.org/r/1105362

Change #1105362 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2024-11-27-074306 to 2024-12-17-184905

https://gerrit.wikimedia.org/r/1105362

Change #1111626 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-01-08-142250 to 2025-01-15-052609

https://gerrit.wikimedia.org/r/1111626

Change #1111626 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-01-08-142250 to 2025-01-15-052609

https://gerrit.wikimedia.org/r/1111626

Jdforrester-WMF subscribed.

Is there more to do here, or should this be moved to sign-off?

This can be moved to sign-off, thank you! Next steps can be covered via this task https://phabricator.wikimedia.org/T383806

Change #1115032 had a related patch set uploaded (by Cory Massaro; author: Cory Massaro):

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from version: 2025-01-22-203140 to 2025-01-28-144249

https://gerrit.wikimedia.org/r/1115032

Change #1115032 abandoned by Cory Massaro:

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from version: 2025-01-22-203140 to 2025-01-28-144249

Reason:

already done

https://gerrit.wikimedia.org/r/1115032