
Orchestrator emits very unreadable error: "500: [object Object]"
Closed, Resolved · Public · BUG REPORT

Description

Steps to reproduce (step by step instructions, with links, commands and necessary data to reproduce the error)

Look here, at the logs

Observed behavior

  • completely inscrutable error

Expected behavior/Acceptance criteria (returned value, expected error, performance expectations, etc.)

  • scrutability
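
For context, a message of the form "500: [object Object]" is what JavaScript produces when an object is interpolated into a string without being serialized first. The sketch below reproduces that effect and shows one way to keep the payload readable. The `errorBody` value is a made-up example, not taken from the orchestrator's logs or code.

```javascript
// Hypothetical error payload, for illustration only.
const errorBody = { status: 500, type: 'internal_error', detail: 'example detail' };

// Interpolating an object directly falls back to Object.prototype.toString(),
// which hides every field.
const unreadable = `${ errorBody.status }: ${ errorBody }`;

// Serializing the object first keeps its fields visible in the message.
const readable = `${ errorBody.status }: ${ JSON.stringify( errorBody ) }`;

console.log( unreadable );
console.log( readable );
```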

Completion checklist

Details

Related Changes in Gerrit:
Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Wrap fetchZIDs's JSON.parse in try/catch. | repos/abstract-wiki/wikifunctions/function-orchestrator!310 | apine | apine-500 | main

Event Timeline

Update: with d8b9cf42cc17bfb2988ae0d32c7b8efea1565d19 in the orchestrator (and the equivalent in the evaluator), this is now showing up as:

Otherwise Unhandled error 500: {"status":500,"type":"internal_error","detail":"Expected ',' or '}' after property value in JSON at position 413"}

… which suggests we're failing to try/catch a JSON decode from user input (?) at some point.

  • builtins.getLanguageMap – but should be very safe
  • Evaluator.evaluateReentrant_ - only in re-entrancy, not enabled
  • fetchObject.fetchZIDs - suspect, though should be safe
  • fetchObject.retrieveWikidataEntityFromLODAPI - suspect, though should generally be safe (we check for 404s/etc.)
  • fetchObject.findEntitiesByStatements - suspect, though should generally be safe (we check for 404s/etc.)
  • utils.getEvaluatorAndOrchestratorConfigsForEnvironment – should either be entirely safe or explode on every request
  • utils.readJSON
    • builtins.getPersistentZObjectFromFile & …getDefinitionFromFile – should be safe
    • invariants.softwareLanguages – should either be entirely safe or explode on every request

Consequently I'm most suspicious of the fetchZIDs one, but we should (a) try/catch and re-error as appropriate with a trace, and (b) consolidate on using a wrapper that does this.
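
A minimal sketch of what points (a) and (b) could look like, assuming a single shared helper that every call site uses instead of calling JSON.parse directly. The names `safeJsonParse` and `JsonDecodeError`, and the `source` label argument, are hypothetical and not taken from the orchestrator codebase.

```javascript
// Hypothetical error type that records where the bad JSON came from.
class JsonDecodeError extends Error {
    constructor( source, cause, snippet ) {
        super( `Failed to parse JSON from ${ source }: ${ cause.message }` );
        this.name = 'JsonDecodeError';
        this.source = source;
        // Keep the original SyntaxError so its stack trace is not lost.
        this.cause = cause;
        this.snippet = snippet;
    }
}

// Hypothetical wrapper: parse, or re-throw with context about the caller.
function safeJsonParse( text, source ) {
    try {
        return JSON.parse( text );
    } catch ( err ) {
        // Keep only a short prefix of the input to avoid flooding the logs.
        const snippet = String( text ).slice( 0, 200 );
        throw new JsonDecodeError( source, err, snippet );
    }
}

// Usage: the missing comma makes this input malformed.
let caught = null;
try {
    safeJsonParse( '{"Z1K1": "Z6" "Z6K1": "x"}', 'fetchZIDs response' );
} catch ( err ) {
    caught = err;
    console.error( `${ err.name }: ${ err.message }` );
}
```

With a wrapper along these lines, the log entry names the call site that received the bad input, so the list of suspects above would not need to be narrowed down by guesswork.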

Change #1129368 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/deployment-charts@master] wikifunctions: Update orchestrator from 2025-03-11-234105 to 2025-03-19-203723

https://gerrit.wikimedia.org/r/1129368

Change #1129368 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Update orchestrator from 2025-03-11-234105 to 2025-03-19-203723

https://gerrit.wikimedia.org/r/1129368

OK, this should now fail with "wikilambda_fetch returned degenerate output", which will help us find out where these are coming from.

Noting here that, while looking at the logs, the latest error message object is 'request entity too large', which seems like a Node.js error.

Screenshot from 2025-04-02 14-22-59.png (896×3 px, 293 KB)

As far as I can tell, this error (which now manifests as "Otherwise Unhandled error [...]") doesn't prevent the function call from running? If I filter by the request ID for a request that causes this error, the error happens before orchestration even begins--and yet orchestration continues unimpeded. So I think this error is not very informative and can be ignored.

ecarg changed the task status from In Progress to Open. (Edited May 2 2025, 12:46 AM)

oops, this hasn't been seen since March 12

DSantamaria changed the task status from Open to In Progress. (May 4 2025, 6:11 AM)

@ecarg, did you mean to close it? (You moved it to open, not sure why)

@cmassaro, is this resolved?

From my perspective, the status remains as discussed before:

Screenshot from 2025-04-02 14-22-59.png (896×3 px, 293 KB)

As far as I can tell, this error (which now manifests as "Otherwise Unhandled error [...]") doesn't prevent the function call from running? If I filter by the request ID for a request that causes this error, the error happens before orchestration even begins--and yet orchestration continues unimpeded. So I think this error is not very informative and can be ignored.

So yes, I think we can resolve it for now.