Skip idempotent operations in the orchestrator, for performance
Closed, Resolved (Public)

Description

The orchestrator performs two very expensive operations on ZObjects: eager evaluation and validation. These operations recurse over the entire object and may kick off other expensive operations, resulting in exponential-time performance degradation.

These operations should be run at most once over a given object (unless/until it experiences a state change that invalidates prior operations).

Moreover, sometimes these operations should not be run at all. Many objects, such as instances of Wikidata types, are constructed automatically by built-in functions in the orchestrator, which already check the validity of the JSON content retrieved from Wikidata. The orchestrator does not need to validate these instances again.

It should be possible to eliminate these steps by marking the ZWrappers for these instances as "already validated" and checking the flag before running the validation step.
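The flag check described above can be sketched roughly as follows. This is only an illustration in Python, not the orchestrator's code: `Wrapper`, `validate`, `from_trusted_builtin`, and the `stats` counter are hypothetical names standing in for the ZWrapper and the real validation pass.

```python
from dataclasses import dataclass
from typing import Any

# Hypothetical stand-in for a ZWrapper; the real class and field names may differ.
@dataclass
class Wrapper:
    payload: Any
    validated: bool = False

# Counts how many times the expensive recursive pass actually runs.
stats = {"validation_runs": 0}

def _walk(payload: Any) -> None:
    """Placeholder for the recursive validation walk over the whole object."""
    if isinstance(payload, dict):
        for value in payload.values():
            _walk(value)
    elif isinstance(payload, list):
        for value in payload:
            _walk(value)

def validate(wrapper: Wrapper) -> None:
    """Run the expensive pass only if the wrapper is not already marked."""
    if wrapper.validated:
        return  # flag is set: skip the walk entirely
    stats["validation_runs"] += 1
    _walk(wrapper.payload)
    wrapper.validated = True

def from_trusted_builtin(payload: Any) -> Wrapper:
    """Assumed helper: a built-in that has already checked its JSON input
    hands back a wrapper that is pre-marked as validated."""
    return Wrapper(payload, validated=True)
```

Under these assumptions, calling `validate` twice on the same wrapper walks it once, and a wrapper returned by `from_trusted_builtin` is never walked.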

Desired behavior/Acceptance criteria (returned value, expected error, performance expectations, etc.)

  • We expect this to improve performance whenever the orchestrator fetches Wikidata content of a nontrivial size, and further reduce the occurrences of timeouts described in T377338.
    • Memory usage should decrease by ~30% when run locally
    • Duration on certain calls (like Echo( Dereference( LID ) )) should decrease by ~70% when run locally
  • Validation is run at most once on objects (not counting re-runs due to state changes).
  • Eager evaluation is run at most once on objects (not counting re-runs due to state changes).
  • Validation and eager evaluation are never explicitly run on objects produced directly by the orchestrator, since these objects are known to be valid and evaluated.
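One way to read the "at most once, unless the state changes" criteria is sketched below in Python. The class `TrackedObject` and its methods `run_once` and `set_field` are hypothetical and exist only to illustrate the bookkeeping; they are not taken from the orchestrator.

```python
from typing import Any, Callable, Dict

class TrackedObject:
    """Hypothetical wrapper that remembers which passes have already run."""

    def __init__(self, fields: Dict[str, Any], trusted: bool = False) -> None:
        self._fields = dict(fields)
        # Objects produced directly by the orchestrator (trusted=True) start
        # with both passes marked as done, so neither is ever run on them.
        self._done = {"validate": trusted, "evaluate": trusted}

    def set_field(self, key: str, value: Any) -> None:
        """A state change invalidates any earlier validation or evaluation."""
        self._fields[key] = value
        self._done = {"validate": False, "evaluate": False}

    def run_once(self, name: str, operation: Callable[[Dict[str, Any]], None]) -> bool:
        """Run `operation` unless it already ran since the last state change.

        Returns True if the operation actually ran, False if it was skipped.
        """
        if self._done[name]:
            return False
        operation(self._fields)
        self._done[name] = True
        return True
```

With this bookkeeping, a second `run_once("validate", ...)` on an unchanged object is skipped, a call after `set_field` runs again, and an object created with `trusted=True` skips both passes until it is modified.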

Completion checklist

Event Timeline

Jdforrester-WMF renamed this task from Skip validation of instances of Wikidata types to Skip validation of instances of Wikidata types, for performance. Nov 20 2024, 5:35 PM
Jdforrester-WMF triaged this task as High priority.
cmassaro renamed this task from Skip validation of instances of Wikidata types, for performance to Skip idempotent operations in the orchestrator, for performance. Jan 23 2025, 2:46 PM
cmassaro updated the task description.

Change #1117551 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-01-28-144249 to 2025-02-03-215824

https://gerrit.wikimedia.org/r/1117551

Change #1117551 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-01-28-144249 to 2025-02-03-215824

https://gerrit.wikimedia.org/r/1117551

@cmassaro Could you please check if this task had the expected impact on performance and "resolve" it if that is the case?

This did indeed have the expected impact on performance. Calls that were taking ~7 seconds are now taking <1 second.

Also, this isn't quite done: we still need to audit all of the other builtin functions to see where we can make improvements.

Change #1119094 had a related patch set uploaded (by Cory Massaro; author: Cory Massaro):

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-02-03-215824 to 2025-02-11-155417.

https://gerrit.wikimedia.org/r/1119094

Change #1119094 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-02-03-215824 to 2025-02-11-155417.

https://gerrit.wikimedia.org/r/1119094

Jdforrester-WMF subscribed.

> Also, this isn't quite done: we still need to audit all of the other builtin functions to see where we can make improvements.

Moving back to Ready then.

apine opened https://gitlab.wikimedia.org/repos/abstract-wiki/wikifunctions/function-orchestrator/-/merge_requests/288

Audit remaining builtins that generate and return their output; set the output as evaluated and resolved.

ecarg merged https://gitlab.wikimedia.org/repos/abstract-wiki/wikifunctions/function-orchestrator/-/merge_requests/288

Audit remaining builtins that generate and return their output; set the output as evaluated and resolved.

DSantamaria changed the task status from Open to In Progress. Feb 26 2025, 6:20 AM

Change #1122963 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/deployment-charts@master] wikifunctions: Update orchestrator from 2025-02-20-140756 to 2025-02-25-210518

https://gerrit.wikimedia.org/r/1122963

Change #1122963 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Update orchestrator from 2025-02-20-140756 to 2025-02-25-210518

https://gerrit.wikimedia.org/r/1122963

Hi @cmassaro~ would you say this is ready to be marked as 'resolved'?