Description
Contributors are complaining about WF performance. We currently can't perform bottleneck analysis on what is probably the most latency-prone aspect of the system: MW API requests. If we logged or otherwise recorded the send and receive times for requests and responses throughout the system (i.e., in the API, the orchestrator, and the evaluator), we could get a clearer picture of these bottlenecks.
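As a rough illustration of the kind of recording described above, here is a minimal sketch of a timing wrapper for an async call. All names (`timedCall`, `callMwApi`, the `timings` array) are hypothetical, not actual WF code; a real implementation would live in each service and feed whichever surface we pick below.

```javascript
// Hypothetical sketch: wrap an async operation (e.g., an MW API request)
// so its send and receive times are recorded alongside the result.
async function timedCall( label, fn, timings ) {
	const startMs = Date.now(); // "send" time
	try {
		return await fn();
	} finally {
		// "receive" time is recorded even if the call throws
		timings.push( { label: label, startMs: startMs, endMs: Date.now() } );
	}
}

// Usage with a stand-in for a real MW API request:
async function demo() {
	const timings = [];
	const result = await timedCall(
		'mw-api-request',
		() => Promise.resolve( { ok: true } ), // placeholder for fetch()
		timings
	);
	return { result: result, timings: timings };
}
```

The `finally` block matters: failed requests are often exactly the slow ones, so their durations should be captured too.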
Ways to surface this information:
- metadata map: this approach is straightforward and doesn't require any special logic to align the data afterward, but it could clutter the metadata map. This implementation path would depend on planned work to make the metadata map cover all nested events, not just the top-level function call.
- Logstash: this would be relatively easy to incorporate into all parts of the system. We would need to be careful to assign a unique ID to each function call, AND an additional secondary ID to each nested call (e.g., evaluator or re-entrant orchestrator calls).
- performance instrument: similar to the Logstash option but potentially more appropriate. It is not yet clear how hard it would be to incorporate Metrics Platform instruments into the backend services.
Desired behavior/Acceptance criteria (returned value, expected error, performance expectations, etc.)
- Discuss the options above.
- Create follow-up tasks as appropriate.
Remove all non-applicable tags from the "Tags" field; leave only the tags of the projects/repositories related to this task.
Completion checklist
- Before closing this task, review, one by one, the items in the checklist available here: https://www.mediawiki.org/wiki/Abstract_Wikipedia_team/Definition_of_Done#Back-end_Task/Bug_completion_checklist