Caching: Handle Wikidata objects bigger than 1 MB
Closed, Resolved (Public)

Description

memcached, by default, only caches objects under 1 MB in size. In our current caching code, attempts to cache larger objects will fail silently. As a result, we are not getting the maximum benefit of memcached. We should:

  • Check the size before caching.
  • If it is over a certain threshold, apply a compression algorithm before submitting it to memcached.
  • Tag the cached item to indicate that it was compressed.
  • Arrange to uncompress it whenever it is retrieved from the cache.
  • Check for failures when adding or updating an object in the cache, and log them for further analysis.
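
The steps above can be sketched roughly as follows. This is a minimal illustration, not the orchestrator's actual code: Node's built-in zlib (gzip) stands in for the compression algorithm, FakeCache is a hypothetical in-memory stand-in for a memcached client that rejects oversized items, and the names cacheSet, cacheGet, COMPRESS_THRESHOLD, and the flag values are all assumptions made for the example.

```javascript
// Sketch only. Assumptions: gzip (zlib) stands in for the real compression
// algorithm, and FakeCache stands in for a memcached client.
const zlib = require("zlib");

const MAX_ITEM_BYTES = 1024 * 1024;     // memcached's default item size limit (1 MB)
const COMPRESS_THRESHOLD = 512 * 1024;  // hypothetical "certain threshold"
const FLAG_RAW = 0;                     // hypothetical tag: stored as-is
const FLAG_COMPRESSED = 1;              // hypothetical tag: stored compressed

// Hypothetical in-memory cache that, unlike the current code path,
// fails loudly when an item exceeds the size limit.
class FakeCache {
  constructor() {
    this.store = new Map();
  }
  set(key, value, flag) {
    if (value.length > MAX_ITEM_BYTES) {
      throw new Error(`item too large: ${value.length} bytes`);
    }
    this.store.set(key, { value, flag });
  }
  get(key) {
    return this.store.get(key);
  }
}

// Check the size, compress above the threshold, tag the item,
// and log (rather than swallow) any failure to store it.
function cacheSet(cache, key, text, log = console.warn) {
  let payload = Buffer.from(text, "utf8");
  let flag = FLAG_RAW;
  if (payload.length > COMPRESS_THRESHOLD) {
    payload = zlib.gzipSync(payload);
    flag = FLAG_COMPRESSED;
  }
  try {
    cache.set(key, payload, flag);
    return true;
  } catch (err) {
    log(`cache set failed for ${key}: ${err.message}`);
    return false;
  }
}

// Decompress on retrieval only when the tag says the item was compressed.
function cacheGet(cache, key) {
  const entry = cache.get(key);
  if (entry === undefined) {
    return undefined;
  }
  const bytes =
    entry.flag === FLAG_COMPRESSED ? zlib.gunzipSync(entry.value) : entry.value;
  return bytes.toString("utf8");
}
```

In this sketch the tag is stored alongside the value so that small, uncompressed items pay no decompression cost on read. An object whose compressed form still exceeds the limit is not cached, and the failure is passed to the log callback instead of being dropped silently.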

This is not the only strategy for dealing with oversized objects, but appears to be the best way to go based on investigation and conversations to date. (For one thing, arranging to increase the size limit has significant disadvantages.)

Desired behavior/Acceptance criteria (returned value, expected error, performance expectations, etc.)

  • Objects of all sizes should be cached successfully, unless they are so large that their compressed size is greater than 1 MB, in which case the failure should be logged.

Completion checklist

Details

Related Changes in Gerrit:
Related Changes in GitLab:
  • Title: Add the zstd-napi package and use zstd compression with memcache-client
  • Reference: repos/abstract-wiki/wikifunctions/function-orchestrator!496
  • Author: dmartin
  • Source Branch: T406682
  • Dest Branch: main

Event Timeline

Is this still underway? Is there anything with which I could help?

Thanks, James. Recently I have been focused on other orchestrator performance investigations and improvements, considered higher priority (some as part of T406346), but this is obviously also a top priority and I expect to pick this back up this week.

Change #1217215 had a related patch set uploaded (by Jforrester; author: Jforrester):

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-12-02-224740 to 2025-12-10-133418

https://gerrit.wikimedia.org/r/1217215

Change #1217215 merged by jenkins-bot:

[operations/deployment-charts@master] wikifunctions: Upgrade orchestrator from 2025-12-02-224740 to 2025-12-10-133418

https://gerrit.wikimedia.org/r/1217215

This is now live. Please confirm it's working as designed, and then sign off.