Currently, 15% of all our database resources are spent responding to term lookups in Wikidata: https://performance.wikimedia.org/arclamp/svgs/daily/2024-05-18.excimer-wall.all.reversed.svgz?x=895.1&y=837 That is potentially up to half of s8. We can't reduce that to zero, as these lookups serve important functionality, but they can certainly be improved.
One major problem I see is that there are several major consumers of descriptions:
- MobileFrontend adding the tagline (via MobileFrontendHooks::onOutputPageParserOutput())
- WikibaseClient adding JSON-LD schema to pages via SkinAfterBottomScriptsHandler
- A lot of API calls, mostly coming from the apps: https://github.com/wikimedia/apps-android-wikipedia/blob/d6e160120af5872c2620029a7392d737b9d3b160/app/src/main/java/org/wikipedia/dataclient/Service.kt#L44 and https://github.com/wikimedia/wikipedia-ios/blob/08f5693881a6d0fac23e649643f673a3fb4e724e/Wikipedia/Code/WMFSearchFetcher.m#L167
- Maybe search in the new Vector skin does it too? If so, then we should definitely cache the values of descriptions in memcached.
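For the memcached idea above, a minimal sketch of what that could look like, using MediaWiki's standard WANObjectCache wrapper (getWithSetCallback() and makeKey() are real APIs; the cache key name, TTL choice, and the lookupTermFromReplica() callback are hypothetical placeholders for whatever term lookup the consumer currently does):

```php
<?php
// Sketch only: the key, TTL, and lookupTermFromReplica() are assumptions.

use MediaWiki\MediaWikiServices;

function getCachedDescription( string $entityId, string $langCode ): ?string {
	$cache = MediaWikiServices::getInstance()->getMainWANObjectCache();

	return $cache->getWithSetCallback(
		// One cache key per entity/language pair.
		$cache->makeKey( 'wikibase-description', $entityId, $langCode ),
		$cache::TTL_DAY,
		static function () use ( $entityId, $langCode ) {
			// Only on a cache miss do we hit the s8 term store.
			return lookupTermFromReplica( $entityId, $langCode ); // hypothetical
		}
	);
}
```

With a shared key like this, all the consumers listed above would hit the replicas at most once per entity/language per TTL instead of once per request.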
We can deduplicate their work and drastically reduce the load on our databases. Even if we assume we only cut it in half, that's 25% of the replicas in s8, which could translate to roughly $20,000 in cost reduction every year for hardware purchases alone. One simple fix is for WikibaseClient to put the description into the ParserOutput object and read it from the parser cache (which it doesn't do currently), and then have MobileFrontend read that value as well.
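A sketch of that fix, assuming nothing about the actual extension code beyond the existing ParserOutput API (setExtensionData()/getExtensionData() are real methods whose data persists in the parser cache; the class, method, and key names below are hypothetical):

```php
<?php
// Hypothetical sketch, not the actual extension code.

use MediaWiki\Parser\ParserOutput;

class DescriptionSharingSketch {
	private const EXT_DATA_KEY = 'wikibase-short-description'; // hypothetical key

	// Wikibase side: store the description once, at parse time, so it is
	// cached in the parser cache alongside the rendered HTML.
	public static function addDescriptionToParserOutput(
		ParserOutput $output,
		string $description
	): void {
		$output->setExtensionData( self::EXT_DATA_KEY, $description );
	}

	// Consumer side (MobileFrontend tagline, JSON-LD schema, etc.): read
	// the cached value instead of doing a fresh term lookup against s8.
	public static function getDescription( ParserOutput $output ): ?string {
		return $output->getExtensionData( self::EXT_DATA_KEY );
	}
}
```

The key point is that the term lookup then happens once per parse rather than once per consumer per page view.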