
Unify various wikidata description consumption
Open, Needs Triage, Public

Description

Currently, 15% of all our database resources are spent responding to term lookups in Wikidata: https://performance.wikimedia.org/arclamp/svgs/daily/2024-05-18.excimer-wall.all.reversed.svgz?x=895.1&y=837 That is potentially up to half of s8. We can't reduce that to zero, as these lookups serve important functionality, but they can certainly be improved.

One major problem I see is that there are three major consumers of descriptions:

We can deduplicate their work and drastically reduce the load on our databases. Let's assume we only cut it in half: that is 25% of the replicas in s8, which can translate to a ~$20,000 cost reduction every year in hardware purchases alone. One simple fix is for WikibaseClient to put the description into the ParserOutput object and read it from the cache (which it doesn't currently do), and then have MobileFrontend (MF) read from that value as well.
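To make the proposed fix concrete, here is a minimal Python sketch of the idea rather than actual MediaWiki code. `TermStore`, `CachedParserOutput`, and the `wikibase-description` key are all hypothetical stand-ins: the first simulates the term lookups hitting s8, the second a cached ParserOutput carrying extra data. It assumes every consumer can reach the same cached output and ignores cache misses and invalidation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical key; the real extension-data key is not given in this task.
DESCRIPTION_KEY = "wikibase-description"


@dataclass
class TermStore:
    """Hypothetical stand-in for the Wikidata term tables on s8."""
    descriptions: Dict[str, str]
    lookups: int = 0  # number of simulated database queries

    def fetch_description(self, item_id: str) -> Optional[str]:
        self.lookups += 1
        return self.descriptions.get(item_id)


@dataclass
class CachedParserOutput:
    """Hypothetical stand-in for a cached ParserOutput with extension data."""
    extension_data: Dict[str, Optional[str]] = field(default_factory=dict)


def store_description_at_parse_time(
    store: TermStore, item_id: str, output: CachedParserOutput
) -> None:
    """Do the single database lookup and keep the result on the output."""
    output.extension_data[DESCRIPTION_KEY] = store.fetch_description(item_id)


def read_cached_description(output: CachedParserOutput) -> Optional[str]:
    """Consumers read the cached value instead of querying the store."""
    return output.extension_data.get(DESCRIPTION_KEY)


def lookups_without_sharing(store: TermStore, item_id: str, consumers: int) -> int:
    """Each consumer queries the store on its own (the current situation)."""
    before = store.lookups
    for _ in range(consumers):
        store.fetch_description(item_id)
    return store.lookups - before


def lookups_with_sharing(store: TermStore, item_id: str, consumers: int) -> int:
    """One lookup at parse time; every consumer reads the cached value."""
    before = store.lookups
    output = CachedParserOutput()
    store_description_at_parse_time(store, item_id, output)
    for _ in range(consumers):
        read_cached_description(output)
    return store.lookups - before


store = TermStore({"Q42": "example description"})
print(lookups_without_sharing(store, "Q42", 3))  # prints 3
print(lookups_with_sharing(store, "Q42", 3))     # prints 1
```

In this toy model the saving grows with the number of consumers per page view; the real saving depends on parser cache hit rates, which the sketch does not model.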

Event Timeline

Restricted Application added a subscriber: Aklapper.

Note: There are currently two types of description. One is the Wikibase/Wikidata item description; the other is the locally defined short description (which can fall back to the Wikidata one). The latter currently lives in the Wikibase extension, but is planned to be moved to a dedicated extension independent of Wikibase (T282170: Move "Short Descriptions" feature outside of main Wikibase.git code).

Current usages of descriptions differ, and we may want to reconsider which to use:

  • MobileFrontend, adding the tagline: uses the Wikibase description only
  • WikibaseClient, adding the JSON-LD schema to pages: uses the Wikidata description only
  • New Vector: uses the local description
  • API: has an option to choose between the local and the central description, local by default. Note that this may need a breaking change when the local description is moved to a dedicated extension
  • iOS app: uses the local description

Another thing to note is that the local (but currently not the central) description is stored as a page property on the local wiki, so we do not need requests to Wikidata to fetch it.
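The local-first behaviour described above could be sketched roughly as follows. This is an illustration in Python, not the real API code: `resolve_description`, the `local-description` page property key, and the `fetch_central` callback are hypothetical names, and falling back to the local description when the central one is preferred but missing is an assumption of this sketch.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical page property key for the locally defined short description.
LOCAL_PROP = "local-description"


def resolve_description(
    page_props: Dict[str, str],
    fetch_central: Callable[[], Optional[str]],
    prefer: str = "local",
) -> Tuple[Optional[str], Optional[str]]:
    """Return (description, source), querying Wikidata only when needed.

    page_props    -- page properties already available on the local wiki
    fetch_central -- callback performing the (expensive) Wikidata lookup
    prefer        -- "local" (default) or "central"
    """
    if prefer not in ("local", "central"):
        raise ValueError("prefer must be 'local' or 'central'")

    local = page_props.get(LOCAL_PROP)

    if prefer == "local":
        if local is not None:
            # Served from the local page property: no Wikidata request.
            return local, "local"
        central = fetch_central()
        return (central, "central") if central is not None else (None, None)

    # prefer == "central": try Wikidata first, then the local property
    # (this reverse fallback is an assumption, not taken from the task).
    central = fetch_central()
    if central is not None:
        return central, "central"
    return (local, "local") if local is not None else (None, None)
```

With `prefer="local"` and a local description present, `fetch_central` is never called, which is the case where no request to Wikidata is needed.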