As per [[ https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/1129955/comments/291e1642_82bab2d5 | comments ]] from @thiemowmde and @hoo.
As an Engineer, I want to improve the performance of Watchlist and Recent Changes with resolved entity link labels by batch prefetching entities' labels and descriptions.
**Background**
In T388685, we ensure that Item and Property IDs within auto-edit summaries in Recent Changes and Watchlist entries are replaced with their labels from Wikidata, e.g. Q64 is replaced with Berlin.
In its current form, `LinkerMakeExternalLink` runs once per link, fetching first the label, then the description. On both the Watchlist and Recent Changes pages, this could total >2000 small db queries. At scale, these requests accumulate and risk performance degradation.
We can avoid this degradation by batch fetching all of the potential values before `LinkerMakeExternalLink` runs, and caching them in the TermLookup. Then, when `LinkerMakeExternalLink` is called, it reads from the cached values rather than querying the extremely large database.
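The batching idea can be sketched in a language-neutral way as follows. This is only an illustration of the pattern, not the Wikibase implementation: `CountingTermStore`, `PrefetchingTermLookup`, `fetch_many`, `prefetch_terms`, `get_label` and `get_description` are hypothetical names, and the in-memory store merely stands in for the terms database so that the number of queries can be counted.

```python
from typing import Dict, Iterable, Optional, Tuple

Terms = Tuple[str, str]  # (label, description)


class CountingTermStore:
    """Hypothetical stand-in for the terms database; counts round trips."""

    def __init__(self, terms: Dict[str, Terms]) -> None:
        self._terms = terms
        self.query_count = 0

    def fetch_many(self, entity_ids: Iterable[str]) -> Dict[str, Terms]:
        # One call represents one database query, however many IDs it covers.
        self.query_count += 1
        return {eid: self._terms[eid] for eid in entity_ids if eid in self._terms}


class PrefetchingTermLookup:
    """Hypothetical lookup that answers getters from an in-process cache."""

    def __init__(self, store: CountingTermStore) -> None:
        self._store = store
        self._cache: Dict[str, Optional[Terms]] = {}

    def prefetch_terms(self, entity_ids: Iterable[str]) -> None:
        # De-duplicate while keeping order, and skip IDs that are already cached.
        missing = [eid for eid in dict.fromkeys(entity_ids) if eid not in self._cache]
        if not missing:
            return
        found = self._store.fetch_many(missing)
        for eid in missing:
            # Cache misses too (as None) so unknown IDs are not re-queried.
            self._cache[eid] = found.get(eid)

    def _terms_for(self, entity_id: str) -> Optional[Terms]:
        if entity_id not in self._cache:
            # Fallback for IDs that were never prefetched: a single-ID query.
            self.prefetch_terms([entity_id])
        return self._cache[entity_id]

    def get_label(self, entity_id: str) -> Optional[str]:
        terms = self._terms_for(entity_id)
        return terms[0] if terms else None

    def get_description(self, entity_id: str) -> Optional[str]:
        terms = self._terms_for(entity_id)
        return terms[1] if terms else None
```

With this shape, a single `prefetch_terms` call before the per-link handler runs means every later `get_label` / `get_description` call for a prefetched ID is served from the cache. The single-ID fallback in `_terms_for` is a design choice of this sketch; the acceptance criteria below would instead require that the handler never reaches the database.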
**Acceptance criteria**
- A list of values (labels and descriptions) is prefetched and cached in a TermLookup
- Individual term getters within the LinkerMakeExternalLinkHookHandler get values from the Lookup's cache and strictly not from the database
**Tasks**
[] Find a hook where we have access to the full list of Recent Changes/Watchlist changelog rows
- //Try onChangesListInitRows//
[] Parse and collate a list of mentioned entity IDs from changelog comments (aka auto-edit summaries for Wikidata changes)
- //See SummaryParsingPrefetchHelper.php//
[] Find a Term lookup which:
- Can query terms using fallback chains
- Can cache the result of the lookup
- //Try CachingPrefetchingTermLookup.php or CachingFallbackLabelDescriptionLookup.php//
[] Prefetch the terms for mentioned entity IDs
- //e.g. prefetchTerms method//
[] Pass the same lookup to LinkerMakeExternalLinkHookHandler
[] Check the debug console for queries to the db, and make sure no single (per-entity) calls are being run
[x] Write tests? - won't do: the setup for these tests would be hundreds of lines long and extremely fragile
Note: LabelPrefetchHookHandler::onChangesListInitRows demonstrates how repo prefetches the terms for the Wikidata watchlist. We could model our implementation on this. Attempting to reuse the hook by moving it into lib could run into domain issues, and some of its code would be redundant in client for our use case, so duplication is likely the best option.
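As a rough illustration of the "parse and collate" step in the task list, the following sketch collects unique entity IDs from a batch of changelog comments. It is not a port of SummaryParsingPrefetchHelper.php: the function name `collect_entity_ids` is hypothetical, and the link shapes it matches (`[[Q64]]` and `[[Property:P31]]`) are an assumption about how IDs appear in the summaries, so the pattern would need adjusting to the real format.

```python
import re
from typing import Iterable, List, Optional

# Assumed link shapes in auto-edit summaries: [[Q64]] and [[Property:P31]].
ENTITY_LINK = re.compile(r"\[\[(?:Property:)?([QP][1-9]\d*)\]\]")


def collect_entity_ids(comments: Iterable[Optional[str]]) -> List[str]:
    """Return each mentioned entity ID once, in order of first appearance."""
    seen = {}
    for comment in comments:
        if not comment:
            # Rows without a comment contribute nothing.
            continue
        for match in ENTITY_LINK.finditer(comment):
            seen.setdefault(match.group(1), None)
    return list(seen)
```

The resulting list is what would be handed to a single prefetch call, so the same entity mentioned in many rows is only requested once.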
**Deployment**
Since under 200 users from our pilot wikis (hewiki, ukwiki, cawiki) have wikidata changes shown in their watchlists by default, we can deploy T388685 to pilots before this change is deployed. The 'worst case' detailed above is extremely unlikely to happen, particularly on a smaller sized wiki.
Almost 10,000 users in enwiki have Wikidata changes shown in their watchlists by default. The 'worst case' is therefore significantly more likely here, and as a result, this change must be deployed before global rollout of T388685.
This is a separate task from T388685 to facilitate easier dev workflows and reviews.
Post-rollout acceptance monitoring (this will be in a separate ticket):
- SRE team does not flag any major issues, once it is fully rolled out, and users don’t complain that it is slower