growthexperiments_link_recommendations caches the responses from the mwaddlink service. Responses for non-current revisions of pages are not useful anymore and should be cleaned up periodically to save storage space. That can be done via a DeferredUpdate. See also T268803#6861562
Description
Details
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| Delete link recommendation from DB when it becomes invalid | mediawiki/extensions/GrowthExperiments | master | +45 -30 |
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | MMiller_WMF | T252822 [EPIC] Growth: "add a link" structured task 1.0 |
| Resolved | | kostajh | T266437 Add a link engineering: backend product specifications |
| Resolved | | kostajh | T261396 Add a link: engineering tasks for initial release |
| Resolved | | Tgr | T275790 Add a link engineering: clean up old entries in growthexperiments_link_recommendations |
Event Timeline
IIRC we were going to do this by listening for page edits and invalidating the cache and search index that way, or is the proposed cronjob handling a different concern?
We do that for the index but not for the cache. That might be a better option. I didn't want to impact page save performance, but I guess in a deferred update it should be fine.
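To illustrate the idea discussed above, here is a minimal Python sketch of deferring the cache cleanup so that it does not run inside the page save path. It is not MediaWiki's DeferredUpdates API and not the code from the patch; the in-memory dictionary, the callback queue, and all function names are hypothetical stand-ins.

```python
from typing import Callable

# Hypothetical in-memory stand-in for growthexperiments_link_recommendations:
# page_id -> (revision_id, cached recommendation payload)
recommendation_cache: dict[int, tuple[int, str]] = {
    42: (1001, '{"links": []}'),
}

# Callbacks queued during the request and run only after the save has completed.
deferred_updates: list[Callable[[], None]] = []


def delete_stale_recommendation(page_id: int, new_rev_id: int) -> None:
    """Drop the cached entry if it was generated for a different revision."""
    entry = recommendation_cache.get(page_id)
    if entry is not None and entry[0] != new_rev_id:
        del recommendation_cache[page_id]


def on_page_save_complete(page_id: int, new_rev_id: int) -> None:
    """Save hook: only enqueue the cleanup, so the save itself does no extra work."""
    deferred_updates.append(
        lambda: delete_stale_recommendation(page_id, new_rev_id)
    )


def run_deferred_updates() -> None:
    """Run the queued callbacks in order, e.g. after the response was sent."""
    while deferred_updates:
        deferred_updates.pop(0)()
```

In this sketch the cached entry is still present right after `on_page_save_complete(42, 1002)` and only disappears once `run_deferred_updates()` has run, which is the property that keeps the save path cheap.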
Change 674394 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Delete link recommendation from DB when it becomes invalid
To QA, you'd have to make an edit, then verify in the link recommendation DB table that the relevant row has been removed.
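The QA check described above could be scripted roughly as follows. This is only a sketch against an in-memory SQLite database: the column names (`gelr_revision`, `gelr_page`, `gelr_data`) are assumptions about the table schema, the helper names are hypothetical, and the `DELETE` statement merely stands in for whatever the extension does after an edit.

```python
import sqlite3

TABLE = "growthexperiments_link_recommendations"


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with one cached recommendation (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {TABLE} ("
        "gelr_revision INTEGER PRIMARY KEY, gelr_page INTEGER, gelr_data TEXT)"
    )
    conn.execute(f"INSERT INTO {TABLE} VALUES (1001, 42, '{{}}')")
    return conn


def has_recommendation(conn: sqlite3.Connection, page_id: int) -> bool:
    """Return True if the table still holds a row for the given page."""
    row = conn.execute(
        f"SELECT 1 FROM {TABLE} WHERE gelr_page = ? LIMIT 1", (page_id,)
    ).fetchone()
    return row is not None


conn = make_demo_db()
before_edit = has_recommendation(conn, 42)  # row exists before the edit

# Stand-in for the cleanup that is expected to happen after an edit to page 42.
conn.execute(f"DELETE FROM {TABLE} WHERE gelr_page = ?", (42,))
after_edit = has_recommendation(conn, 42)  # row should be gone afterwards
```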
Change 674394 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Delete link recommendation from DB when it becomes invalid
@kostajh, @Tgr
(1) answering a single 'yes' or 'no' will remove an article entirely from the growthexperiments_link_recommendations table even though there are many suggested links on the article that were not reviewed.
(2) reverting add link edits will not reinstate the article as a suggested links article
As far as I remember (1) and (2) are the expected behavior - just double-checking.
(3) in betalabs a suggested link article is removed from the growthexperiments_link_recommendations table but is still present in the SE module. When a user clicks on such an article, the warning "Suggestions are no longer available on this article" is displayed. Is it specific to the betalabs env?
It might be; the mechanism for updating the search index (which the results in the homepage module are based on) is completely different between beta and production. It's not a known issue though.
Actually, that's only true for adding entries to the search index, not removing them. Still, any non-MediaWiki infrastructure (including the search index) is unreliable in beta, so it might be beta-specific. You should be able to verify on testwiki.
The warning "Suggestions are no longer available on this article" is quite frequent on testwiki wmf.5. Unfortunately, I don't have access rights to production db to verify the issue the same way as I did for betalabs.
Got the Console error "Link suggestion not found for "Murray Gell-Mann"" on cswiki wmf.5.
It should be fixed after T282873: Add Link: Fix production discrepancies between the link recommendation table and the search index is done.
The issue that made that task necessary did not affect testwiki though, so maybe there is still some unknown bug with (de)indexing.
I think this is working (or at least we have no evidence to the contrary). We have two somewhat similar issues:
- the one mentioned in T261407: Add a link engineering: Create event for event gate to update search index after obtaining link recommendations: search index updates get lost, so there are new database entries with no search index entry. Those should be fixed, probably not by cleaning them up but by adding them to the search index. That might be done on the search side or on our side.
- the "no suggestion for this article" errors, i.e. search index entries where the DB entry is missing.
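The two kinds of discrepancy listed above amount to the two directions of a set difference between the page IDs in the database table and those in the search index. A minimal sketch, assuming both lists of page IDs have already been fetched somehow (the function name and the result keys are hypothetical):

```python
from typing import Iterable


def find_discrepancies(
    db_page_ids: Iterable[int], indexed_page_ids: Iterable[int]
) -> dict[str, list[int]]:
    """Compare the recommendation table against the search index.

    missing_from_index: DB entries whose search index update got lost.
    missing_from_db: indexed pages that would trigger the
        "no suggestion for this article" error.
    """
    in_db = set(db_page_ids)
    in_index = set(indexed_page_ids)
    return {
        "missing_from_index": sorted(in_db - in_index),
        "missing_from_db": sorted(in_index - in_db),
    }
```

For example, `find_discrepancies([1, 2, 3], [2, 3, 4])` reports page 1 as missing from the index and page 4 as missing from the database.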