
Write maintenance script for removing deleted items from the term store
Closed, Resolved (Public)


This is done to ensure we clean up not just wbt_term_in_lang, but also wbt_text_in_lang and wbt_text.
The maintenance script shouldn’t need to do much more than call TermStoreCleaner::cleanTermInLangIds(), after checking that the term really isn’t used anymore.

Script to be located in repo/maintenance/.

Event Timeline

There seems to be some confusion at the moment about whose responsibility it is to ensure terms are unused when they’re cleaned. The TermStoreCleaner interface says:

Delete the given term in lang IDs.
Ensuring that they are unreferenced is the caller’s responsibility.

But the DatabaseUsageCheckingTermStoreCleaner implementation does:

Checks the provided TermInLangIds for existence and usage in either
on both Items and Properties.

Those that do actually exist and are unused are passed to an inner cleaner.

$unusedTermInLangIds = $this->findActuallyUnusedTermInLangIds( $termInLangIds, $dbw );
$this->innerCleaner->cleanTermInLangIds( $dbw, $dbr, $unusedTermInLangIds );

I think “it’s the caller’s responsibility” was the original design, but then the change “Split new term storage cleaning into own transaction” landed (T244115, cc @Addshore), and now the cleaner has to check for unused terms as well. We should check how the interface is currently used; if all callers now rely on the cleaner checking for usage, then it’s probably best to update the interface documentation (and the InMemoryTermStore implementation) while leaving the DatabaseUsageCheckingTermStoreCleaner unchanged.
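To make the two possible contracts concrete, here is a minimal, self-contained sketch of the “usage-checking cleaner wraps an inner cleaner” arrangement. It is not the Wikibase code: every class name (SketchTermCleaner, SketchInMemoryCleaner, SketchUsageCheckingCleaner) is hypothetical, the database is replaced by plain arrays, and only the usage check is modelled, not the existence check.

```php
<?php
// Hypothetical sketch; these are illustrative names, not Wikibase's actual classes.

interface SketchTermCleaner {
	/**
	 * Delete the given term in lang IDs.
	 * @param int[] $termInLangIds
	 */
	public function cleanTermInLangIds( array $termInLangIds ): void;
}

/** Inner cleaner: deletes whatever it is given, without any usage check. */
class SketchInMemoryCleaner implements SketchTermCleaner {
	/** @var array term in lang ID => text (stand-in for the real tables) */
	public $termsInLang;

	public function __construct( array $termsInLang ) {
		$this->termsInLang = $termsInLang;
	}

	public function cleanTermInLangIds( array $termInLangIds ): void {
		foreach ( $termInLangIds as $id ) {
			unset( $this->termsInLang[$id] );
		}
	}
}

/** Decorator: drops IDs that are still referenced, then delegates the rest. */
class SketchUsageCheckingCleaner implements SketchTermCleaner {
	private $inner;
	private $usedIds;

	/**
	 * @param SketchTermCleaner $inner
	 * @param int[] $usedIds IDs assumed to be still referenced by some entity
	 */
	public function __construct( SketchTermCleaner $inner, array $usedIds ) {
		$this->inner = $inner;
		$this->usedIds = $usedIds;
	}

	public function cleanTermInLangIds( array $termInLangIds ): void {
		// Keep only the IDs that are not in the "still used" list.
		$unused = array_values( array_diff( $termInLangIds, $this->usedIds ) );
		$this->inner->cleanTermInLangIds( $unused );
	}
}

// Usage: ID 2 is still referenced, so only ID 1 is removed.
$store = new SketchInMemoryCleaner( [ 1 => 'foo', 2 => 'bar', 3 => 'baz' ] );
$cleaner = new SketchUsageCheckingCleaner( $store, [ 2 ] );
$cleaner->cleanTermInLangIds( [ 1, 2 ] );
```

With this shape, a caller such as the maintenance script can pass candidate IDs without checking usage itself, which is the behaviour the interface documentation would need to describe if it is updated.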

Change 656433 had a related patch set uploaded (by Rosalie Perside (WMDE); owner: Rosalie Perside (WMDE)):
[mediawiki/extensions/Wikibase@master] Maintenance script for removing deleted items from the term store

Change 656433 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Maintenance script for removing deleted items from the term store