Merging lexemes fails if source lexeme has links to its own senses
Closed, Resolved, Public

Description

If a lexeme has statements that link to its own senses (e.g. a usage example with a "demonstrates sense" qualifier) and you try to merge it into another lexeme, the merge fails with the error "Failed to merge Lexemes, please resolve any conflicts first. Error: Lexemes link to each other in a statement."

To reproduce, go to https://test.wikidata.org/wiki/Special:MergeLexemes and try to merge https://test.wikidata.org/wiki/Lexeme:L1474 into https://test.wikidata.org/wiki/Lexeme:L1473

Event Timeline

I looked a bit into the code and I’m pretty sure that this is because Wikibase’s StatementEntityReferenceExtractor is stateful: it collects IDs in $this->entityIds and never clears them. WikibaseLexeme’s NoCrossReferencingLexemeStatements uses the same LexemeStatementEntityReferenceExtractor to extract referenced entity IDs from both lexemes. That LexemeStatementEntityReferenceExtractor contains a StatementEntityReferenceExtractor which, due to the way it’s written, returns any IDs it already found on every later extractEntityIds() call. So when WikibaseLexeme has

$oneRefIds = $this->refExtractor->extractEntityIds( $one );
$twoRefIds = $this->refExtractor->extractEntityIds( $two );

then $oneRefIds will be all the entity IDs referenced in $one (the $source, according to how the class is used in LexemeMerger), but $twoRefIds will be all the entity IDs referenced in $two ($target) plus those referenced in $one. If $one references itself, that is counted as a reference from $two to $one and therefore blocks the merge. This explanation is consistent with the observed behavior: a merge from L1474 into L1473 is blocked (L1474 is the source lexeme and also the one that has the self-reference), while a merge from L1473 into L1474 is allowed (not tested with those lexemes, but tested with a real pair on Wikidata).
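To illustrate the leak described here, the following is a minimal sketch and not the actual Wikibase code. The class StatefulExtractor is a hypothetical stand-in that takes a plain list of referenced IDs instead of a real entity, and the IDs L1474-S1 and Q42 are made up for the example.

```php
<?php
// Hypothetical, simplified stand-in for a stateful extractor.
// This is NOT the real StatementEntityReferenceExtractor.
class StatefulExtractor {
	/** @var array<string, true> IDs found so far; never cleared between calls */
	private array $entityIds = [];

	/**
	 * @param string[] $referencedIds IDs referenced by one entity's statements
	 * @return string[] every ID seen so far, including those from earlier calls
	 */
	public function extractEntityIds( array $referencedIds ): array {
		foreach ( $referencedIds as $id ) {
			$this->entityIds[$id] = true;
		}
		return array_keys( $this->entityIds );
	}
}

$extractor = new StatefulExtractor();

// Assumed data: the source lexeme links to one of its own senses,
// the target lexeme links to an unrelated item.
$oneRefIds = $extractor->extractEntityIds( [ 'L1474-S1' ] );
$twoRefIds = $extractor->extractEntityIds( [ 'Q42' ] );

// The second result still contains the ID found in the first call,
// so the target appears to reference the source.
$targetSeemsToReferenceSource = in_array( 'L1474-S1', $twoRefIds, true );
```

With the same extractor instance used for both calls, the self-reference of the source shows up in the target's result, which matches the one-directional failure.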

Change 661728 had a related patch set uploaded (by Lucas Werkmeister; owner: Lucas Werkmeister):
[mediawiki/extensions/Wikibase@master] Let StatementEntityReferenceExtractor be reused

https://gerrit.wikimedia.org/r/661728
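The content of this change is not reproduced here. As a rough sketch of one way an extractor can be made safe to reuse, the hypothetical class below keeps its accumulator local to each call. The class name, the simplified signature (a plain list of referenced IDs) and the example IDs are all assumptions for illustration and may differ from what the patch actually does.

```php
<?php
// Hypothetical, simplified extractor; NOT the code from the Gerrit change.
class ReusableExtractor {
	/**
	 * @param string[] $referencedIds IDs referenced by one entity's statements
	 * @return string[] unique IDs found in this call only
	 */
	public function extractEntityIds( array $referencedIds ): array {
		// Local accumulator, so nothing leaks into later calls.
		$entityIds = [];
		foreach ( $referencedIds as $id ) {
			$entityIds[$id] = true;
		}
		return array_keys( $entityIds );
	}
}

$reusable = new ReusableExtractor();

// Assumed data: the first entity links to its own sense, the second does not.
$firstIds = $reusable->extractEntityIds( [ 'L1474-S1' ] );
$secondIds = $reusable->extractEntityIds( [ 'Q42' ] );
```

An alternative with the same effect would be to keep the property but reset it at the start of every extractEntityIds() call.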

Change 661729 had a related patch set uploaded (by Lucas Werkmeister; owner: Lucas Werkmeister):
[mediawiki/extensions/WikibaseLexeme@master] Test merging self-referential lexemes

https://gerrit.wikimedia.org/r/661729

Change 661728 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Let StatementEntityReferenceExtractor be reused

https://gerrit.wikimedia.org/r/661728

Change 661729 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Test merging self-referential lexemes

https://gerrit.wikimedia.org/r/661729