Page MenuHomePhabricator

[Bug] possible to remove last lemma
Closed, ResolvedPublic5 Estimated Story Points

Description

wbeditentity API action should not allow to save edit on lexeme where no lemma were provided.

Example edit: https://wikidata-lexeme.wmflabs.org/index.php?title=Lexeme:L551&diff=29567&oldid=29566

Info:
src/DataModel/Serialization/StorageLexemeSerializer.php throws UnsupportedObjectException in similar case (lexicalCategory missing), while src/DataModel/Lexeme.php does the checking.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

As a follow up to this, or added to this task we should check all other APIs that allow editing / removing of lemmas to ensure this can't happen anywhere else.

WMDE-leszek set the point value for this task to 5.Mar 13 2018, 10:55 AM
Jakob_WMDE subscribed.

Discussion had:

  • create spike:
    • remove nullable for (per data model) required properties of Lexeme
    • create LemmaList extension of TermList
    • prevent creation of empty LemmaList

This should also resolve a bug occuring on the demo system

[Mon Apr 02 14:21:18.715616 2018] [:error] [pid 22891] [client 10.68.21.68:37520] PHP Fatal error:  Call to a member function getLemmas() on a non-object in /var/www/html/extensions/WikibaseLexeme/src/PropertyType/LexemeIdHtmlFormatter.php on line 66

@WMDE-leszek checks with other experienced people for reasons not to...

I've had a stab at making constructor parameters related to required properties of Lexeme non-nullable and it seems to break several places where there are explicit new Lexeme() calls.

It seems there are several situations the "empty" lexeme has been created, as discussed below.

Case 1. Diffing

Lexeme/EntityDocument objects are used to calculate diffs between different "phases" of the lexeme. Special cases being 1) lexeme being created, 2) lexeme being destroyed (i.e. deleted). Diffs for those cases are computed by comparing the state N of the lexeme with the "empty" lexeme (i.e. no language, no lemmas etc).

Code pointers:

EntityContent::getDiff -> EntityContent::makeEmptyEntity -> EntityHandler::makeEmptyEntity -> LexemeHandler::makeEmptyEntity

EntityDiffer::getConstructionDiff -> EntityDifferStrategy::getConstructionDiff -> LexemeDiffer::getConstructionDiff
EntityDiffer::getDestructionDiff -> EntityDifferStrategy::getDestructionDiff -> LexemeDiffer::getDestructionDiff

Case 2. Special:NewLexeme page

When submitting the form with the data of the new lexeme filled in, special page class builds up the Lexeme object that is then being saved.

SpecialNewEntity::createForm calls HTMLForm::setSubmitCallback with a function making use of SpecialNewEntity::createEntityFromFormData. That function builds up lexeme "by parts", checking whether some required elements (language, lexical category) fields are filled. This does not seem necessary, as fields are required and being validated to be non-empty earlier on.

Case 3. Getting the Lexeme/EntityDocument object to apply ChangeOps to in the API action code.

wbeditentity API actions allows creating new lexemes with a particular content, and it also allows "clearing" (i.e. removing all existing elements) the lexeme, and filling it with the new data.
It seems for some convenience, the code is dealing with those special behaviour of wbeditentity by allowing to create an empty lexeme object. In other words, there is a code which loads the entity the API action operates on, and in special wbeditentity cases gives back the "empty" lexeme object.

Code pointers:

EntityFactory::newEmpty makes use of 'entity-factory-callback' from WikibaseLexeme.entitytypes.php
EntityFactory::newEmpty is called from
Api\EditEntity::clearEntity (i.e. wbeditentity API called with clear parameter)
and Api\EntitySavingHelper::createEntity (i.e. wbeditentity API called with new=lexeme parameter)

Conclusions:

  • Introducing LemmaList and disallowing it from being empty seems to be a good change to do any way, and it could be an intermediate step before making Lexeme's constructor not accept nulls. Constructor could still allow to pass null instead of LemmaList instance.
  • For the case 1, it might be worth considering not using "empty" lexemes in case of creating/destructing the object. These are the only cases where "empty" lexeme comes to be in diffing. Maybe the option would be to have a NullLexeme class or so for this special case
  • Case 2 seems like a bug that should be fixed by making createEntityFromFormData always call Lexeme's constructor with all required parameters set
  • Case 3 seems to need further thought as more complex.
  • It seems for this task adjusting LexemeContent::isValid/isSufficientlyInitialized might be enough to fix, without needing to jump at any of above yet.

Change 423932 had a related patch set uploaded (by WMDE-leszek; owner: WMDE-leszek):
[mediawiki/extensions/WikibaseLexeme@master] Do not allow saving lexemes without a lemma

https://gerrit.wikimedia.org/r/423932

Change 423932 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Do not allow saving lexemes without a lemma

https://gerrit.wikimedia.org/r/423932

\o/
Will open another ticket later for better feedback to the user.

Change 422426 abandoned by Jakob:
Disallow Lexemes having no lemmas

Reason:
Alternative solution as discussed in https://phabricator.wikimedia.org/T189185

https://gerrit.wikimedia.org/r/422426