Page MenuHomePhabricator

[LEX] OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2
Open, Needs TriagePublicPRODUCTION ERROR

Description

Error
normalized_message
[{reqId}] {exception_url}   OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2
exception.trace
from /srv/mediawiki/php-1.40.0-wmf.18/extensions/WikibaseLexeme/src/Domain/Model/Lexeme.php(310)
#0 /srv/mediawiki/php-1.40.0-wmf.18/extensions/WikibaseLexeme/src/Domain/Diff/LexemePatcher.php(174): Wikibase\Lexeme\Domain\Model\Lexeme->getForm(Wikibase\Lexeme\Domain\Model\FormId)
#1 /srv/mediawiki/php-1.40.0-wmf.18/extensions/WikibaseLexeme/src/Domain/Diff/LexemePatcher.php(92): Wikibase\Lexeme\Domain\Diff\LexemePatcher->patchForms(Wikibase\Lexeme\Domain\Model\Lexeme, Wikibase\Lexeme\Domain\Diff\LexemeDiff)
#2 /srv/mediawiki/php-1.40.0-wmf.18/extensions/Wikibase/lib/packages/wikibase/data-model-services/src/Diff/EntityPatcher.php(40): Wikibase\Lexeme\Domain\Diff\LexemePatcher->patchEntity(Wikibase\Lexeme\Domain\Model\Lexeme, Wikibase\Lexeme\Domain\Diff\LexemeDiff)
#3 /srv/mediawiki/php-1.40.0-wmf.18/extensions/Wikibase/repo/includes/EditEntity/MediawikiEditEntity.php(464): Wikibase\DataModel\Services\Diff\EntityPatcher->patchEntity(Wikibase\Lexeme\Domain\Model\Lexeme, Wikibase\Lexeme\Domain\Diff\LexemeDiff)
#4 /srv/mediawiki/php-1.40.0-wmf.18/extensions/Wikibase/repo/includes/EditEntity/MediawikiEditEntity.php(730): Wikibase\Repo\EditEntity\MediawikiEditEntity->fixEditConflict(Wikibase\Lexeme\Domain\Model\Lexeme)
#5 /srv/mediawiki/php-1.40.0-wmf.18/extensions/Wikibase/repo/includes/EditEntity/StatsdSaveTimeRecordingEditEntity.php(77): Wikibase\Repo\EditEntity\MediawikiEditEntity->attemptSave(Wikibase\Lexeme\Domain\Model\Lexeme, string, integer, boolean, boolean, array)
#6 /srv/mediawiki/php-1.40.0-wmf.18/extensions/WikibaseLexeme/src/MediaWiki/Api/RemoveForm.php(185): Wikibase\Repo\EditEntity\StatsdSaveTimeRecordingEditEntity->attemptSave(Wikibase\Lexeme\Domain\Model\Lexeme, string, integer, boolean, NULL, array)
#7 /srv/mediawiki/php-1.40.0-wmf.18/includes/api/ApiMain.php(1901): Wikibase\Lexeme\MediaWiki\Api\RemoveForm->execute()
#8 /srv/mediawiki/php-1.40.0-wmf.18/includes/api/ApiMain.php(878): ApiMain->executeAction()
#9 /srv/mediawiki/php-1.40.0-wmf.18/includes/api/ApiMain.php(849): ApiMain->executeActionWithErrorHandling()
#10 /srv/mediawiki/php-1.40.0-wmf.18/api.php(90): ApiMain->execute()
#11 /srv/mediawiki/php-1.40.0-wmf.18/api.php(45): wfApiMain()
#12 /srv/mediawiki/w/api.php(3): require(string)
#13 {main}
Impact
Notes

Details

Event Timeline

Interesting – when I call wblremoveform with a form ID that doesn’t exist (e.g. L123-F1000000), I get an ordinary error:

{
    "error": {
        "code": "not-found",
        "info": "Field \"id\" contains an ID of a Form \"L123-F1000000\" that could not be found.",
        "parameterName": "id",
        "fieldPath": [],
        "docref": "See https://www.wikidata.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes."
    },
    "servedby": "mw1470"
}

So what’s different about the request that produced the OutOfRangeException?

Still can’t reproduce it when I try to make two concurrent wblremoveform requests locally – one succeeds and the other returns a proper “not found” error (edit: or an “editconflict” error – both seem about equally likely), without crashing:

( async function() {
  const api = new mw.Api();
  const lexemeId = ( await api.postWithEditToken( {
    action: 'wbeditentity',
    new: 'lexeme',
    data: JSON.stringify( {
      lexicalCategory: 'Q11',
      language: 'Q11',
      lemmas: {
        en: { language: 'en', value: `T326768 ${new Date().toISOString()}` },
      },
      forms: [ {
        add: '',
        representations: {
          en: { language: 'en', value: `T326768 ${new Date().toISOString()}` },
        },
        grammaticalFeatures: [],
      } ],
    } ),
    summary: 'T326768',
  } ) ).entity.id;
  const deleteForm = () => api.postWithEditToken( {
    action: 'wblremoveform',
    id: `${lexemeId}-F1`,
  } );
  await Promise.all( [ deleteForm(), deleteForm() ] );
} )();

(You might need to change the language and lexical category to a different item ID than x depending on what your local wiki looks like.)

Upon closer inspection, this doesn’t look like two concurrent requests – L136328-F2 was removed 23:51, while the logstash error is from 23:59:46.647, about eight minutes later.

So I tried making a second non-concurrent wblremoveform request instead, with the same baserevid, but that still only gives me a proper error (always “editconflict” this time), not a crash:

(async function() {
  const api = new mw.Api();
  const { id: lexemeId, lastrevid: revisionId } = ( await api.postWithEditToken( {
    action: 'wbeditentity',
    new: 'lexeme',
    data: JSON.stringify( {
      lexicalCategory: 'Q11',
      language: 'Q11',
      lemmas: {
        en: { language: 'en', value: `T326768 ${new Date().toISOString()}` },
      },
      forms: [ {
        add: '',
        representations: {
          en: { language: 'en', value: `T326768 ${new Date().toISOString()}` },
        },
        grammaticalFeatures: [],
      } ],
    } ),
    summary: 'T326768',
  } ) ).entity;
  const deleteForm = () => api.postWithEditToken( {
    action: 'wblremoveform',
    id: `${lexemeId}-F1`,
    baserevid: revisionId,
  } );
  await deleteForm();
  await deleteForm();
})();

Also haven’t had luck trying to produce an edit conflict by editing the form and then removing it:

(async function() {
  const api = new mw.Api();
  const { id: lexemeId, lastrevid: revisionId } = ( await api.postWithEditToken( {
    action: 'wbeditentity',
    new: 'lexeme',
    data: JSON.stringify( {
      lexicalCategory: 'Q11',
      language: 'Q11',
      lemmas: {
        en: { language: 'en', value: `T326768 ${new Date().toISOString()}` },
      },
      forms: [ {
        add: '',
        representations: {
          en: { language: 'en', value: `T326768 ${new Date().toISOString()}` },
        },
        grammaticalFeatures: [],
      } ],
    } ),
    summary: 'T326768',
  } ) ).entity;
  const formId = `${lexemeId}-F1`;
  await api.postWithEditToken( {
    action: 'wbleditformelements',
    formId: formId,
    data: JSON.stringify( {
      representations: {
        en: { language: 'en', value: `T326768 (edited) ${new Date().toISOString()}` },
      },
      grammaticalFeatures: [],
    } ),
    baserevid: revisionId,
  } );
  await api.postWithEditToken( {
    action: 'wblremoveform',
    id: formId,
    baserevid: revisionId,
  } );
})();

Those edits actually all succeed.

The strange thing is that line 174 of LexemePatcher.php (the second line of the stack trace) is the second line here:

case $formDiff instanceof ChangeFormDiffOp:
	$form = $lexeme->getForm( $formDiff->getFormId() );
	if ( $form !== null ) {
		$this->formPatcher->patch( $form, $formDiff );
	}
	break;

Lexeme::getForm() can’t actually return null, so maybe that if ( $form !== null ) should just be turned into a try/catch for the OutOfRangeException we’re seeing here. But I don’t understand why the wblremoveform request would end up in this code at all – $formDiff should be a RemoveFormDiff, not a ChangeFormDiffOp, I think.

This has also happened a few other times in the last 7 days – 30 times on L136328-F2 (the form ID mentioned before), and once each on L679424-F5 (Jan 12, 2023 @ 15:01:16.442) and L942105-F1 (Jan 11, 2023 @ 17:49:23.770). The latter two lexemes were edited by a non-anonymous user, عُثمان, so I’ve contacted him to see if he can enlighten us about what might cause this error.

عُثمان reports no user impact: apparently his requests to remove the two forms still succeeded, without even needing a retry? Very confusing.

Lexeme::getForm() can’t actually return null, so maybe that if ( $form !== null ) should just be turned into a try/catch for the OutOfRangeException we’re seeing here.

@thiemowmde actually suggested this change years ago: Fix unchecked errors in LexemePatcher::patchForms and Lexeme::getForm (after a previous change, Remove convenience function Lexeme::hasForm, had produced the OutOfRangeException in the first place). However, without being able to reproduce this bug, I’m not sure if we should try this fix or not, to be honest.

It was the other way around. I suggested to remove the problematic hasForm after the "unchecked errors" patch got stale. Reading the discussion I still don't understand why it wasn't merged. (Well, not being linked to a ticket definitely played a role.) The patcher just throws the wrong exception. It could be so easy. Please feel free to restore or redo the patch.

Oh, sorry, then I didn’t pay attention to the timestamps.

Change 879855 had a related patch set uploaded (by Michael Große; author: Michael Große):

[mediawiki/extensions/WikibaseLexeme@master] Add more logging to unexpected exception being thrown

https://gerrit.wikimedia.org/r/879855

This adds logging close to where the exception is being thrown and where we know we're in the wrong place. Not sure how useful it will be, though? I hope at that point, all the diffs we get should actually be serializable 🤞.

But maybe we want to add this logging instead somewhere further up the stack trace, where we have different information available?

Change 879855 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexeme@master] Add more logging to unexpected exception being thrown

https://gerrit.wikimedia.org/r/879855

ItamarWMDE renamed this task from OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2 to [LEX] OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2.Wed, Nov 8, 10:51 AM
ItamarWMDE moved this task from Incoming to [DOT] By Project on the wmde-wikidata-tech board.
ItamarWMDE added a project: Wikidata Dev Team.

Prio Notes

Impact AreaAffected
production / end usersmaybe
monitoringyes
development effortsno
onboarding effortsno
additional stakeholdersmaybe
ItamarWMDE renamed this task from [LEX] OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2 to [SW] [LEX] OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2.Tue, Nov 21, 3:56 PM
ItamarWMDE renamed this task from [SW] [LEX] OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2 to [LEX] OutOfRangeException: Lexeme L136328 doesn't have Form L136328-F2.Wed, Nov 22, 10:06 AM