
inconsistency between language fields on the Lexeme page
Closed, Resolved · Public · 5 Story Points

Description

A Lexeme page currently has two places to enter a language code: one for the lemma and one for the form representation. As a user, I expect both to accept the same input. The language code input for the lemma currently behaves correctly; the language code input for the representation should be adjusted to match it.

Acceptance criteria

  • It is impossible to add a lemma using an invalid language code
  • It is impossible to add a form representation using an invalid language code

Invalid language code here means any string other than a valid language code, which for the scope of this task is defined as either of the following:

  • a string that is a language code recognized as valid by "core" Wikibase
  • a string containing '-x-', where the part before '-x-' is a language code recognized as valid by "core" Wikibase and the part after '-x-' has the form 'Q' followed by digits

Examples of valid language codes

  • de
  • de-at
  • de-x-Q1996

Examples of invalid language codes

  • foobar
  • de-Q1996
  • de-x-foobar
  • de-x-Q1996-foobar
  • foobar-x-Q1996
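
The definition and examples above can be sketched as a small check. This is only an illustration of the rule as worded in this task, not the WikibaseLexeme implementation: `CORE_LANGUAGE_CODES` is a hypothetical stand-in for whatever "core" Wikibase recognizes, and `is_valid_language_code` is a made-up name.

```python
import re

# Hypothetical stand-in for the codes "core" Wikibase accepts; the real
# list is much longer and is not defined in this task.
CORE_LANGUAGE_CODES = frozenset({"de", "de-at", "en"})

# 'Q' followed by one or more ASCII digits, e.g. Q1996.
ITEM_ID_PATTERN = re.compile(r"Q[0-9]+")
SEPARATOR = "-x-"


def is_valid_language_code(code: str) -> bool:
    """Apply the two-part rule from the task description."""
    if SEPARATOR not in code:
        # Rule 1: the whole string must be a core language code.
        return code in CORE_LANGUAGE_CODES
    # Rule 2: split at the first '-x-'; the prefix must be a core code
    # and the entire remainder must look like an item ID.
    base, _, suffix = code.partition(SEPARATOR)
    return base in CORE_LANGUAGE_CODES and ITEM_ID_PATTERN.fullmatch(suffix) is not None


if __name__ == "__main__":
    for candidate in ["de", "de-at", "de-x-Q1996", "foobar", "de-Q1996",
                      "de-x-foobar", "de-x-Q1996-foobar", "foobar-x-Q1996"]:
        print(candidate, is_valid_language_code(candidate))
```

Matching the whole suffix with `fullmatch` is what rejects 'de-x-Q1996-foobar': a prefix match would accept the leading 'Q1996' and ignore the trailing part.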

Scenarios

Invalid language code for lemma

GIVEN I am on the lexeme page
AND I click edit button of the lexeme header
AND I click add lemma button
WHEN I enter lemma text
AND I enter invalid language code
AND I click save
AND I reload the page
THEN I see the lemma with invalid language code has not been saved

Invalid language code for representation

GIVEN I am on the page of a lexeme with a form
AND I click edit button of the form
AND I click add representation button
WHEN I enter representation text
AND I enter invalid language code
AND I click save
AND I reload the page
THEN I see the representation with invalid language code has not been saved

Code pointer: the entered language code is checked using the validator provided by LexemeValidatorFactory::getLanguageCodeValidator.
NOTE: as of 25.04.2018 the language code validation is not strict enough: "de-x-foobar" is recognized as a valid language code, although it has been defined that a valid language code must include a Q-ID-like string after the '-x-' tag.
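
To make the gap in the note concrete, here is a rough comparison of two checks on the part after '-x-'. The rule actually in use before the fix is not stated in this task; `lax_suffix_ok` is an assumed model chosen only because it reproduces the reported symptom, and both function names are hypothetical.

```python
import re


def lax_suffix_ok(suffix: str) -> bool:
    # Assumed model of the too-permissive behaviour: any non-empty
    # alphanumeric suffix passes, so 'foobar' is accepted.
    return re.fullmatch(r"[A-Za-z0-9]+", suffix) is not None


def strict_suffix_ok(suffix: str) -> bool:
    # Rule defined in this task: 'Q' followed by digits, nothing else.
    return re.fullmatch(r"Q[0-9]+", suffix) is not None


if __name__ == "__main__":
    for suffix in ["Q1996", "foobar", "Q1996-foobar"]:
        print(suffix, lax_suffix_ok(suffix), strict_suffix_ok(suffix))
```

Under these assumptions, only the strict check tells 'Q1996' apart from 'foobar'.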

Details

Related Gerrit Patches:
mediawiki/extensions/WikibaseLexeme : wmf/1.32.0-wmf.4 : Use the same language validation for representations and lemmas
mediawiki/extensions/WikibaseLexeme : wmf/1.32.0-wmf.4 : Lemma validation: language covered in deserializer
mediawiki/extensions/WikibaseLexeme : master : Lemma validation: language covered in deserializer
mediawiki/extensions/WikibaseLexeme : master : Use the same language validation for representations and lemmas

Event Timeline

Restricted Application added a project: Wikidata. · Apr 5 2018, 9:31 AM
Restricted Application added a subscriber: Aklapper.
Lydia_Pintscher renamed this task from "PLACEHOLDER: Inconsistency between language fields on the lexeme page" to "inconsistency between language fields on the Lexeme page". · Apr 6 2018, 8:10 AM
Lydia_Pintscher triaged this task as High priority.
Lydia_Pintscher updated the task description.
WMDE-leszek updated the task description. · Apr 6 2018, 11:47 AM
WMDE-leszek updated the task description. · Apr 6 2018, 11:51 AM
WMDE-leszek set the point value for this task to 5. · Apr 6 2018, 11:53 AM
WMDE-leszek updated the task description. · Apr 9 2018, 4:24 PM
WMDE-leszek updated the task description.
WMDE-leszek updated the task description. · Apr 9 2018, 4:33 PM
WMDE-leszek updated the task description.
WMDE-leszek updated the task description. · Apr 25 2018, 11:26 AM
WMDE-leszek updated the task description. · Apr 26 2018, 9:34 AM
WMDE-leszek updated the task description. · Apr 26 2018, 9:37 AM

Change 431765 had a related patch set uploaded (by Jakob; owner: Jakob):
[mediawiki/extensions/WikibaseLexeme@master] [WIP] Use the same language validation for representations and lemmas

https://gerrit.wikimedia.org/r/431765

Change 433544 had a related patch set (by Pablo Grass (WMDE)) published:
[mediawiki/extensions/WikibaseLexeme@master] Lemma validation: language covered in deserializer

https://gerrit.wikimedia.org/r/433544

Change 433751 had a related patch set uploaded (by WMDE-leszek; owner: WMDE-leszek):
[mediawiki/extensions/WikibaseLexeme@master] Handle invalid lexemeId in data when using wbeditentity new=form

https://gerrit.wikimedia.org/r/433751

Change 431765 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Use the same language validation for representations and lemmas

https://gerrit.wikimedia.org/r/431765

Change 433544 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Lemma validation: language covered in deserializer

https://gerrit.wikimedia.org/r/433544

WMDE-leszek removed WMDE-leszek as the assignee of this task. · May 19 2018, 7:30 AM

Change 434446 had a related patch set uploaded (by Addshore; owner: Jakob):
[mediawiki/extensions/WikibaseLexeme@wmf/1.32.0-wmf.4] Use the same language validation for representations and lemmas

https://gerrit.wikimedia.org/r/434446

Change 434447 had a related patch set uploaded (by Addshore; owner: Pablo Grass (WMDE)):
[mediawiki/extensions/WikibaseLexeme@wmf/1.32.0-wmf.4] Lemma validation: language covered in deserializer

https://gerrit.wikimedia.org/r/434447

Change 434446 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@wmf/1.32.0-wmf.4] Use the same language validation for representations and lemmas

https://gerrit.wikimedia.org/r/434446

Change 434447 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@wmf/1.32.0-wmf.4] Lemma validation: language covered in deserializer

https://gerrit.wikimedia.org/r/434447