Record the reviewer decision and associated metadata for a specific mismatch
Closed, Resolved | Public | 8 Estimated Story Points

Description

As a mismatch store admin
I want the mismatch store to be able to receive and record review decisions and their associated metadata
in order to not serve reviewed mismatches again and to enable analysis of the review work being done

Problem:
We need to record the decisions reviewers make for the individual mismatches. For a mismatch, we want to record each decision as well as some metadata.

What we want to record:

  • Which "review decision" was taken, which will be one of:
    • The mismatch is on Wikidata
    • The mismatch is in the external data source
    • Both are wrong
    • None of the above
  • Who took the decision
  • When was the decision taken
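
The three items above could be modelled roughly as follows. This is a minimal Python sketch, not the mismatch store's actual schema: the names `ReviewDecision`, `ReviewRecord` and `make_record`, and the string values of the enum, are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class ReviewDecision(Enum):
    # Hypothetical identifiers for the four outcomes listed above.
    WIKIDATA = "wikidata"  # the mismatch is on Wikidata
    EXTERNAL = "external"  # the mismatch is in the external data source
    BOTH = "both"          # both are wrong
    NONE = "none"          # none of the above


@dataclass(frozen=True)
class ReviewRecord:
    mismatch_id: int
    decision: ReviewDecision
    reviewer: str          # who took the decision
    decided_at: datetime   # when the decision was taken (UTC)


def make_record(mismatch_id, decision, reviewer, now=None):
    """Build a record; an unknown decision string raises ValueError."""
    return ReviewRecord(
        mismatch_id=mismatch_id,
        decision=ReviewDecision(decision),
        reviewer=reviewer,
        decided_at=now or datetime.now(timezone.utc),
    )
```

Restricting the decision to an enumeration means an unexpected value is rejected at the boundary instead of being stored.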

BDD
GIVEN a mismatch review tool like the mismatch finder website
WHEN a user makes a decision in the tool
AND submits it
THEN an API exists to record this decision in the mismatch store

Acceptance criteria:

  • It is possible to set a "review decision" for a specific mismatch.
  • Anyone who provides a valid API token can submit a decision.
  • We store who made the decision and when it was made.
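
A rough sketch of how these criteria might fit together, written in Python for illustration only. The token table `API_TOKENS`, the `Unauthorized` exception and the `set_decision` function are hypothetical; a real deployment would resolve tokens through its own authentication layer.

```python
import hmac
from datetime import datetime, timezone

# Hypothetical token table mapping API tokens to user names.
API_TOKENS = {"s3cr3t-token": "Alice"}


class Unauthorized(Exception):
    """Raised when the supplied API token is not known."""


def resolve_user(token):
    """Return the user name for a valid token, or raise Unauthorized."""
    for known, user in API_TOKENS.items():
        # compare_digest avoids leaking token contents through timing.
        if hmac.compare_digest(known.encode(), token.encode()):
            return user
    raise Unauthorized("invalid API token")


def set_decision(store, token, mismatch_id, decision):
    """Set the review decision for one existing mismatch."""
    user = resolve_user(token)
    if mismatch_id not in store:
        raise KeyError(mismatch_id)
    # Store who made the decision and when, alongside the decision itself.
    store[mismatch_id].update(
        decision=decision,
        reviewer=user,
        decided_at=datetime.now(timezone.utc),
    )
    return store[mismatch_id]
```

The reviewer name is derived from the token, not taken from the request body, so a client cannot submit a decision under someone else's name.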

Note:

  • The API will allow submitting at least one review decision at a time, potentially more.
  • There is one decision per mismatch. The latest review decision, user name and timestamp prevail. (No history of review decisions will be kept.)
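
The two notes above amount to a batch write with last-write-wins semantics. A minimal Python sketch, assuming an in-memory dictionary as the store and a hypothetical `submit_decisions` function:

```python
from datetime import datetime, timezone


def submit_decisions(store, reviewer, decisions, now=None):
    """Apply a batch of (mismatch_id, decision) pairs for one reviewer.

    `store` maps mismatch_id to the single current review. Writing again
    for the same id replaces decision, reviewer and timestamp together;
    nothing is appended, so no history is kept.
    """
    now = now or datetime.now(timezone.utc)
    for mismatch_id, decision in decisions:
        store[mismatch_id] = {
            "decision": decision,
            "reviewer": reviewer,
            "decided_at": now,
        }
    return store
```

Submitting a second decision for the same mismatch overwrites the first, including the reviewer name and timestamp.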

Event Timeline

For consistency with Help:Ranking, in addition to the outcome "wrong", shouldn't there be "preferred" as well?

I am not sure how ranks would play into it this way tbh.

In addition to:

  • "Wikidata was wrong"
  • "the other side was wrong"

it could be:

  • Wikidata's statement is preferred, statement in mismatch file should have normal rank
  • Wikidata's statement should have normal rank, the other side's is preferred

Maybe this is covered by (or this means both should have normal rank):

  • "the mismatch is intentionally kept"

Also maybe there should be an explicit way to record errors in keys; the following isn't that clear:

  • "neither was wrong/there was some error in the mismatch"

Maybe the following 9 choices can summarize what is usually found at Wikidata:

Both | Mismatch file | Wikidata | Sample: dob | Sample: pob
 | [ ] correct - other wrong/deprecated | [ ] correct - other wrong/deprecated | 2012, 2009 | Eimsbüttel, Berlin
 | [ ] preferred - other normal | [ ] preferred - other normal | 2012-10-31, 2012 | Eimsbüttel, Hamburg
[ ] both normal rank | | | 2012-10-31, 2012-10-30 | Germany, West-Germany
[ ] key mismatch | [ ] key not applicable | [ ] key not applicable | 2012, 1920 | Eimsbüttel, Stockholm
 | [ ] conflation | [ ] conflation | |
[ ] other problem | | | |

Samples dob and pob are partially based on Q2013#P571 and Q567#P19

"Key mismatch" means the key used to join Wikidata and the other file is on the wrong item. Possibly this could be in Wikidata or in the Mismatch file.

Ahhh ok. Thanks! That makes sense.
So I think initially we'd want mismatch providers to consider a statement as matching regardless of its rank if it's the same in both places. That'd have the benefit of being able to concentrate on the mismatches that are more severe. I'd then concentrate on the cases where ranks play a role later, once we've got a better understanding of how people use it etc.
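
As a rough illustration of "matching regardless of rank", a provider-side check might look like the Python sketch below. The function name `is_mismatch` and the statement keys "value" and "rank" are assumptions, not part of any existing tool.

```python
def is_mismatch(wikidata_statements, external_value):
    """Return True when no Wikidata statement carries the external value.

    Each statement is a dict with the hypothetical keys "value" and "rank".
    The rank ("preferred", "normal", "deprecated") is deliberately ignored:
    a value present on Wikidata counts as matching whatever its rank.
    """
    return all(s["value"] != external_value for s in wikidata_statements)
```

Under this rule, an external value that exists on Wikidata only as a deprecated statement would not be reported as a mismatch.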

I'm not sure if I could add the samples I gave above to the options #3 (the mismatch is intentionally kept) and #4 (neither was wrong/there was some error in the mismatch) currently in the task description with certainty. Were these options discussed or described somewhere?

Maybe we should just record a qid that describes the decision. This would keep the system flexible.

In separate steps one could determine

  • how these qids should be made available for selection in the interface (e.g. click somewhere or select from list gives this option)
  • what further actions should be taken (close mismatch, add several statements, change ranks on statements, etc.)
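
If the decision were recorded as a qid, the store would only need to check that the value is a well-formed item id. A minimal Python sketch, with a hypothetical helper name; it checks the format only and does not verify that the item exists on Wikidata:

```python
import re

# An item id is "Q" followed by a positive integer without leading zeros.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def normalise_decision_qid(value):
    """Return the qid in canonical upper-case form, or raise ValueError."""
    qid = value.strip().upper()
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a valid qid: {value!r}")
    return qid
```

Which qids are offered to reviewers, and what each one triggers, would stay outside the store, as suggested above.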

I asked for input at Project_chat#Fun_with_Mismatches:_typology.

Items can now be found by query. The list on project chat tries to order them by importance (frequency?).

We could also add support in the API for external reviewers' decisions, maybe with a flag for a clue with higher trust. A lesson learned in WikiTree land is that many people care about WikiTree data but are less interested in correcting data on other platforms...

WikiTree scans 22 000 000 profiles weekly, creates reports, and adds suggestions to every profile. For example, Q42 = WikiTree Adams-32825 has 23 suggestions on 282 related profiles.

  • Wikidata has about 20 errors in those reports 541 <-> 567

image.png (1×2 px, 512 KB)

Example of how WikiTree presents the feedback for a suggested father: "Suggestion 541 Wikidata - Clue for Father"

image.png (1×2 px, 600 KB)

Another aspect is that we should also get the sources used by the external source. Ideally there would be a machine-readable "quality ranking"; see T222142: WikidataCon 2019: We need a better model communicating quality/relevance of sources in Wikidata / Provenance

Mattia_Capozzi_WMDE renamed this task from record reviewer decision and associated metadata for a specific mismatch to Record the reviewer decision and associated metadata for a specific mismatch. Jul 28 2021, 12:01 PM
Mattia_Capozzi_WMDE updated the task description.

Plan of action from task breakdown:

  1. Update mismatches concept:
    • Create a migration to add reviewer's user_id and rename status to decision
    • Update the model
    • Update the seeder/factory
    • Update the MismatchResource (include user, add links)
  2. Create an "edit" endpoint that accepts a partial JSON containing only the changed attributes
    • Decide between PUT and PATCH, taking into account the experience of the Wikibase REST API
  3. Log the review decision with Laravel's logging framework, into a dedicated log file
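
Steps 2 and 3 could be sketched as follows. The plan targets Laravel, so this Python version is only an illustration of the idea (a PATCH-style partial update plus a dedicated log file), not the implementation. The names `ALLOWED_FIELDS`, `build_review_logger` and `patch_mismatch`, and the log line format, are assumptions.

```python
import logging

# Assumption: the decision is the only attribute a client may change.
ALLOWED_FIELDS = {"decision"}


def build_review_logger(path):
    """Return a logger that writes review decisions to a dedicated file."""
    logger = logging.getLogger("mismatch_review")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep review entries out of the main log
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def patch_mismatch(mismatch, changes, user_id, logger):
    """Apply a partial update (PATCH semantics) and log the decision.

    Only the attributes present in `changes` are replaced; everything
    else on the mismatch is left as it was. The input dict is not mutated.
    """
    unknown = set(changes) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"fields not editable: {sorted(unknown)}")
    updated = {**mismatch, **changes, "user_id": user_id}
    logger.info(
        "mismatch=%s user=%s decision=%s",
        mismatch["id"], user_id, updated["decision"],
    )
    return updated
```

With PUT the client would have to send the complete mismatch representation; the partial-JSON requirement in step 2 corresponds to the PATCH-style merge shown here.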
Lydia_Pintscher claimed this task.

\o/

In the process of reviewing this I was looking for documentation that was still missing from the user guide. I added a task for that at T290182 for later.