Record the reviewer decision and associated metadata for a specific mismatch
Closed, Resolved · Public · 8 Estimated Story Points

Assigned To

Authored By

	Lydia_Pintscher
	Jun 30 2021, 11:17 AM

Description

As a mismatch store admin
I want the mismatch store to be able to receive and record review decisions and their associated metadata
in order to not serve reviewed mismatches again and to enable analysis of the review work being done

Problem:
We need to record the decisions reviewers make for the individual mismatches. For a mismatch, we want to record each decision as well as some metadata.

What we want to record:

Which "review decision" was taken, which will be one of:
- The mismatch is on Wikidata
- The mismatch is in the external data source
- Both are wrong
- None of the above
Who took the decision
When the decision was taken
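
The three pieces of information above could be modelled roughly as follows. This is only an illustrative sketch in Python, not the Mismatch Finder's actual schema: the names `ReviewDecision`, `ReviewRecord`, `make_record` and the string identifiers for the four outcomes are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class ReviewDecision(Enum):
    # Hypothetical identifiers for the four outcomes listed in the task.
    WIKIDATA = "wikidata"  # the mismatch is on Wikidata
    EXTERNAL = "external"  # the mismatch is in the external data source
    BOTH = "both"          # both are wrong
    NONE = "none"          # none of the above


@dataclass(frozen=True)
class ReviewRecord:
    mismatch_id: int
    decision: ReviewDecision  # which decision was taken
    reviewer: str             # who took the decision
    decided_at: datetime      # when the decision was taken


def make_record(mismatch_id, decision, reviewer, now=None):
    """Build a record; an unknown decision string raises ValueError."""
    stamp = now or datetime.now(timezone.utc)
    return ReviewRecord(mismatch_id, ReviewDecision(decision), reviewer, stamp)
```

Restricting the decision to a closed set keeps later analysis of the review work simple, at the cost of the flexibility discussed in the comments below.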

BDD
GIVEN a mismatch review tool like the mismatch finder website
WHEN a user makes a decision in the tool
AND submits it
THEN an API exists to record this decision in the mismatch store

Acceptance criteria:

It is possible to set a "review decision" for a specific mismatch.
Anyone who provides a valid API token can submit a decision.
We store who made the decision and when it was made.
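
A minimal sketch of how these criteria might fit together, assuming an in-memory token registry and store. `TOKENS`, `ALLOWED` and `submit_decision` are hypothetical names for illustration; the real application would use its own authentication and database layer.

```python
from datetime import datetime, timezone

# Hypothetical token registry mapping an API token to a user name.
TOKENS = {"secret-token-1": "ExampleReviewer"}

# Hypothetical identifiers for the four review decisions.
ALLOWED = {"wikidata", "external", "both", "none"}


def submit_decision(store, token, mismatch_id, decision):
    """Set the review decision for one mismatch on behalf of the token's owner."""
    user = TOKENS.get(token)
    if user is None:
        raise PermissionError("invalid API token")
    if decision not in ALLOWED:
        raise ValueError(f"unknown decision: {decision!r}")
    if mismatch_id not in store:
        raise KeyError(mismatch_id)
    # "Who" and "when" are derived on the server side, not taken from the request.
    store[mismatch_id].update(
        decision=decision,
        reviewer=user,
        decided_at=datetime.now(timezone.utc),
    )
    return store[mismatch_id]
```

Deriving the reviewer from the token rather than from the request body means a client cannot record a decision under someone else's name.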

Note:

The API will allow submitting at least one review decision at a time, potentially more.
There is one decision per mismatch. The latest review decision, user name and timestamp prevail. (No history of review decisions will be kept.)
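
The "latest decision prevails" rule amounts to an overwrite keyed by mismatch. A small sketch, assuming a plain dictionary as the store; `record_decisions` and the field names are made up for this example.

```python
from datetime import datetime, timezone


def record_decisions(store, reviewer, decisions, now=None):
    """Apply a batch of (mismatch_id, decision) pairs.

    Each entry replaces whatever was stored for that mismatch before,
    so no history of earlier decisions is kept.
    """
    stamp = now or datetime.now(timezone.utc)
    for mismatch_id, decision in decisions:
        store[mismatch_id] = {
            "decision": decision,
            "reviewer": reviewer,
            "decided_at": stamp,
        }
    return store
```

If a second reviewer later submits a different decision for the same mismatch, only that second decision, user name and timestamp remain in the store.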

Related Objects

Status	Assigned	Task
Resolved	ItamarWMDE	T290953 Results page - confirmation of review decision submission
Resolved	Lydia_Pintscher	T290822 Results page - submit review decisions
Resolved	ItamarWMDE	T289846 Results page - prompt reviewer to log in when viewing results page logged out
Resolved	ItamarWMDE	T289557 Results page - indicate review decisions
Resolved	Lydia_Pintscher	T285849 Record the reviewer decision and associated metadata for a specific mismatch
Resolved	Silvan_WMDE	T289132 Update mismatches concept
Resolved	guergana.tzatchkova	T289133 Create an "edit" endpoint to take a partial json with the 'review_status' attribute changed
Resolved	Silvan_WMDE	T289134 Log the review decision with Laravel's logging framework

Event Timeline

Lydia_Pintscher created this task. Jun 30 2021, 11:17 AM

Restricted Application added a subscriber: Aklapper. Jun 30 2021, 11:17 AM

Lydia_Pintscher moved this task from Backlog to Ready for story writing on the Mismatch Finder board. Jun 30 2021, 11:20 AM

Lydia_Pintscher added a project: Wikidata. Jun 30 2021, 11:31 AM

For consistency with Help:Ranking, in addition to the outcome "wrong", shouldn't there be "preferred" as well?

I am not sure how ranks would play into it this way tbh.

In addition to:

"Wikidata was wrong"
"the other side was wrong"

it could be:

Wikidata's statement is preferred, statement in mismatch file should have normal rank
Wikidata's statement should have normal rank, the other side's is preferred

Maybe this is covered by (or this means both should have normal rank):

"the mismatch is intentionally kept"

Also, maybe there should be an explicit way to record errors in keys; the following isn't that clear:

"neither was wrong/there was some error in the mismatch"

Maybe the following 9 choices can summarize what is usually found at Wikidata:

Both	Mismatch file	Wikidata	Sample: dob	Sample: pob
	[ ] correct - other wrong/deprecated	[ ] correct - other wrong/deprecated	2012, 2009	Eimsbüttel, Berlin
	[ ] preferred - other normal	[ ] preferred - other normal	2012-10-31, 2012	Eimsbüttel, Hamburg
[ ] both normal rank			2012-10-31, 2012-10-30	Germany, West-Germany
[ ] key mismatch	[ ] key not applicable	[ ] key not applicable	2012, 1920	Eimsbüttel, Stockholm
	[ ] conflation	[ ] conflation
[ ] other problem

Samples dob and pob are partially based on Q2013#P571 and Q567#P19

"Key mismatch" means the key used to join Wikidata and the other file is on the wrong item. Possibly this could be in Wikidata or in the Mismatch file.

More samples about mismatches and possible outcomes: Database_reports/identical_birth_and_death_dates
About conflations: Help:Conflation_of_two_people

Ahhh ok. Thanks! That makes sense.
So I think initially we'd want mismatch providers to consider a statement as matching regardless of its rank if it's the same in both places. That'd have the benefit of being able to concentrate on the mismatches that are more severe. I'd then concentrate on the cases where ranks play a role later, once we've got a better understanding of how people use it etc.

I'm not sure if I could add the samples I gave above to the options #3 (the mismatch is intentionally kept) and #4 (neither was wrong/there was some error in the mismatch) currently in the task description with certainty. Were these options discussed or described somewhere?

Maybe we should just record a qid that describes the decision. This would keep the system flexible.

In separate steps one could determine

how these qids should be available in the interface of selection (e.g. click somewhere or select from list gives this option)
what further actions should be taken (close mismatch, added several statements, change ranks on statements, etc.)

I asked for input at Project_chat#Fun_with_Mismatches:_typology.

Items can now be found by query. The list on project chat tries to order them by importance (frequency?).

We could also add support in the API for external reviewers' decisions, maybe with a flag for a clue with higher trust. A lesson learned in WikiTree land is that many people care about WikiTree data but are less interested in correcting other platforms...

WikiTree scans 22 000 000 profiles weekly and creates reports, and on every profile they add suggestions. For example, Q42 = WikiTree Adams-32825 has 23 suggestions on 282 related profiles.

Wikidata has about 20 errors in those reports 541 <-> 567

Example how the feedback in WikiTree is for a suggested father - "Suggestion 541 Wikidata - Clue for Father"

Wikitree Bailey-31411 <-> Wikidata Q3531665
- suggestion 641 is that WikiTree should add the father = Q75417563

Another aspect is that we should also get the sources used by the external source. Ideally there would be a "quality ranking", and it should be machine readable. See T222142: WikidataCon 2019: We need a better model communicating quality/relevance of sources in Wikidata / Provenance

see also Talk page

Lydia_Pintscher updated the task description. Jul 26 2021, 8:53 AM

Lydia_Pintscher updated the task description. Jul 28 2021, 10:13 AM

Lydia_Pintscher updated the task description. Jul 28 2021, 10:27 AM

Lydia_Pintscher updated the task description. Jul 28 2021, 10:39 AM

Lydia_Pintscher moved this task from Ready for story writing to Ready for estimating on the Mismatch Finder board. Jul 28 2021, 11:11 AM

Mattia_Capozzi_WMDE renamed this task from record reviewer decision and associated metadata for a specific mismatch to Record the reviewer decision and associated metadata for a specific mismatch. Jul 28 2021, 12:01 PM

Mattia_Capozzi_WMDE updated the task description.

karapayneWMDE set the point value for this task to 8. Aug 17 2021, 1:41 PM

karapayneWMDE moved this task from Ready for estimating to Mismatch Finder - sprint 5 on the Mismatch Finder board.

karapayneWMDE edited projects, added Mismatch Finder (Mismatch Finder - sprint 5); removed Mismatch Finder.

Silvan_WMDE updated the task description. Aug 18 2021, 10:45 AM

Plan of action from task breakdown:

Update mismatches concept:
- Create a migration to add reviewer's user_id and rename status to decision
- Update the model
- Update the seeder/factory
- Update the MismatchResource (include user, add links)
Create an "edit" endpoint to take a partial json with the attribute changed
- Make a decision between PUT and PATCH, take into account the experience of Wikibase REST-API
Log the review decision with Laravel's logging framework, into a dedicated log file
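
The "edit" endpoint and the dedicated log can be illustrated with a small, language-neutral sketch. The plan targets Laravel; the Python below only mirrors the idea using the standard `json` and `logging` modules. `EDITABLE_FIELDS`, `build_review_logger`, `patch_mismatch` and the logger name are assumptions for this example, and `review_status` is taken from the title of subtask T289133.

```python
import json
import logging

# Only this attribute may be changed through the partial update.
EDITABLE_FIELDS = {"review_status"}


def build_review_logger(path):
    """Return a logger that writes review decisions to their own file."""
    logger = logging.getLogger("mismatch_review")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep review entries out of the general log
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def patch_mismatch(mismatch, partial_json, reviewer, logger):
    """Apply a PATCH-style partial update and log the resulting decision."""
    patch = json.loads(partial_json)
    unknown = set(patch) - EDITABLE_FIELDS
    if unknown:
        raise ValueError(f"fields not editable: {sorted(unknown)}")
    # Fields absent from the patch keep their current values.
    updated = {**mismatch, **patch, "reviewer": reviewer}
    logger.info(
        "mismatch=%s reviewer=%s review_status=%s",
        updated["id"], reviewer, updated["review_status"],
    )
    return updated
```

Merging only the submitted fields is what distinguishes PATCH from PUT here: a PUT would require the client to send the complete mismatch representation.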

ItamarWMDE moved this task from To Do to Parents / Waiting on the Mismatch Finder (Mismatch Finder - sprint 5) board. Aug 18 2021, 10:57 AM

ItamarWMDE closed subtask T289132: Update mismatches concept as Resolved. Aug 27 2021, 7:52 AM

Lydia_Pintscher added a parent task: T289557: Results page - indicate review decisions. Aug 27 2021, 9:37 AM

Silvan_WMDE moved this task from Parents / Waiting to Test (Verification) on the Mismatch Finder (Mismatch Finder - sprint 5) board. Aug 31 2021, 11:41 AM

Silvan_WMDE closed subtask T289133: Create an "edit" endpoint to take a partial json with the 'review_status' attribute changed as Resolved. Aug 31 2021, 12:22 PM

Silvan_WMDE closed subtask T289134: Log the review decision with Laravel's logging framework as Resolved. Aug 31 2021, 12:45 PM

Lydia_Pintscher mentioned this in T290182: add instructions on how to sumbit a review decision to user guide. Sep 1 2021, 4:41 PM

\o/

In the process of reviewing this I was looking for documentation that was still missing from the user guide. I added a task for that at T290182 for later.

