As a mismatch reviewer, I don't want to waste my time reviewing mismatches that someone else has already reviewed.
As a mismatch provider, I want the Mismatch Finder to ignore mismatches from my previous uploads that have already been reviewed, so that I don't have to remove these mismatches from my upload myself.
Problem:
Currently, previously reviewed mismatches show up again if they are re-uploaded and have to be reviewed again. This is not a good use of reviewer time. We can solve this issue by ignoring intentional mismatches in new uploads that are already marked as reviewed in previous uploads. This only applies to exactly the same mismatches (i.e. only the expiry date may differ) from the same mismatch provider.
(Now that mismatch providers have access to the review decisions, see T329156, they could remove the already reviewed mismatches themselves. However, this is additional work we would be putting on uploaders, and it should be avoided.)
BDD
GIVEN a mismatch upload
AND a previously reviewed mismatch
WHEN an exact match of the previously reviewed mismatch is uploaded that differs only in the expiry date
THEN it is not imported into the mismatch store
Acceptance criteria:
- We only drop mismatches that have previously been reviewed. Mismatches that have not been previously reviewed are imported again, potentially creating duplicates.
- This should only apply to comparisons between mismatches from the same mismatch provider.
- A benchmark for uploading 250 mismatches is established, and during Product Verification it is decided whether the import time is something we want to track.
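The filtering rule described by the criteria above can be sketched as follows. This is a minimal illustration, not the Mismatch Finder's actual import code: the `Mismatch` class, its field names, the `"pending"` status value, and the `filter_new_upload` helper are all assumptions made for the example. The key idea is that the expiry date is left out of the comparison key, while the provider is part of it.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Mismatch:
    # Hypothetical field names, chosen for illustration only.
    provider: str
    item_id: str
    property_id: str
    wikidata_value: str
    external_value: str
    expires: Optional[str] = None
    review_status: str = "pending"  # assumed: "pending" means not yet reviewed

    def identity(self) -> Tuple[str, ...]:
        # The expiry date and review status are deliberately excluded,
        # so a re-upload that only changes the expiry date still matches.
        return (
            self.provider,
            self.item_id,
            self.property_id,
            self.wikidata_value,
            self.external_value,
        )


def filter_new_upload(
    upload: Iterable[Mismatch], stored: Iterable[Mismatch]
) -> List[Mismatch]:
    """Return the mismatches from `upload` that should be imported."""
    # Only reviewed mismatches cause a drop; unreviewed ones are imported
    # again, which may create duplicates (as the criteria allow).
    reviewed = {m.identity() for m in stored if m.review_status != "pending"}
    return [m for m in upload if m.identity() not in reviewed]
```

Because the provider is part of the key, an identical mismatch uploaded by a different provider is still imported. Building the set of reviewed keys once keeps each lookup constant-time, which matters for the import-time benchmark.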
Tech notes
Before the change is made, let's measure how long it takes to upload 250 mismatches, as a baseline for comparison once the change is in place.
Product to review the difference in import time once the change is made.
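One way to take the before and after measurements is a small timing helper like the one below. It is only a sketch: `benchmark_import` and the `import_fn` callable are hypothetical stand-ins for whatever actually triggers an import of the 250 test mismatches, and the store would need to be reset between runs for the repeats to be comparable.

```python
import statistics
import time
from typing import Callable, Sequence


def benchmark_import(
    import_fn: Callable[[Sequence[dict]], None],
    mismatches: Sequence[dict],
    repeats: int = 5,
) -> float:
    """Return the median wall-clock time in seconds of `import_fn(mismatches)`."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        import_fn(mismatches)  # hypothetical import entry point
        durations.append(time.perf_counter() - start)
    # The median is less sensitive to a single slow run than the mean.
    return statistics.median(durations)
```

Running this once before and once after the change, with the same 250 mismatches, gives the two numbers Product needs to compare.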
Original ticket
Several people have now set up workflows to automate uploading mismatches to the Mismatch Finder every x weeks. If a mismatch is uploaded in week 1 and reviewed in week 2 as an intentional mismatch, neither the external source nor Wikidata is adjusted to make the two values match. The mismatch will therefore show up again in the next upload in week 3 and will have to be reviewed again. This is not a good use of reviewer time.
Removing the already reviewed mismatches by the mismatch provider is possible once T329156 is fixed and they have access to the review decisions. However, this is additional work we would be putting on them, and it should be avoided.
We can solve this issue by simply ignoring mismatches in new uploads that we already have marked as reviewed in previous uploads.