Allow reporting of mismatches for labels, descriptions and aliases
Open, Medium, Public

Assigned To
None
Authored By
Lydia_Pintscher
Jul 21 2022, 7:49 AM
Referenced Files
F37541501: image.png
Aug 17 2023, 11:06 AM
F35794423: T313469.png
Nov 18 2022, 12:19 PM
F35794422: T313469.png
Nov 18 2022, 12:19 PM
F35794168: T313469.png
Nov 18 2022, 12:17 PM
F35794133: T313469.png
Nov 18 2022, 12:17 PM
F35794129: T313469.png
Nov 18 2022, 12:17 PM
F35794305: T313469.png
Nov 18 2022, 12:17 PM

Description

As a mismatch provider I want to be able to report mismatches for labels, descriptions and aliases in order to improve the data quality in them as well, not just statements.

Problem:
We currently don't allow reporting of mismatches for labels, descriptions and aliases on the Mismatch Finder. These should be accepted as important data is stored there as well.

For uploaders to understand that new types are now accepted in their CSV files, we need to update the Mismatch Finder User Guide documentation here: https://github.com/wmde/wikidata-mismatch-finder/blob/main/docs/UserGuide.md#importing

Screenshots/mockups:

1a:

image.png (2×2 px, 369 KB)

Figma file

BDD
GIVEN a label, description or alias
AND a mismatch
WHEN a CSV of a mismatch is uploaded to the Mismatch Finder
AND it contains mismatches for labels, descriptions and aliases
THEN they are accepted
AND shown on the Mismatch Finder website

Acceptance criteria:

  • mismatches for labels, descriptions and aliases are accepted from the upload CSV
  • mismatches for labels, descriptions and aliases are shown on the Mismatch Finder website
  • the Mismatch Finder User Guide is updated to reflect that labels, descriptions and aliases are accepted as mismatches
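
As a rough illustration of the first acceptance criterion, the sketch below parses an upload CSV and accepts rows whose type is either a statement or one of the three term types. The column names (`item_id`, `type`, `language`, `wikidata_value`, `external_value`), the type values and the sample rows are assumptions made up for this example; they are not the Mismatch Finder's actual import format.

```python
import csv
import io

# Hypothetical type values; the real import format may name these differently.
TERM_TYPES = {"label", "description", "alias"}
ACCEPTED_TYPES = TERM_TYPES | {"statement"}


def parse_mismatches(csv_text):
    """Split CSV rows into accepted mismatches and (line, reason) rejections."""
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 is the header, so data rows start at line 2
    # (assumes no field spans multiple lines).
    for line_no, row in enumerate(reader, start=2):
        mismatch_type = (row.get("type") or "").strip().lower()
        if mismatch_type not in ACCEPTED_TYPES:
            rejected.append((line_no, f"unknown type: {mismatch_type!r}"))
        elif mismatch_type in TERM_TYPES and not (row.get("language") or "").strip():
            # Terms exist per language, so a term mismatch needs a language code.
            rejected.append((line_no, "term mismatch without language"))
        else:
            accepted.append(row)
    return accepted, rejected


# Made-up sample upload: one valid label row, one term row without a
# language, and one row with a type this sketch does not know.
SAMPLE = """item_id,type,language,wikidata_value,external_value
Q42,label,en,Douglas Adams,Douglas N. Adams
Q42,description,,English writer,British author
Q42,sitelink,en,Douglas Adams,Douglas Adams
"""

accepted, rejected = parse_mismatches(SAMPLE)
```

Requiring a language for term rows is a design choice of this sketch, based on the fact that labels, descriptions and aliases are stored per language.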

Open questions
There is no GUID for terms (labels, descriptions and aliases). How will they be represented in the mismatches, and how would we link to them from the mismatches? As they sit at the top of the Item pages, we could possibly link to the Item page, but this could be a clunky user experience.
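
One way to picture this question: since a term has no statement GUID, a mismatch could instead be keyed by the combination of Item ID, term type and language. Aliases would also need the alias text, because an Item can have several aliases per language. The sketch below only puts that idea into code. The `TermRef` name, its fields and the key format are hypothetical, and the link simply falls back to the Item page, as the question suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TermRef:
    """Hypothetical stand-in for a statement GUID when the mismatch is on a term."""

    item_id: str    # e.g. "Q42"
    term_type: str  # "label", "description" or "alias"
    language: str   # language code, e.g. "en"
    alias_text: Optional[str] = None  # only needed to tell aliases apart

    def key(self) -> str:
        """Build a composite identifier from the fields that locate the term."""
        parts = [self.item_id, self.term_type, self.language]
        if self.term_type == "alias" and self.alias_text is not None:
            parts.append(self.alias_text)
        return "|".join(parts)

    def url(self) -> str:
        # No per-term anchor is assumed; this links to the Item page as a whole.
        return f"https://www.wikidata.org/wiki/{self.item_id}"
```

For example, `TermRef("Q42", "label", "en").key()` yields `Q42|label|en`, which is unique because an Item has at most one label per language.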

Event Timeline

Lydia_Pintscher changed the task status from Open to Stalled. Jul 21 2022, 7:49 AM
Lydia_Pintscher triaged this task as High priority.
Lydia_Pintscher moved this task from Backlog to Needs work on the Mismatch Finder board.

Marking this as stalled because we need a design.

Lydia_Pintscher changed the task status from Stalled to Open. Nov 14 2022, 9:35 PM
Lydia_Pintscher added a subscriber: Sarai-WMDE.

Unstalling as @Sarai-WMDE is working on it.

Terminological mismatches: Request for feedback on designs

As with T313467: Ability to report mismatches on qualifiers, this task requires redesigning the Mismatch Finder's results page and/or table to display new types of mismatches. In this case, the mismatches relate to terminological data: labels in all languages, aliases, and potentially descriptions.

Some initial concerns to keep in mind regarding the overall initiative of collecting term mismatches:

  1. Looking for textual differences between free, human-produced texts might generate a lot of noise. This might be especially noticeable when it comes to descriptions. We'd need to understand the benefits and usefulness of including this type of mismatch.
  2. When searching for terminological mismatches, we'll of course have to take all languages into account (and indicate this in the results).

The initial design exploration aimed at being explicit, and presented terminology mismatches in a separate table under the same item. This made it possible to clearly differentiate these mismatches from the property (and qualifier) mismatches.

Initial proposals to visualize term mismatches on the results page (see designs in Figma)

T313469.png (2×2 px, 300 KB)

Description: The Mismatch Finder generates a separate table to display mismatches in labels, aliases and descriptions.

The term table would be displayed below the properties’ table.

This approach was tentatively abandoned in a subsequent iteration, where it was decided to try to group all mismatch types (properties, qualifiers and terms) in a single table:

Latest iteration (see designs in Figma)

This unified solution corresponds exactly to what's documented in T313467: Ability to report mismatches on qualifiers.

Option 1: A new column specifies the type of mismatch (two possible placements)

1a
T313469.png (2×2 px, 233 KB)
1b
T313469.png (2×2 px, 233 KB)

Description: A new column is introduced in the mismatch table to document the ‘type’ of the listed mismatches (or 'where' they are). In option 1a, the new ‘mismatch type’ column is placed to the right of the mismatch column. In option 1b, the column is placed at the beginning of the table.

Option 2: Descriptive text provides mismatch type inside the Mismatch column

T313469.png (2×2 px, 236 KB)

Description: In this version, a subtle descriptive text indicates the type of mismatch from within the Mismatch column.

Questions regarding the latest design iteration:

  1. For label mismatches: Could we link to the item’s termbox?
  2. In general, as mentioned in this comment, the unified approach might be weaker if all or the vast majority of mismatches turn out to be related to statement properties: using a whole new column to indicate their type will feel redundant.

Arian_Bozorg renamed this task from allow reporting of mismatches for labels, descriptions and aliases to [SW] Allow reporting of mismatches for labels, descriptions and aliases. Aug 15 2023, 9:24 AM
Arian_Bozorg updated the task description.
Arian_Bozorg renamed this task from [SW] Allow reporting of mismatches for labels, descriptions and aliases to Allow reporting of mismatches for labels, descriptions and aliases. Aug 24 2023, 12:31 PM