[Epic] Feedback processes and tools for data-providers
Open, Needs Triage · Public

Description

Wikidata editors find a number of mistakes in the data that we get from other places (e.g. authority control files, GLAMs, ...). We'd like to have standardized tools and processes for giving feedback to the institution/company/... that provided the data, so it can be corrected at the source and not just in Wikidata.

Potential problems we'd like to report:

  • wrong mappings to Wikidata items
  • duplicate entries in the source database
  • learning how we should map concepts in 2 different domains, e.g. should books in Wikidata have the same structure as in a library that uses BIBFRAME, should a church in Wikidata follow the same structure that the Swedish National Heritage uses for a church....
  • ...

Existing tools and processes:

  • Phabricator

How to get the most value for your organisation when being part of a fast-changing open community such as the loosely coupled Wikipedia ecosystem

  • change management basics using our Phabricator: our preferred way of interacting and our best practice

I see this need getting bigger and bigger in Sweden, given the lack of linked data awareness at most Swedish cultural institutions...

If we are to get the full potential of linked data and create trust between loosely coupled groups like Wikidata, we need tools and processes. Seeing no signs of traceability, as in issue T223259: LIBRIS XL <-> VIAF <-> Wikidata, will help no one.


Picture from Future Learn Strategic Doing Collaboration and Trust

Notes:

  • See T202531 for the inverse
  • Experience with entity management Europeana <-> Wikidata: see tweet / task list

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 22 2018, 1:33 PM
Salgo60 added a subscriber: Salgo60.

Property proposal related to this: see maintenance_tag

Salgo60 updated the task description. · Nov 18 2018, 2:34 PM
Salgo60 added a comment. · Edited · May 14 2019, 12:51 PM

We are right now seeing a small chaos at VIAF regarding the handling of the Swedish National Library's new system, LIBRIS-URI (Property:P5587); see T223259: LIBRIS XL <-> VIAF <-> Wikidata

VIAF and LIBRIS have both been asked about their plans, but

  • no one has published what they will do
  • VIAF is starting to mix Property:P906 and Property:P5587
  • Swedish LIBRIS says, after 2 emails/questions to them, that they have reported the error to OCLC/VIAF, but they give us no task ID, no description of what they think is wrong, and no status

My conclusion is

  • a better change process is needed
  • tasks/progress should be trackable
  • public Workboards are needed so we can link and track what is happening like we can with Wikidata using Phabricator
Salgo60 updated the task description. · Jun 13 2019, 4:32 PM

See https://www.wikidata.org/wiki/Property_talk:P7844#regex_and_formatterUrl_exceptions and https://www.wikidata.org/wiki/Property_talk:P7711#Format, where I am trying to get culture.fr people into some discussion regarding the identifiers used in their 30 thesauri. Their GINCO system cannot generate sequential IDs and mixes all thesauri into one namespace, so the likely result is that we'll merge these thesauri on Wikidata.

Nobody from culture.fr participated in these discussions on WD. I just got 1 reply by email, only because someone from a FR museum had a connection to someone at culture.fr. (It was not from the 2 contacts listed at the culture.fr thesauri home page.)

Salgo60 updated the task description. · Feb 6 2020, 11:42 AM

Just a quick note: Thanks for your input here so far. I'm taking it into account as I talk to more people and organisations. I'm currently starting these conversations to better understand what they have to share, how they'd like to share it and why they are not already.

Salgo60 added a comment. · Edited · Mar 22 2020, 12:37 PM

@Lydia_Pintscher any progress?!?!? Or any idea how step 1 should best be done?

We have great potential with Europeana, and they speak about opening up the internal "R&D space on Jira" for us to add use cases.... With the speed we see, with > 4000 external identifiers, Wikidata should try to have a "standard" and an easy way to understand who the "owner" of a specific external identifier is... and what change processes/helpdesk they have....

An example of how tightly coupled Wikidata and Europeana are getting: see T247719#5983389 and a bad deployment this week where we got > 300 000 non-working links ;-)

My vision is

  1. A museum says this artist is the same as WD Qxxxx
  2. The museum sends the object to Europeana
  3. Europeana checks if it has an "agent" for that WD Q number
    1. If not, Europeana creates this person as an Agent based on metadata in Wikidata and updates Wikidata property 7704 with the agent
    2. seconds later, all wiki language versions that have Europeana Property 7704 in their Authority template will have a link to Europeana and the object created
      1. today we have the following languages using Property 7704 in the Authority template
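
The flow in steps 1 to 3.1 could be sketched roughly as below. This is only an illustration, not how Europeana or Wikidata actually implement anything: both systems are replaced by in-memory stand-ins, the class and function names are made up, the agent ID scheme is hypothetical, and "Property 7704" is assumed to be written as `P7704`.

```python
from dataclasses import dataclass, field


@dataclass
class WikidataStub:
    """In-memory stand-in for Wikidata: QID -> {property ID: value}."""
    claims: dict = field(default_factory=dict)

    def set_claim(self, qid: str, prop: str, value: str) -> None:
        self.claims.setdefault(qid, {})[prop] = value


@dataclass
class EuropeanaStub:
    """In-memory stand-in for Europeana's agent store: QID -> agent ID."""
    agents: dict = field(default_factory=dict)

    def ensure_agent(self, qid: str, wikidata: WikidataStub) -> str:
        # Step 3: check whether an agent already exists for this QID.
        agent_id = self.agents.get(qid)
        if agent_id is None:
            # Step 3.1: create the agent and write its ID back to Wikidata.
            agent_id = f"agent/{len(self.agents) + 1}"  # hypothetical ID scheme
            self.agents[qid] = agent_id
            wikidata.set_claim(qid, "P7704", agent_id)  # assumed property ID
        return agent_id


def submit_object(qid: str, europeana: EuropeanaStub, wikidata: WikidataStub) -> str:
    """Steps 1-2: a museum states the artist's QID and sends the object."""
    return europeana.ensure_agent(qid, wikidata)
```

Calling `submit_object` twice with the same QID returns the same agent and writes the claim only once, which is the property that would make step 3.2 (links appearing on the wikis via the Authority template) safe to repeat.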

Another thought is that we try to have a "middle layer" between Wikidata and the Source

This middle layer is a Wikibase installation that we encourage the Source to update. See suggestion to Europeana T251225#6088169

In this Wikibase we will have

  • changes/new items/deletes uploaded from the Source...
  • we can use all the out-of-the-box tools in Wikibase like versions/talk pages/SPARQL federation/API....
    • the good thing is that a Wiki has a track record of handling Open loosely coupled domains

      --> we get one location to integrate with..... and with tools like wbstack it's just a few clicks to get a Wikibase and start
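
As a rough idea of what "changes/new items/deletes uploaded from the Source" could mean in practice, here is a minimal sketch that compares two snapshots of a source database. It assumes each snapshot is simply a mapping from the Source's own identifier to a record; the function name and the snapshot shape are made up for illustration, and uploading the result to a Wikibase is left out.

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two snapshots keyed by source identifier.

    Returns the identifiers that are new, deleted, or changed,
    each as a sorted list.
    """
    old_ids, new_ids = set(old), set(new)
    return {
        "new": sorted(new_ids - old_ids),
        "deleted": sorted(old_ids - new_ids),
        "changed": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }


# Hypothetical example data: identifier -> record
yesterday = {"a1": {"name": "Anna"}, "b2": {"name": "Bo"}, "c3": {"name": "Cia"}}
today = {"a1": {"name": "Anna"}, "b2": {"name": "Bosse"}, "d4": {"name": "Dan"}}

result = diff_snapshots(yesterday, today)
```

With a change set like this published in one place, version history and talk pages in the Wikibase would give both sides something concrete to link to and discuss.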