Ensure reports work for Wikidata harvests
Closed, Resolved, Public

Description

We produce various reports during a harvest:

  • unused images
  • images without
  • unknown fields
  • ...

Ensure that these still work when fed data from a Wikidata harvest. One likely issue is that, e.g., the "unused images" list on a Wikipedia page assumes that the "source list" lives on the same wiki.

Event Timeline

One potential difference is that the source field (at least when viewed through the API) lists a Wikidata source as http://www.wikidata.org/entity/<Qid> instead of e.g. //ka.wikipedia.org/w/index.php?title=<list name>
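
To illustrate the difference, here is a minimal sketch of how a report script might tell the two kinds of source value apart. The function name, the regular expressions and the return shape are all assumptions made for this example and are not taken from the heritage code base.

```python
import re

# Hypothetical patterns, written only for this illustration.
# A Wikidata-harvested row points at an entity URI.
WIKIDATA_SOURCE = re.compile(
    r'^https?://www\.wikidata\.org/entity/(?P<qid>Q\d+)$')
# A list-harvested row points at a (protocol-relative) wiki page URL.
WIKI_PAGE_SOURCE = re.compile(
    r'^(?:https?:)?//(?P<lang>[a-z\-]+)\.(?P<project>[a-z]+)\.org'
    r'/w/index\.php\?title=(?P<title>[^&]+)')


def classify_source(source):
    """Return a (kind, details) tuple describing a source field value."""
    match = WIKIDATA_SOURCE.match(source)
    if match:
        return ('wikidata', match.group('qid'))
    match = WIKI_PAGE_SOURCE.match(source)
    if match:
        return ('wiki_page', (match.group('lang'),
                              match.group('project'),
                              match.group('title')))
    # Anything else is left for the caller to handle or report.
    return ('unknown', source)
```

A report could then branch on the first element of the tuple, for example by skipping the "source list lives on the same wiki" assumption whenever the kind is 'wikidata'.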

Change 370481 had a related patch set uploaded (by Lokal Profil; owner: Lokal Profil):
[labs/tools/heritage@wikidata] Make unused image reports deal with sparql harvested data

https://gerrit.wikimedia.org/r/370481

There is likely a similar need to look at the regexes in the API. They should already be centralised though.

Change 370481 merged by jenkins-bot:
[labs/tools/heritage@wikidata] Make scripts dealing with the sparql source field deal with sparql harvested data

https://gerrit.wikimedia.org/r/370481

I think all that is left to do now is to run the reports and inspect the results.

Lokal_Profil claimed this task.

Closing this and adding a comment to do a run once everything is live.