
Add P180 (Depicts) and P6243 (Digital representation of) structured data to Commons files representing artworks by Jakob Smits
Closed, Resolved (Public)

Description

Add the correct Depicts and Digital representation of statements to the files showing artworks in the Commons category https://commons.wikimedia.org/wiki/Category:Jakob_Smits

Event Timeline

Ideally, I'd like to find a workflow for this that is achievable by a 'muggle' (someone who is not a coder/developer) like myself. It's probably very easy to achieve with Pywikibot, but I'd like to do it with a tool.

The Q items that need to be connected to these files via P180 and P6243 are different for each file and can be retrieved via this Wikidata query:

https://w.wiki/C9h

I think this should somehow be doable by a muggle with QuickStatements, but haven't been able to figure out how. Perhaps @LucasWerkmeister and @Magnus have tips?

Ecritures triaged this task as Medium priority.
Ecritures added a project: SDC General.

My best idea so far is to make the SPARQL query concatenate all the image titles, separated by pipe characters –

SELECT (GROUP_CONCAT(wikibase:decodeUri(STRAFTER(STR(?image), "http://commons.wikimedia.org/wiki/Special:FilePath/")); separator = " | ") AS ?images) WHERE {
  ?item wdt:P170 wd:Q3157590.
  ?item wdt:P18 ?image.
}

– and then copy+paste the result into ACDC (it will split the list into individual files again).
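For anyone who wants to process the query result outside ACDC, the same title handling can be sketched in Python. This is only an illustration under assumptions: the function names are made up for this example, and it relies on the fact that a pipe character cannot occur in a MediaWiki page title, so splitting on it is safe.

```python
from urllib.parse import unquote

# Prefix used by the P18 values in the query above.
FILEPATH_PREFIX = "http://commons.wikimedia.org/wiki/Special:FilePath/"


def title_from_filepath_url(url):
    """Mirror STRAFTER + wikibase:decodeUri: strip the prefix, then percent-decode."""
    if not url.startswith(FILEPATH_PREFIX):
        raise ValueError(f"not a Special:FilePath URL: {url}")
    return unquote(url[len(FILEPATH_PREFIX):])


def split_titles(concatenated):
    """Split the pipe-separated ?images value back into individual file titles."""
    return [part.strip() for part in concatenated.split("|") if part.strip()]
```

Splitting mirrors what ACDC does with the pasted list; the decoding step is only needed if you start from the raw `?image` URLs instead of the concatenated result.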

I can’t think of a way to do it with QuickStatements, because as far as I know for that we need to turn the titles into M-IDs; and I don’t know a way to turn the files from the query results into a PagePile either (which would be easier to add to ACDC than this copy+pasted list).
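As far as I understand it, a file's M-ID is simply the letter M followed by the page ID of its File: page, so the conversion comes down to a page ID lookup. Below is a minimal Python sketch of that idea, assuming the standard MediaWiki action API (`action=query` with `prop=info` and `formatversion=2`). The function names and the sample response are hypothetical, and the sketch deliberately makes no network request.

```python
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_query_params(titles):
    """Build parameters for a page-info lookup of the given file titles.

    Titles are passed without the "File:" prefix. The API limits how many
    titles one request may carry, so long lists need to be sent in chunks.
    """
    return {
        "action": "query",
        "prop": "info",
        "titles": "|".join(f"File:{title}" for title in titles),
        "format": "json",
        "formatversion": "2",
    }


def mids_from_response(response):
    """Map each existing page title in an API response to its M-ID."""
    mids = {}
    for page in response["query"]["pages"]:
        if page.get("missing"):
            continue  # no such file on Commons, so no M-ID
        mids[page["title"]] = f"M{page['pageid']}"
    return mids


# Hypothetical response, shaped like formatversion=2 output.
sample = {
    "query": {
        "pages": [
            {"pageid": 12345, "ns": 6, "title": "File:Example.jpg"},
            {"ns": 6, "title": "File:Does not exist.jpg", "missing": True},
        ]
    }
}
print(mids_from_response(sample))
```

The parameters from `build_query_params` would be sent to `COMMONS_API` with any HTTP client; that part is left out so the sketch stays runnable offline.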

Have you seen Magnus Manske's sdc_tool?

Yes, I have, but it can't do the specific thing I want, which is to assign different structured data to each image individually.

Spinster added a subscriber: SandraF_WMF.

Not correct to assign this to me, as I actually want to ask around at the Wiki-Techstorm-2019 whether someone else can solve this (it may require coding).

My best idea so far is to make the SPARQL query concatenate all the image titles, separated by pipe characters –

SELECT (GROUP_CONCAT(wikibase:decodeUri(STRAFTER(STR(?image), "http://commons.wikimedia.org/wiki/Special:FilePath/")); separator = " | ") AS ?images) WHERE {
  ?item wdt:P170 wd:Q3157590.
  ?item wdt:P18 ?image.
}

– and then copy+paste the result into ACDC (it will split the list into individual files again).

That's indeed a way to get the files into AC/DC, thanks! However, I then need to give each file a different SDC statement (i.e. make each image point to a different Q item), which is the tricky part of my request and something AC/DC can't do...

I can’t think of a way to do it with QuickStatements, because as far as I know for that we need to turn the titles into M-IDs;

Yes, I suspected this would be the blocker... Any idea if there's any progress in finding a solution for this? Anything that can be done by the Structured-Data-Backlog team to help?

and I don’t know a way to turn the files from the query results into a PagePile either (which would be easier to add to ACDC than this copy+pasted list).

I think the approach you proposed above is already great, thanks!

@Husky is building a tool, aptly named Minefield, to convert filenames to M-IDs; see T238908: Minefield: A tool to convert Commons page title to media ID's. It will provide the missing link needed to feed the necessary edits into QuickStatements.

And thanks to the Minefield tool, I have successfully added structured data to the files in that category now! \o/

https://tools.wmflabs.org/quickstatements/#/batch/22170
https://tools.wmflabs.org/quickstatements/#/batch/22157
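For reference, batches like these presumably contain one P180 and one P6243 command per file. Here is a rough Python sketch of how such commands could be generated, assuming QuickStatements accepts M-IDs as subjects in its tab-separated V1 syntax. The function name and the IDs in the usage example are placeholders, not the values from the batches above.

```python
def quickstatements_lines(pairs):
    """Turn (M-ID, Q-ID) pairs into tab-separated QuickStatements commands.

    Each file gets two statements pointing at the same artwork item:
    P180 (Depicts) and P6243 (Digital representation of).
    """
    lines = []
    for mid, qid in pairs:
        for prop in ("P180", "P6243"):
            lines.append("\t".join((mid, prop, qid)))
    return "\n".join(lines)


# Placeholder IDs for illustration only.
print(quickstatements_lines([("M111", "Q222"), ("M333", "Q444")]))
```

The pairs would come from combining the Wikidata query results (item and image) with the M-IDs that Minefield returns for each file title.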

Spinster moved this task from Backlog to Done on the Wiki-Techstorm-2019 board.