Adapt QuickStatements2 to be able to work with structured data on Commons as well
Open, Low, Public

Description

Make adaptations to QuickStatements2, the Wikidata mass editing tool, so that it can edit structured metadata on Wikimedia Commons as well.

Event Timeline

I will look into this once there is a realistic test site. QS2 is designed for multiple sites, see https://phabricator.wikimedia.org/source/tool-quickstatements/browse/master/public_html/sites.json

SandraF_WMF lowered the priority of this task from Medium to Low. Nov 22 2017, 12:56 PM

FYI, @ChristianKl commented on QuickStatements2 functionality at https://commons.wikimedia.org/wiki/Commons_talk:Structured_data/Get_involved/Tools:

"One of the big problems with QuickStatements is that it's an external tool. Given that banning a user that has an active QuickStatement batch running doesn't stop the batch, there is talk on the Wikidata Admin chat that we sometimes might need to temporarily block the QuickStatementBot. Undoing QuickStatement batches is also generally cumbersome. I think it would be very valuable if the functionality would be integrated more natively. If Wikidata (or Commons for that matter) would have a representation of all the edits that come from a batch as a batch it could provide an easy way to undo/redo the batch or discuss it. ChristianKl (talk) 11:05, 26 November 2017 (UTC)"

And FYI2, this partly corresponds with two of the user stories that @Ramsey-WMF has outlined for Structured Commons:

An addition to these user stories would thus be that batches must be easy to undo, redo, and discuss, so that Commons admins also don't hate their lives.

One sidelight on this. QS2 is great (invaluable!) for adding new statements, but not so good for modifying existing statements, especially if they are currently heavily qualified or referenced -- at the moment, the entire statement complete with all qualifiers and references has to be re-created, even if one only wants to change one qualifier value, or migrate the property being used for the qualifier.

Not sure what the answer is for this, but it may be a sharper issue for Structured Data, because there may be types of statements there (e.g. attributions?) that we will be expecting to be quite heavily qualified, as part of the data model.

One other thing to note is that QS2 is currently throttled to a much slower rate than e.g. Cat-a-lot on Commons. If the Structured equivalent of Cat-a-lot were using QS2 as a back-end (not an unreasonable possibility), it would seem 10 to 50 times more sluggish than the present Cat-a-lot.

As of today, QuickStatements supports MediaInfo items (Mxxx).
For now, you'll have to supply the IDs manually, which is a pain.
I am working on a QS syntax parser in Rust, which will support:

  • ranks
  • page/filename => ID conversion on-the-fly

This will require some more testing.
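
Until the on-the-fly conversion exists, commands for manually supplied M-ids can at least be generated mechanically. The following is a minimal sketch, assuming that the tab-separated command layout (entity, property, value) used for Wikidata items is also accepted for MediaInfo IDs. `depicts_commands` is a hypothetical helper name, and the Q-id in the usage example is only a placeholder.

```python
import re

# Assumption: a MediaInfo ID is "M" followed by a positive integer, e.g. M75908279.
MID_PATTERN = re.compile(r"^M[1-9]\d*$")


def depicts_commands(mids, depicted_qid, prop="P180"):
    """Return one tab-separated command line per M-id (hypothetical helper).

    P180 is the "Depicts" property mentioned elsewhere in this task.
    """
    lines = []
    for mid in mids:
        if not MID_PATTERN.match(mid):
            raise ValueError(f"not a MediaInfo ID: {mid!r}")
        lines.append("\t".join([mid, prop, depicted_qid]))
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder Q-id; replace with the item that the file actually depicts.
    print(depicts_commands(["M75908279"], "Q1781"))
```

The output can be pasted into the QuickStatements import box; validating the ID up front avoids submitting a batch that mixes Q-ids and M-ids by accident.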

Can my colleagues from the StructuredDataOnCommons team do anything to make it easier to include filename-to-Mid conversion in QuickStatements? It is actually a blocker for proper Commons-related batch edits, such as those outlined in T238443: Add P180 (Depicts) and P6243 (Digital representation of) structured data to Commons files representing artworks by Jakob Smits. I'll be happy to help give things a push if needed.

According to @Multichill, a file's M ID is identical to its page ID (example in file info), and there are also Concept URI links in the left-hand sidebar on Commons file pages. Perhaps retrieving these for use in tools like QuickStatements can be made easier?
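
If the M ID really is just the page ID with an "M" prefix, as stated above, the conversion is a one-liner once the page ID is known. A small sketch under that assumption follows; the function names are hypothetical, and the Concept URI pattern used here (`https://commons.wikimedia.org/entity/<M-id>`) is an assumption about what the sidebar link points to.

```python
def mid_from_page_id(page_id: int) -> str:
    """Turn a Commons file page ID into an M-id (assumes M-id == "M" + page ID)."""
    if page_id <= 0:
        raise ValueError(f"page ID must be positive, got {page_id}")
    return f"M{page_id}"


def concept_uri(mid: str) -> str:
    """Build the Concept URI for an M-id (URI pattern is an assumption)."""
    return f"https://commons.wikimedia.org/entity/{mid}"


if __name__ == "__main__":
    mid = mid_from_page_id(75908279)
    print(mid, concept_uri(mid))
```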

The comment below is a good one to take into account in the context of technical scoping/prioritization in the WMSE-Tools-for-Partnerships-2019-Blueprinting project:

FYI, @ChristianKl commented on QuickStatements2 functionality at https://commons.wikimedia.org/wiki/Commons_talk:Structured_data/Get_involved/Tools:

"One of the big problems with QuickStatements is that it's an external tool. Given that banning a user that has an active QuickStatement batch running doesn't stop the batch, there is talk on the Wikidata Admin chat that we sometimes might need to temporarily block the QuickStatementBot. Undoing QuickStatement batches is also generally cumbersome. I think it would be very valuable if the functionality would be integrated more natively. If Wikidata (or Commons for that matter) would have a representation of all the edits that come from a batch as a batch it could provide an easy way to undo/redo the batch or discuss it. ChristianKl (talk) 11:05, 26 November 2017 (UTC)"

We're at the Wiki-Techstorm-2019, and @Husky is building a tool, aptly named Minefield, to convert filenames to M item numbers; see T238908: Minefield: A tool to convert Commons page title to media ID's. For now, this can help with formatting the right commands for QuickStatements.

Can my colleagues from the StructuredDataOnCommons team do anything to make it easier to include filename-to-Mid conversion in QuickStatements?

FYI, getting the correct M-id from a page title can also be done via an action=wbgetentities API call, like so: https://commons.wikimedia.org/w/api.php?action=wbgetentities&sites=commonswiki&titles=File:2018-07-05-budapest-buda-hill.jpg
With the above API call, you'll find that the id for File:2018-07-05-budapest-buda-hill.jpg on Commons is M75908279.
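
For tool authors, the same lookup can be scripted. The sketch below builds the `wbgetentities` URL shown above for one or more titles and pulls the M-ids out of the decoded JSON. The helper names and the User-Agent string are placeholders, and the response layout it relies on (an `entities` object whose values carry `id` and `title`, with missing pages marked by a `missing` key) is an assumption that should be checked against a real response.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_lookup_url(titles):
    """Build a wbgetentities URL for a list of Commons page titles."""
    params = {
        "action": "wbgetentities",
        "sites": "commonswiki",
        "titles": "|".join(titles),  # MediaWiki separates multiple values with "|"
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def extract_mids(response):
    """Map page title -> M-id from a decoded response (layout is an assumption)."""
    result = {}
    for entity in response.get("entities", {}).values():
        mid = entity.get("id", "")
        title = entity.get("title")
        # Skip entries without an M-id or flagged as missing.
        if title and mid.startswith("M") and "missing" not in entity:
            result[title] = mid
    return result


def lookup_mids(titles):
    """Fetch M-ids over the network (not exercised here; needs connectivity)."""
    request = Request(
        build_lookup_url(titles),
        headers={"User-Agent": "mid-lookup-sketch/0.1 (placeholder)"},
    )
    with urlopen(request) as reply:
        return extract_mids(json.load(reply))
```

Keeping URL construction and response parsing separate from the network call makes both easy to test offline, and the resulting title-to-M-id map can feed directly into command generation for QuickStatements.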

The tool mentioned by @Spinster above is done and working over here:

https://tools.wmflabs.org/hay/minefield/

It also supports PagePile ids.

Magnus subscribed.

FYI, OpenRefine is (as of its version 3.6) now also able to batch add StructuredDataOnCommons to Wikimedia Commons files. Version 3.7 will do batch uploads too. See https://commons.wikimedia.org/wiki/Commons:OpenRefine for more info.

Lookup of M-ids (and retrieval of Wikitext and existing SDC statement values) is now possible there thanks to the Wikimedia Commons Reconciliation Service (https://commonsreconcile.toolforge.org); see also Reconciliation.

That said, QuickStatements remains a very useful tool for batch editing SDC too, especially for larger batches, I think.