Implement Structured Data on Commons into a module
Open, Needs Triage, Public

Description

Several community members, especially from the GLAM-Wiki community, have expressed interest in a Lua module or template on Wikipedia that would work with Structured Data on Commons (SDC) in much the same way as Module:WikidataIB, which pulls Wikidata information into Wikipedia to populate Wikidata-driven infoboxes and templates such as the Mbabel ones.

This would be very useful for automatically generating a caption and reference for an image from its Structured Data on Commons, as @Dominicbm is exploring in this SDC citation example.

Event Timeline

GFontenelle_WMF renamed this task from Implement Strucutred Data on Commons into a module to Implement Structured Data on Commons into a module. Oct 27 2021, 10:52 PM
GFontenelle_WMF updated the task description.

Thanks for making this. I also started trying to build a metadata template that is fully powered by a file's own structured data, so that a media file would need no plain-text metadata on its wiki page when all required properties are present. However, I ran into several things that are impossible using only the existing {{#property}} parser function. Here are some examples:

  • For a media file, I would like to be able to get a Q-id for the item value of a given property.
    • Use case: The #property parser function currently provides access only to the label, as plain text. For example, where an SDC statement stores the source institution of an image, I want more than that institution's Wikidata item label. I want the Q-id, so that a template can look at the properties of the institution's Wikidata item (using existing Wikidata modules) to display its logo, put the image in the institution-based category, and so on.
  • I would like to be able to access the claims in qualifiers (and soon references) of statements.
    • Use case 1: Some properties in SDC are modeled so that the most relevant piece of information is actually in a qualifier. For example, if the creator is known only as a string, you add P170 (creator, which has the item data type) with "some value", and then put the creator string in a P2093 (author name string) qualifier. (example)
    • Use case 2: For some purposes, information should be displayed only if a particular qualifier or reference is present. For example, we use "determination method" -> "determined by GLAM institution and stated at its website" to mark statements that are sourced to the institution's authoritative original metadata. For citation, we would want to display only this metadata, and not other claims that the community has also added. As another example, a property might have multiple values for valid reasons. With no access to the individual statements and no way to ask for a particular one, #property will always just give you a plain-text list of the values separated by commas. You cannot even tell how many values you are getting, because commas are also valid in Wikidata labels.
  • For a given Commons file, display the file caption.
    • This is already implemented on Commons using the {{file caption}} template. But I am not aware of a way to display a caption outside of Commons, for example in Wikipedia, where you might want to display it with an image in an article.
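To make the three requests above concrete, here is a rough sketch of the lookups I have in mind. It is written in Python against a hand-made dictionary shaped like the Wikibase JSON for a MediaInfo entity, not as a real Scribunto module. The function names are my own, and all ids and values in the sample are made up (the sample also omits fields such as "entity-type" that real data carries). P195 stands in for whichever property holds the source institution, P459 is what I understand to be "determination method", and Q200 is a placeholder for the determination-method value.

```python
# Hand-written sample shaped like the Wikibase JSON of a MediaInfo entity.
# Every id and value here is illustrative, not real data.
SAMPLE = {
    "id": "M1",
    "labels": {"en": {"language": "en", "value": "Portrait of a sailor"}},
    "statements": {
        "P195": [
            {
                "mainsnak": {"snaktype": "value", "datavalue": {
                    "type": "wikibase-entityid", "value": {"id": "Q100"}}},
                "qualifiers": {"P459": [{"snaktype": "value", "datavalue": {
                    "type": "wikibase-entityid", "value": {"id": "Q200"}}}]},
            },
            {
                "mainsnak": {"snaktype": "value", "datavalue": {
                    "type": "wikibase-entityid", "value": {"id": "Q300"}}},
            },
        ],
        "P170": [
            {
                # "some value": the main snak carries no datavalue at all.
                "mainsnak": {"snaktype": "somevalue"},
                "qualifiers": {"P2093": [{"snaktype": "value", "datavalue": {
                    "type": "string", "value": "Doe, Jane"}}]},
            }
        ],
    },
}


def snak_value(snak):
    """Return the raw value of a snak, or None for 'some value' / 'no value'."""
    if snak.get("snaktype") != "value":
        return None
    return snak["datavalue"]["value"]


def item_ids(entity, prop):
    """Q-ids of the item values of `prop`, one list element per statement."""
    ids = []
    for statement in entity.get("statements", {}).get(prop, []):
        value = snak_value(statement["mainsnak"])
        if isinstance(value, dict) and "id" in value:
            ids.append(value["id"])
    return ids


def qualifier_values(statement, qualifier):
    """Raw values of one qualifier property on a single statement."""
    snaks = statement.get("qualifiers", {}).get(qualifier, [])
    values = [snak_value(snak) for snak in snaks]
    return [value for value in values if value is not None]


def statements_with_qualifier(entity, prop, qualifier, wanted_id):
    """Statements of `prop` whose `qualifier` points at the item `wanted_id`."""
    matches = []
    for statement in entity.get("statements", {}).get(prop, []):
        values = qualifier_values(statement, qualifier)
        if any(isinstance(v, dict) and v.get("id") == wanted_id for v in values):
            matches.append(statement)
    return matches


def caption(entity, lang="en"):
    """The file caption (MediaInfo label) in `lang`, or None if absent."""
    label = entity.get("labels", {}).get(lang)
    return label["value"] if label else None
```

Because every function returns a list with one element per statement, a value such as "Doe, Jane" stays a single value, which avoids the comma ambiguity described above.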

For all of these, since Commons pages have unique titles that are separate from the M-id (in contrast to Wikidata), it would be useful to be able to get this data back when providing just a Commons file name rather than an M-id.
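As a rough illustration of the file-name lookup: as far as I understand, the M-id is just "M" followed by the page ID of the file page on Commons, so a module mainly needs a title-to-page-ID step. The Python sketch below works on a hand-written dictionary shaped like the JSON of an action=query API response (default format). The function name and the sample titles and IDs are made up.

```python
# Hand-written sample shaped like an action=query response.
# The titles and the page ID are illustrative only.
RESPONSE = {
    "query": {
        "pages": {
            "12345": {"pageid": 12345, "ns": 6, "title": "File:Example.jpg"},
            "-1": {"ns": 6, "title": "File:Missing.jpg", "missing": ""},
        }
    }
}


def mid_from_query_response(response, title):
    """Return 'M<pageid>' for `title`, or None if the page is not found."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        # Missing pages have no "pageid" key, so they yield no M-id.
        if page.get("title") == title and "pageid" in page:
            return "M" + str(page["pageid"])
    return None
```

This assumes the title is already in its normalized form (for example "File:Example.jpg" with the namespace prefix); a real implementation would also need to handle title normalization and redirects.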

On Commons, the best solution might be to simply modify existing modules in widespread use, such as https://commons.wikimedia.org/wiki/Module:WikidataIB, so that they work with either a Q-id or an M-id, since the underlying structure is the same for both.
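To show how small that change could be, here is a Python sketch (again not actual module code) of an accessor that treats both entity types alike. It assumes the one difference I am aware of in the JSON: Wikidata items keep their statements under "claims", while MediaInfo entities use "statements". The function names and sample ids are made up.

```python
# Hand-written samples; ids are illustrative only.
ITEM = {"id": "Q100", "claims": {"P31": [
    {"mainsnak": {"snaktype": "value", "datavalue": {"value": {"id": "Q5"}}}}]}}
MEDIA = {"id": "M1", "statements": {"P180": [
    {"mainsnak": {"snaktype": "value", "datavalue": {"value": {"id": "Q100"}}}}]}}


def statement_groups(entity):
    """Property-to-statements mapping for either a Q or an M entity."""
    groups = entity.get("statements") or entity.get("claims") or {}
    # Defensive: tolerate a non-dict (for example an empty list) here.
    return groups if isinstance(groups, dict) else {}


def main_values(entity, prop):
    """Raw main-snak values of `prop`, skipping 'some value' / 'no value'."""
    values = []
    for statement in statement_groups(entity).get(prop, []):
        snak = statement["mainsnak"]
        if snak.get("snaktype") == "value":
            values.append(snak["datavalue"]["value"])
    return values
```

Everything below the accessor (main snaks, qualifiers, ranks) can then be shared between the two entity types.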