
CommonsMetadata should write additional metadata to JSON-LD if possible
Open, Needs Triage | Public | Feature

Description

Feature summary:
CommonsMetadata should add metadata like creator, copyrightNotice, creditText to JSON-LD when possible.

Use case(s):
Images on Commons are missing some metadata in JSON-LD. Most of the time the missing information is already available (written inside Template:Information and parsed by CommonsMetadata), but it is not machine-readable because it is not written out in any metadata format.

Since CommonsMetadata already does the same with licenses (reads them from the license template, parses them, and writes them into JSON-LD), and the Template:Information parser is already in place, it should be trivial to add the missing metadata to JSON-LD.
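To illustrate the kind of output this request is after, here is a minimal sketch in Python (not CommonsMetadata's actual PHP code). The function name `build_image_jsonld` and the input keys (`content_url`, `license_url`, `author`, `attribution`, `copyright_notice`) are hypothetical placeholders for whatever the parser already extracts; the output property names (`creator`, `creditText`, `copyrightNotice`, `license`, `contentUrl`) are schema.org ImageObject properties.

```python
import json


def build_image_jsonld(fields):
    """Build a schema.org ImageObject dict, omitting properties with no value.

    `fields` is a hypothetical dict of already-parsed values; the key names
    are assumptions for this sketch, not CommonsMetadata's real field names.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "ImageObject",
    }
    if fields.get("content_url"):
        doc["contentUrl"] = fields["content_url"]
    if fields.get("license_url"):
        doc["license"] = fields["license_url"]
    if fields.get("author"):
        # Assumes the author is a person; see the discussion on
        # distinguishing persons from organizations.
        doc["creator"] = {"@type": "Person", "name": fields["author"]}
    if fields.get("attribution"):
        doc["creditText"] = fields["attribution"]
    if fields.get("copyright_notice"):
        doc["copyrightNotice"] = fields["copyright_notice"]
    return doc


if __name__ == "__main__":
    example = build_image_jsonld({
        "content_url": "https://example.org/Example.jpg",
        "license_url": "https://creativecommons.org/licenses/by-sa/4.0/",
        "author": "Jane Doe",
        "attribution": "Jane Doe / Example Archive",
    })
    print(json.dumps(example, indent=2))
```

Properties without a source value are left out rather than emitted empty, so an image with no attribution simply has no `creditText`.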

Benefits:

  • Make information accessible by other parties (e.g. Google Images)

Event Timeline

The fields mentioned by the Google metadata linter: creator (an alias for author), copyrightNotice, creditText.

Distinguishing between a person and an organization would likely require work on the Information template and possibly UploadWizard. My proposal is to expose the identifier as an HTML class in the Information template and have CommonsMetadata parse it, as the existing implementation does for other fields.

If no identifier exists, it can either fall back to a person or just not emit anything. I'm not familiar enough with Commons to make an informed decision though.
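A rough sketch of that idea in Python (again, not the extension's PHP code). The class names `creator-person` and `creator-organization` are invented for illustration; no such classes exist in the Information template today. The `fallback_to_person` flag covers both options for the case where no identifier is present.

```python
from html.parser import HTMLParser

# Hypothetical marker classes; the real names would need to be agreed on.
CLASS_TO_TYPE = {
    "creator-person": "Person",
    "creator-organization": "Organization",
}
# Void elements never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "wbr", "input", "meta", "link"}


class CreatorParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.creator_type = None
        self.marked_text = []  # text inside the first marked element
        self.all_text = []     # all text, used for the fallback
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
            return
        if self.creator_type is not None:
            return  # only the first marked element is used
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in CLASS_TO_TYPE:
                self.creator_type = CLASS_TO_TYPE[cls]
                self._depth = 1
                break

    def handle_endtag(self, tag):
        if self._depth and tag not in VOID_TAGS:
            self._depth -= 1

    def handle_data(self, data):
        self.all_text.append(data)
        if self._depth:
            self.marked_text.append(data)


def extract_creator(html, fallback_to_person=True):
    """Return a schema.org-style creator dict, or None if nothing usable."""
    parser = CreatorParser()
    parser.feed(html)
    parser.close()
    if parser.creator_type is not None:
        name = " ".join("".join(parser.marked_text).split())
        return {"@type": parser.creator_type, "name": name} if name else None
    name = " ".join("".join(parser.all_text).split())
    if name and fallback_to_person:
        return {"@type": "Person", "name": name}
    return None
```

With `fallback_to_person=False`, an author field without a marker class produces no `creator` at all, which corresponds to the "not emit anything" option.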

  • The info template does not have a reliable machine-readable way of expressing copyright notices; the Attribution field seems easy enough to make machine-readable though.
  • It's not very clear to me from the description what creditText is supposed to be. I don't think we have anything matching in the information template.

My original thought was to use the Attribution field for creditText only. Maybe copyrightNotice can be handled by the license templates?

I can work on a patch if we can decide on how to approach it. Would it be more suitable to have one patch per metadata field?