[XL] Create a way to see and add references to structured data on Commons (MediaInfo) statements
Open, High, Public

Description

References are part of Structured Data on Commons; however, they are not visible to the end user and there is no interface for adding them. A reference (or source) is a citation that verifies the statement. A reference can be a link to a URL or an item; for example, a book. Contributors can add and manage references, and readers will see the references for the statements. Currently, there is no limit to the number of references that can be added.

A statement has an Add reference button below it.
References include fields that are grouped collectively for one reference.
Qualifiers can still be added for statements.

Adding these links for background

As a contributor, I want to add a reference, so that I can give credence to statements on Commons.

As a contributor, I want to modify or delete statement references, so the references are correct.

As a reader, I want to see references for statements on Commons, so I know the statements have been verified.

Acceptance Criteria
Given a Contributor is on the Structured data tab
When clicking on Edit
Then Qualifiers and Add qualifier appear below the statement
And References title and Add reference appear below Qualifiers

Given a Contributor wants to add a reference
When clicking on Add reference
Then a white field with Property and a gray field appear

Given a Contributor wants to create a reference grouping
When clicking on Add under the reference
Then a white field with Property and a gray field appear under the first property

Given a Contributor wants to fill in the reference information
When the Contributor fills in the Property field AND fills in the reference in the gray field
Then the Publish Changes button is active and is blue

Given a Contributor wants to publish a reference
When the Publish Changes button is clicked
Then the reference appears below the statement

Given a Contributor wants to modify a reference
When in Edit mode and clicking in a reference field
Then the reference can be modified and the Publish Changes button is active and is blue

Given a Contributor wants to delete a reference
When in Edit mode
Then the X appears and clicking X deletes the reference

Given a Reader sees a statement
Then all the references for the statement appear below the statement
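
The "Publish Changes becomes active" criterion above can be read as a simple completeness rule. The following is a minimal sketch of that reading, not the extension's actual code: the names `SnakDraft`, `ReferenceDraft` and `can_publish` are hypothetical, and it assumes that every property/value pair in every reference grouping must be filled in before publishing is allowed.

```python
from dataclasses import dataclass, field


@dataclass
class SnakDraft:
    """One property/value row inside a reference (hypothetical model)."""
    property_id: str = ""
    value: str = ""

    def is_complete(self) -> bool:
        # Both the Property field and the gray value field must be filled in.
        return bool(self.property_id.strip()) and bool(self.value.strip())


@dataclass
class ReferenceDraft:
    """One reference grouping: a list of property/value rows."""
    snaks: list = field(default_factory=list)

    def add_snak(self) -> SnakDraft:
        # Corresponds to clicking "Add" under the reference.
        snak = SnakDraft()
        self.snaks.append(snak)
        return snak


def can_publish(references) -> bool:
    """Assumed rule: at least one row exists and no row is half-filled."""
    snaks = [snak for ref in references for snak in ref.snaks]
    return bool(snaks) and all(snak.is_complete() for snak in snaks)
```

Under this assumed rule, a reference with an empty value field would keep the button inactive, which is the behaviour the criteria describe for a freshly added, unfilled row.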

Note: This feature will be released behind a feature flag so that we can test thoroughly before release.

references.jpg (2×4 px, 775 KB)

Event Timeline

I am very interested in the topic but I am not very useful in putting the feature into action.

I think SDC is superior to any other environment on the Internet in being able to record and display the provenance of statements about media items: to see where a description or a tag originated from, be it an institution, a volunteer or an automated tagging algorithm. This information can be used to correct derogatory captions, detect mistakes, etc., while at the same time creating a trace of the previous states.

I would like to take part in the discussion and intertwine it with a vision of structured metadata exchange between web platforms serving cultural heritage items.

Hi, everyone! Yes, this activity is not on the SD team roadmap for now, but after we talked to them about it and the importance this could have on GLAM-Wiki activities, they want to understand the Commons community's interest in it as well. I added a section about it on the Structured Data on Commons talk page. Please, share your thoughts there too -- and we might see this happening! Thanks!

This conversation about how to do bot additions of depiction metadata, and attribute them appropriately in Wikidata vs Commons, might be useful here as it is what spawned the recent interest in this topic.

https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot/METbot

The short version: Wikidata prefers reference statements when citing the source of a statement, while Commons has no choice but to use qualifiers to do the same. This means we have different ways in Commons and Wikidata to do the same thing, with no prospect of unifying our approach. This only adds to our list of Wikidata vs Commons woes. Not only are we a "community" with different project norms, we are now separated by differing technical capacities (or at least user interface) even though both Commons and Wikidata are using Wikibase.

Thanks for this really clear summary. Question: if they did have the same structure, would this mean we could use tools meant for Wikidata on SDC data (and maybe both at the same time), e.g. tools that use Wikidata data on Wikipedia?

I wanted to share a first pass of how we might be able to approach this on Commons. These mockups are based on the idea of having both qualifier and reference statements (row three in @Fuzheado's "Approach for attribution" table)

reference_v1.jpg (2×3 px, 816 KB)

Hi @mwilliams, thanks so much for creating these mockups. I feel having references on SDC is extremely important, especially when working with partners. My initial reaction is that they look great. My guess is that having the same structure for references and qualifiers as Wikidata would make the most sense, since it is a proven way of doing it and would, I assume, improve interoperability and make tools easier.

Thanks again

@mwilliams So, in terms of the interface, I think it's going to be a little more complicated than this mockup has captured so far. A reference does more than a qualifier does, so it's not just a matter of adding a new sub-claim with a property and value when a user selects "add reference". If you check how this works on Wikidata, clicking "add reference" creates a shaded box, within which the user can add any number of claims that are all grouped collectively as that one reference. If you click "add reference" again, you would actually be starting a new shaded box with a different group of claims for the second reference.

In the mockup, I'm not sure how this works yet. In the bottom left image, I see a place to add a claim (the text box expecting a "Property"), meaning someone has already clicked "Add reference" once, and then I see an "Add reference" button under that. On Wikidata (see below for reference), once you click to start a new reference, the user has both an "+add" button within the reference to add new claims to the reference, and an "+add reference" button under the reference to start a new reference. I just want to make sure references in SDoC are being designed as a collection of grouped claims, rather than as single disconnected claims (as qualifiers are). If this is the plan already, and it's just that maybe a new text box pops up once the first one is entered, I would still suggest that some form of shading or visual marker to show which claims constitute the same reference would be helpful.

Screen Shot 2021-08-23 at 2.18.51 PM.png (302×1 px, 26 KB)

Also, this is minor, but in Wikidata each claim is displayed with a numerical value showing the number of references (even if it is 0). Will this be done in SDoC as well?

Thanks very much for explaining this. Is what you would like basically the same way Wikidata does references? I've been trying to think of a use case that wouldn't work for the existing Wikidata approach for adding references and I can't think of one.

Yes, I think it must work this way. If you want to cite a book, you will need a title, an author, a publisher, etc. A web citation needs a URL and an access date. And so on. These claims all belong to one reference, and cannot be mixed together with claims for other references. Once you have two references for a statement, you need to know which title a given author claim belongs with. So there must be grouping. In the data, each reference is its own JSON object with a unique hash, in addition to all of its child claims. Qualifiers, by contrast, are all just standalone claims.
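
To make the grouping concrete, here is a simplified sketch of how one statement with a qualifier and two references looks in Wikibase-style JSON. It is illustrative only: datavalues are reduced to plain strings, the hashes and values are placeholders, and the helper `group_reference_snaks` is a hypothetical name, not part of any Wikibase API.

```python
# Simplified sketch of one statement in Wikibase-style JSON.
# Datavalues are reduced to plain strings; hashes and values are placeholders.
statement = {
    "mainsnak": {"property": "P180", "datavalue": "Q-EXAMPLE"},  # depicts
    # Qualifiers: standalone snaks keyed by property, one flat set.
    "qualifiers": {
        "P518": [{"property": "P518", "datavalue": "Q-EXAMPLE-PART"}],
    },
    # References: a list of groups; each group has its own hash and snaks.
    "references": [
        {
            "hash": "placeholder-hash-1",
            "snaks": {
                "P854": [{"property": "P854", "datavalue": "https://example.org/object/1"}],
                "P813": [{"property": "P813", "datavalue": "2021-08-23"}],
            },
            "snaks-order": ["P854", "P813"],  # reference URL, retrieved
        },
        {
            "hash": "placeholder-hash-2",
            "snaks": {
                "P1476": [{"property": "P1476", "datavalue": "An Example Book"}],
                "P50": [{"property": "P50", "datavalue": "Q-EXAMPLE-AUTHOR"}],
            },
            "snaks-order": ["P1476", "P50"],  # title, author
        },
    ],
}


def group_reference_snaks(stmt):
    """Return [(hash, [(property, value), ...]), ...], one entry per reference."""
    groups = []
    for reference in stmt.get("references", []):
        order = reference.get("snaks-order", list(reference["snaks"]))
        pairs = [
            (prop, snak["datavalue"])
            for prop in order
            for snak in reference["snaks"][prop]
        ]
        groups.append((reference["hash"], pairs))
    return groups
```

Because each reference carries its own hash and its own set of snaks, the title and author in the second group can never be confused with the URL and access date in the first.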

Thanks very much for confirming. Do you know who in WMDE is knowledgeable about how this all works? @Addshore, are you able to suggest someone?

Thanks for the great feedback. @Dominicbm you are correct that this needs to be a collection of grouped claims and that my mockups didn't communicate that functionality.

reference.jpg (1×1 px, 184 KB)
Here is a quick mockup of how that might work, keeping each grouped claim in some sort of gray/shaded box. I'd love to keep the functionality as close as possible to Wikidata, while still mapping to the visual design choices that have already been made with SDoC.

Most of the design decisions that shifted away from exactly replicating Wikidata pre-date my involvement in this work; I'll try to track down some context around that. I'm sure we could revisit those decisions if needed, but I wouldn't want to conflate that work with this new feature.

Also, this is minor, but in Wikidata each claim is displayed with a numerical value showing the number of references (even if it is 0). Will this be done in SDoC as well?

I haven't quite figured this out but happy to spend more time on it if it feels useful.

I think this looks almost perfect. I would not use "Add reference" both inside and outside the reference gray box. In my understanding, the entire grouping in one box is a single reference. Wikidata just says "Add" for the claims within the reference, so it's a little less confusing.

I'm not 100% sure what people use this for in Wikidata, so it's not a big deal to me, but I do find value in aligning the design (unless there is intention behind a change), so that was the main reason I brought it up.

My assumption is there would be a great deal of benefit in harmonising with Wikidata as much as possible for documentation, tools, ease of use, etc., so if in doubt, make it the same as Wikidata. This may also allow improvements to be carried over from Wikidata more easily.

Wikidata is a secondary knowledge base, meaning it tries to replicate data that has its primary source somewhere else and then indicates the provenance of that data through references. Recording references for the data is at the core of the idea of Wikidata, so we want references for as much data as possible there. This led to the decision to always show the number of references. If it is 0, that should be a nudge towards adding a reference to the statement. If it is not 0, it should give some degree of confidence in that statement.
For Commons I don't have a strong opinion on showing it or not showing it. I think the desire to add references for every statement is less there than on Wikidata.
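
As a rough illustration of the per-statement counter described above, the sketch below counts references for each statement, including zeros. The statement ids and the helper names `count_references` and `unreferenced` are hypothetical; the only assumption about the data is that each statement is a dict with an optional `references` list, as in Wikibase-style JSON.

```python
def count_references(statements):
    """Map each statement id to its number of references (0 if none)."""
    return {stmt["id"]: len(stmt.get("references", [])) for stmt in statements}


def unreferenced(statements):
    """Ids of statements that would show a count of 0."""
    counts = count_references(statements)
    return [stmt_id for stmt_id, count in counts.items() if count == 0]


# Hypothetical example data: ids and hashes are placeholders.
example_statements = [
    {"id": "stmt-a", "references": [{"hash": "h1"}, {"hash": "h2"}]},
    {"id": "stmt-b", "references": []},
    {"id": "stmt-c"},  # no "references" key at all
]
```

Whether Commons should surface this number in the interface is the open design question in this thread; the sketch only shows that the count is cheap to derive from the stored data.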

image.png (261×388 px, 28 KB)

This bit should probably say something else, but otherwise this latest mock looks good and aligns with the datamodel being used to store these references (same as on wikidata)

Just to pick on terminology a bit, a reference is made up of a group / collection of snaks

https://wmde.github.io/wikidata-wikibase-architecture/Glossary.html#snak

Claims are essentially synonyms to statements at this stage https://wmde.github.io/wikidata-wikibase-architecture/Glossary.html#statement

Thanks very much @Addshore and @Lydia_Pintscher, one thing I don't understand is who in practice could implement this?

This is up to the team at the WMF responsible for structured data on Commons.

SWakiyama updated the task description. (Show Details)
MarkTraceur renamed this task from Create a way to see and add references to structured data on Commons (MediaInfo) statements to [XL] Create a way to see and add references to structured data on Commons (MediaInfo) statements. Sep 8 2021, 4:59 PM

Change 726883 had a related patch set uploaded (by Matthias Mullie; author: Matthias Mullie):

[mediawiki/extensions/WikibaseMediaInfo@master] [WIP] Add support for references

https://gerrit.wikimedia.org/r/726883

@matthiasmullie if you need any user guinea pigs to test things out let me know :)

Implementation revealed a few additional questions around the design & interaction of the edit mode, for references and qualifiers.
We've experimented with a few approaches, filled the gaps & landed on this:

TL;DR:

  • references & qualifiers will look the same (except that there can be multiple groups of references; qualifiers only 1)
  • they'll be grey "blocks" of snaks in edit mode
  • we're dropping the leading lines in edit mode

LMK if there are unresolved issues with this implementation. Otherwise, this is what we'll move forward with.

@matthiasmullie really nice video, very clear, thanks :) One question, how is this different to how Wikidata does it? I'd hate to miss something at this point that means going back and redoing things

@matthiasmullie thank you for recording this video! However, I was wondering if you could record another one using actual references. We were showing this to some GLAM partners, as @Dominicbm, and we were a bit confused about how it actually works, how it would look once ready. Thank you!

how is this different to how Wikidata does it?

Not really; they're pretty much the same across Wikidata & Commons, just a little different UX/UI to stay in line with what we already had for qualifiers.

I was wondering if you could record another one using actual references.

I'll try to put something together next week!

I was wondering if you could record another one using actual references.

Here's a video with the original example - does that clear things up?

@matthiasmullie Thanks very much for the new video, very clear

Thank you, @matthiasmullie! This looks so much better and it's more understandable.

FYI concerning: "Note: This feature will be released behind a feature flag so that we can test thoroughly before release."
-> The feature flag will only toggle the "references" functionality. This work also affects how other areas look a bit (qualifiers): those changes will go through immediately, even if support for references has not yet been enabled.

Change 737370 had a related patch set uploaded (by Matthias Mullie; author: Matthias Mullie):

[operations/mediawiki-config@master] Explicitly disable references support on Commons

https://gerrit.wikimedia.org/r/737370

Change 737382 had a related patch set uploaded (by Matthias Mullie; author: Matthias Mullie):

[mediawiki/extensions/WikibaseMediaInfo@master] Remove references feature flag

https://gerrit.wikimedia.org/r/737382

This got stalled a bit because these additional UI elements & interaction revealed an issue with some of the code that renders all elements on the page (T296616) - that has to be fixed before we roll out support for references. I think I have a working fix - it needs some more testing, but this is moving forward again.

Thanks so much for your work on this :)

Change 737370 merged by jenkins-bot:

[operations/mediawiki-config@master] Explicitly disable references support on Commons

https://gerrit.wikimedia.org/r/737370

Change 726883 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@master] Add support for references

https://gerrit.wikimedia.org/r/726883

Change 743354 had a related patch set uploaded (by Matthias Mullie; author: Matthias Mullie):

[operations/mediawiki-config@master] Enable references support on beta Commons

https://gerrit.wikimedia.org/r/743354

Change 743354 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable references support on beta Commons

https://gerrit.wikimedia.org/r/743354

Etonkovidova added a subscriber: Etonkovidova.

Checked on betalabs commons - verified all specs/scenarios listed in the task description.

Two notes
(1) Filed T297171: Structured data - references can be published without property value - it's a recoverable error state (requires page reload).
(2) [minor] The mockup (step 2) displays connecting lines between statements/qualifiers/references in Edit mode. The current implementation doesn't have it.

mockup - Edit mode: Screen Shot 2021-12-06 at 2.10.06 PM.png (1×1 px, 427 KB)
commons betalabs - Edit mode: Screen Shot 2021-12-06 at 2.07.57 PM.png (720×1 px, 83 KB)

After publishing, those connecting lines are present (see commons betalabs File:Rose_yellow_11.jpg):

Screen Shot 2021-12-06 at 3.19.19 PM.png (1×1 px, 141 KB)

(2) [minor] The mockup (step 2) displays connecting lines between statements/qualifiers/references in Edit mode. The current implementation doesn't have it.

Correct.
These connecting lines posed a bit of a problem: the varying use between qualifiers/references and read/edit could be confusing, and they'd be tough to implement.
I discussed them with @mwilliams on Slack, we agreed to drop them, and settled on the implementation shown in the video in this comment: https://phabricator.wikimedia.org/T230315#7411888

Thanks! Moving to Verify on Production.

Change 745532 had a related patch set uploaded (by Matthias Mullie; author: Matthias Mullie):

[mediawiki/extensions/WikibaseMediaInfo@master] Add url param to allow enabling references for testing

https://gerrit.wikimedia.org/r/745532

Once the above patch is deployed, support for references can be enabled by adding ?MediaInfoEnableReferences to the URI.
This will let us do some final tests on production before we switch it on for real.
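
For testers who want to script this, a small sketch of appending the parameter to an arbitrary page URL follows. It uses only Python's standard `urllib.parse`; the helper name `with_references_enabled` and the example URLs are assumptions for illustration, and the parameter is added without a value, as written in the comment above.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit

PARAM = "MediaInfoEnableReferences"


def with_references_enabled(url: str) -> str:
    """Return url with the testing parameter appended, if not already present."""
    parts = urlsplit(url)
    # keep_blank_values=True so a bare "?MediaInfoEnableReferences" is detected.
    keys = [key for key, _ in parse_qsl(parts.query, keep_blank_values=True)]
    if PARAM in keys:
        return url
    query = f"{parts.query}&{PARAM}" if parts.query else PARAM
    return urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    print(with_references_enabled("https://commons.wikimedia.org/wiki/File:Example.jpg"))
```

The helper leaves URLs that already carry the parameter untouched, so it can be applied repeatedly to a list of test pages.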

Change 745532 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@master] Add url param to allow enabling references for testing

https://gerrit.wikimedia.org/r/745532

@Etonkovidova are we at the stage yet where we can turn this on in production?

Yes, I checked adding references on commons wmf.16 (with ?MediaInfoEnableReferences) and could not see any issues.

Some notes of caution: 1) commons betalabs is quite different from production; 2) the use cases for the feature itself, i.e. adding references to structured data, might be quite different from what I was able to anticipate.
However, with the URL parameter ?MediaInfoEnableReferences, these can be clarified by testing references in real use-case situations.

Change 752599 had a related patch set uploaded (by Matthias Mullie; author: Matthias Mullie):

[operations/mediawiki-config@master] Enable support for references

https://gerrit.wikimedia.org/r/752599

Change 752599 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable support for references

https://gerrit.wikimedia.org/r/752599

Mentioned in SAL (#wikimedia-operations) [2022-01-11T12:15:51Z] <cparle@deploy1002> Synchronized wmf-config: Config: [[gerrit:752599|Enable support for references (T230315)]] (duration: 01m 00s)

Seems this is live on production. Great to see this finally working!

Change 737382 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@master] Remove references feature flag

https://gerrit.wikimedia.org/r/737382