Media Viewer should display Attribution instead of author when set (by {{Credit line}} and other templates)
Closed, Resolved · Public

Description

Some Wikimedia Commons images support custom attribution via machine-readable data provided by [[commons:Template:Credit Line]] (and some others). CommonsMetadata parses these attributions and provides them via the 'Attribution' property of its API result. Whenever that property exists, the author name should be replaced by the value of the 'Attribution' property in the HTML / plain text generated in the Download and Embed panels.
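
The requested behavior can be sketched roughly as follows. This is an illustration only, not Media Viewer's actual code: the function name `pick_credit` is hypothetical, and the sketch assumes the API result is a mapping of property names to objects with a `value` key, with the author stored under `Artist`.

```python
from typing import Mapping, Optional


def pick_credit(extmetadata: Mapping[str, Mapping[str, str]]) -> Optional[str]:
    """Return the text to credit: 'Attribution' if set, else the author.

    Assumption: each property looks like {"value": "..."}; the 'Artist'
    key for the author name is assumed, not taken from this task.
    """

    def value(key: str) -> Optional[str]:
        field = extmetadata.get(key)
        if not field:
            return None
        text = field.get("value", "").strip()
        return text or None  # treat empty strings as "not set"

    # A custom attribution replaces the author name whenever it exists.
    return value("Attribution") or value("Artist")
```

An empty or whitespace-only 'Attribution' value is treated as absent here, so the author name is still used in that case.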

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:20 AM
bzimport added a project: MediaViewer.
bzimport set Reference to bz65445.
bzimport added a subscriber: Unknown Object (MLST).

I believe this is a duplicate of bug 57460.

Kind of. Your bug was about instituting Credit Line in a specific way. Gergo seemed sure of other ways to implement the needed information. It's six months later and we're almost done with this version, and the issue hasn't been addressed. If it's as simple as adding the class id, then we should start spreading the word about that now.

Otherwise feel free to mark this an actual duplicate.

The patch mentioned there has been floating for half a year - I'll poke someone to review it. Anyway, that bug is about CommonsMetadata providing the information, this one is about MediaViewer actually using it, which is harder since it needs UX design and the interface is already overloaded with information.

Hi all,

The {{Credit line}} template is problematic in three important ways:

  1. It repeats information in an error-prone manner. Author, license and source are typically repeated and need to be maintained in two places. The result is that many {{Credit line}} invocations are actually incorrect or problematic. Here's an example call:

https://commons.wikimedia.org/w/index.php?title=Template:WojciechowscyAttribution&action=edit

As you can see, some CC licenses are linked, others aren't, and the GFDL link goes to the Wikipedia article about the GFDL. This is very typical. The template attempts to substitute license names magically with links through some error-prone code where a missing character will cause a license URL to be omitted.

  2. In many, probably most, invocations, it adds no new information. Media Viewer already provides attribution both using the author name and source URL. Typically this results in a virtually identical credit line (both below the image and using the "Use this file" dialog). So the actual severity of this issue is being dramatically exaggerated.
  3. The template, in its current form, emits all its text as a single machine-readable string. In order to display it in a manner similar to other attribution info (i.e. license in one place, author in another), the machine-readable output would need to be modified.

It's not clear to us that the template in its current form should exist or should be supported. To the extent that uploaders want to provide custom attribution strings beyond source URL and author, there may be better ways to do this, so that information like license name and author name is not duplicated in two places.

At minimum, we would need to address 3) - we could then ignore the problematic license string in favor of Media Viewer's own license parsing code, and substitute author/source information with the attribution string in the template. However, we think this would be a cop-out -- we should find a way to solve this problem that minimizes information repetition.

None of these issues are specific to Media Viewer; they will impact all re-users of Wikimedia content. (For example, any re-user would be including error-prone licensing information by relying on "credit line" attribution strings which sometimes link to the actual license and sometimes not.)

  1. Agreed
  2. Agreed, though there are a few uses that are more esoteric and/or where people request to list multiple licenses, for instance. I do note that these can be seen as 'requests from the author', not as legal requirements. We should however not forget that some people might make image donations based on their assumptions that these statements will be shown to potential re-users. That is an important 'social contract' that we need to be very careful with.
  3. We should not have to do that. As I see it, such statements are 'full overrides' of whatever we can gather from metadata statements. And thus full responsibility is with the author of that Credit line. I say just parse it out, strip all html other than <a> and show it in full place instead of anything else we can gather, when providing attribution.
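
For illustration, the "strip all html other than <a>" step could look roughly like the sketch below, written with Python's standard `html.parser`. The names `AnchorOnlyFilter` and `strip_all_but_links` are hypothetical, and dropping every attribute except an http(s) `href` is an added assumption rather than part of the proposal.

```python
from html import escape
from html.parser import HTMLParser


class AnchorOnlyFilter(HTMLParser):
    """Keeps text and <a> elements; silently drops every other tag."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # Assumption: only http(s) links are kept; other schemes lose the href.
        if href.lower().startswith(("http://", "https://")):
            self.parts.append('<a href="%s">' % escape(href, quote=True))
        else:
            self.parts.append("<a>")

    def handle_endtag(self, tag):
        if tag == "a":
            self.parts.append("</a>")

    def handle_data(self, data):
        # Re-escape text so stray '<' or '&' cannot reintroduce markup.
        self.parts.append(escape(data, quote=False))


def strip_all_but_links(html_text: str) -> str:
    parser = AnchorOnlyFilter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

The sketch does not rebalance unmatched `<a>` tags, so a real implementation would need further cleanup for malformed input.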

(In reply to Derk-Jan Hartman from comment #6)

I say just parse it out, strip all html other than <a> and show it in full
place instead of anything else we can gather, when providing attribution.

That would often result in attribution that is not compliant with the license. I don't think shoving the responsibility to the author (who might or might not have understood the semantics of {{credit line}}) can justify doing that.

We show incorrect data all the time right now, so why should we be overly correct on something where people want to deviate from the standard?

Our communities often want control, and the community is often very self-correcting. This is just a small subgroup of highly experienced users. Let them figure it out on their own, I say.

Or we can go draft two descriptions for both plans and let them vote on it.

Regarding the extent of the usage of {{Credit line}}, I just ran
http://quarry.wmflabs.org/query/354

I have sampled some of the biggest users:

  • a seemingly common use case is to attribute Wikimedia Commons as an “Attribution Party”
  • one user used it to provide the title of the work
  • several users do not provide more information than Author / License.

I personally won’t cry over the disappearance of {{Credit line}}, as I believe we have alternate mechanisms (mainly the |attribution= parameter of CC templates) that *may* provide the same functionality.

That said, I would rather suggest *talking* to these users and asking them *why* they use it, and whether other mechanisms suit their needs. I’m sure many of them would be happy to migrate if they are provided with an equivalent or superior alternative and a migration tool (that should be a piece of cake for a bot).

(Although, even now that we have “Reuse file” mechanisms (through StockPhoto or MV), I recall discussions with users who always felt: “Why does one need to find the ‘Reuse’ and click on it? A Credit line should be displayed at all times.”)

Indeed, having the top-50 users migrate would 'fix' 95% of the files (192K out of 202K).

Some context that seems overlooked: Credit line was created five years ago, in August 2009. This was more than a year before Magnus deployed the StockPhoto gadget. At that time, Wikimedia Commons was incapable of building/displaying a ready-made, copy-pastable Credit line. This template, I believe, spread among users who cared so much about license compliance, and wanted so much to make it easy and obvious for reusers, that they were ready to take upon themselves the extra work of adding the template with repeated information.

I am very happy that we have and are (finally…) building better tools for license compliance, methods that work at the scale of the millions (no pun intended ;) of files we have − and am thankful to all the people working on this, including the necessary standardisation. But it kinda sounds to me that Credit line users are framed as eccentrics who on a whim get in the way of license compliance & file reuse − I do not believe this is true: they were the ones who cared when it seemed no one else did.

Perhaps we should mass-message them?

(In reply to Jean-Fred from comment #9)

Some context that seems overlooked : Credit line was created five years ago,
in August 2009. This was more than a year before Magnus deployed the
StockPhoto gadget. At that time, Wikimedia Commons was incapable of
building/displaying a ready-made, copy-pastable Credit line. This template,
I believe, spread among users who cared so much about license compliance and
wanted to make it easy and obvious for reusers, that they were ready to take
upon themselves the extra work of adding the template with repeated
information.

Kudos to the people who did extra work to make reusers' lives easier, certainly. But today, Commons (and Media Viewer) *is* capable of building a credit line from other elements of the file page. Can we close this bug as invalid, then?

I would suggest the following changes to Media Viewer and the template:

  1. In the template, emit licensetpl_attr only for the actual attribution component (which is already a separate parameter, i.e. the combination of "Author" and "Other"). License is already emitted in more predictable machine-readable form by other templates, but can still be rendered in the output for easy copy & paste.
  2. If licensetpl_attr is set, prioritize it as the preferred attribution, both below the image and in the attribution provided by the "Use this file" dialog.

Note that this also addresses issues with attribution parameters provided by other templates, e.g. this file:
https://commons.wikimedia.org/wiki/File:(Boletus_edulis).jpg

Where the licensetpl_attr metadata specifies a full author name that's not rendered in Media Viewer.

Does that sound reasonable to folks on this bug? It would probably be worth rethinking the attribution concepts a bit more, but that can be done as part of the longer term structured data efforts.

There are multiple templates which set licensetpl_attr, and some of them use free text (the attribution field in some license templates, for example), so changes to {{Credit line}} wouldn't be enough to guarantee consistent content for the licensetpl_attr field.

It wouldn't be, but one step at a time. Separating out the non-attribution content from {{Credit line}} seems like the first step to make licensetpl_attr actually usable in some cases.

The aforementioned example https://commons.wikimedia.org/wiki/File:(Boletus_edulis).jpg is problematic for other reasons. It uses the {{Attribution}} template, which denotes a non-copyleft license (!) in the "permission" field, in combination with the CC-BY-SA and GFDL licenses, which _are_ copyleft.

It looks like the author intended to use {{Credit line}} and would have been better off using neither and just specifying the author correctly. These templates create a lot more confusion than they solve, IMO, but let's sort through this mess and see if they're supportable.

I left a comment regarding the {{Credit line}} machine-readable metadata here: https://commons.wikimedia.org/wiki/Template_talk:Credit_line

*sigh* It's actually a bit more complex, as currently explained on the talk page.

At the file level, we have fileinfotpl_credit.

At the file level, {{Credit line}} also sets licensetpl_attr, although its creator says it was intended to be used only in the scope of a specific license, and our metadata parser indeed only interprets it in that scope. So that's clearly broken.

At the license level, licensetpl_attr may or may not be set. Many dual-licensed files only set it for one license and not the other -- even when both licenses require attribution. So that's clearly broken, as well.

I still remain skeptical how useful these custom bylines are (as opposed to just setting the author field), but if we want to support them, we probably have to support both -- per-file and per-license. Per-file is IMO much more useful, because it can be sanely applied across licenses requiring attribution.
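
To make the "support both" idea concrete, here is a rough sketch of one possible lookup order. It is only an assumption for discussion: the function `resolve_attribution` and its parameters are hypothetical, and putting the per-file credit ahead of the per-license one simply mirrors the preference stated above, not any existing parser behavior.

```python
from typing import Dict, Optional


def resolve_attribution(
    license_name: str,
    author: Optional[str],
    file_credit: Optional[str] = None,
    license_attributions: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Pick the byline to show for a file reused under `license_name`.

    Assumed precedence: per-file credit (fileinfotpl_credit), then the
    per-license attribution (licensetpl_attr), then the plain author.
    """
    if file_credit:
        return file_credit
    per_license = (license_attributions or {}).get(license_name)
    if per_license:
        return per_license
    return author
```

With this order, a dual-licensed file that sets an attribution for only one license still falls back to the author field for the other license instead of showing nothing.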

Please chip in on the discussion for possible options to clean this up. Right now it's a mess and no third party re-user could be expected to make sense of it.

Adding Guillaume to dig further into this since he'll tackle the problems on the template side of things as part of a file metadata cleanup drive.

(In reply to Erik Moeller from comment #15)

I still remain skeptical how useful these custom bylines are (as opposed to
just setting the author field), but if we want to support them, we probably
have to support both -- per-file and per-license. Per-file is IMO much more
useful, because it can be sanely applied across licenses requiring
attribution.

Please chip in on the discussion for possible options to clean this up.
Right now it's a mess and no third party re-user could be expected to make
sense of it.

We don’t have to support it :). As I said earlier, the usage is widespread but localised to only a few dozen users.
http://quarry.wmflabs.org/query/354

I suppose the point is to support their use case / workflow somehow, not the template per se. Guillaume, could you help with that?

I've followed up at https://commons.wikimedia.org/wiki/Template_talk:Credit_line#Inconsistent_use_of_licensetpl_attr regarding the separation of attribution and license information in {{credit line}}.

Regarding working with the users of {{credit line}}, I've started a discussion at https://commons.wikimedia.org/wiki/Commons_talk:Credit_line#Improving_attribution_and_credit_lines .

Gilles triaged this task as Medium priority. Nov 24 2014, 1:37 PM
Gilles subscribed.
Tgr added a project: good first task.
Tgr set Security to None.

Change 182037 had a related patch set uploaded (by Sn1per):
Show custom Attribution line instead of Author/Credit when available

https://gerrit.wikimedia.org/r/182037

Patch-For-Review

Sn1per renamed this task from Media Viewer should display licensetpl_attr instead of author when set (by {{Credit line}} and other templates) to Media Viewer should display Attribution instead of author when set (by {{Credit line}} and other templates). Dec 29 2014, 1:38 AM

Change 182037 merged by jenkins-bot:
Show custom Attribution line instead of Author/Credit when available

https://gerrit.wikimedia.org/r/182037