
Metrics for SDoC: future work of interest (templates and licensing)
Open, Normal, Public

Description

Future work of interest:

  • Templates tied to licensing
    • Use case: migration from the current license templating system to a different one (e.g. to licenses stored in structured data)?
    • How many times licenses are being used
    • How many media files have multiple licenses
      • Countries have different requirements (e.g. ‘author has been dead for X years and the work is in the public domain’, etc.)
    • The PD-Art tag applies when you’ve taken a picture of a work in the public domain. The current UploadWizard implementation defaults to a PD-Art template that requires at least one parameter, but we don’t allow the user to enter that parameter => will this be problematic in the migration (usage might be wrong when tagged PD-Art)?
      • Example to understand this use case: Picasso’s work
    • FYI: Multi-license copyright tags documentation and also https://commons.wikimedia.org/wiki/Commons:Multi-licensing
    • We want templates to be more structured and more accurate
  • Licensing will be key and understanding licensing on Commons will be super important
  • Snapshot-in-time work: distribution of licenses, co-occurrence of licenses in files
  • Interest in reduction of number of categories & number of categories created
    • Number of categories, number of hidden categories (how many pages they are applied to - might be future work), number of categories that go away (deleted)
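
As a rough illustration of the snapshot-in-time metrics listed above (license distribution, files with multiple licenses, co-occurrence of licenses), here is a minimal Python sketch. It assumes the license templates per file have already been extracted into a mapping; the sample data, the `files` name and the function names are hypothetical, and real counts would have to come from a Commons dump or database query.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample (assumption): file title -> license templates on the file page.
files = {
    "File:A.jpg": ["Cc-by-sa-4.0"],
    "File:B.jpg": ["Cc-by-sa-4.0", "GFDL"],
    "File:C.jpg": ["PD-Art", "PD-old-100"],
    "File:D.jpg": ["Cc-by-sa-4.0", "GFDL"],
}


def license_distribution(files):
    """Count how many files carry each license template."""
    counts = Counter()
    for licenses in files.values():
        # Use a set so a template repeated on one page is counted once per file.
        counts.update(set(licenses))
    return counts


def multi_licensed(files):
    """Return the titles of files tagged with more than one distinct license."""
    return [title for title, licenses in files.items() if len(set(licenses)) > 1]


def cooccurrence(files):
    """Count how often each unordered pair of licenses appears on the same file."""
    pairs = Counter()
    for licenses in files.values():
        # Sorting makes each pair order-independent: (A, B) and (B, A) are one key.
        pairs.update(combinations(sorted(set(licenses)), 2))
    return pairs


if __name__ == "__main__":
    print(license_distribution(files).most_common())
    print(multi_licensed(files))
    print(cooccurrence(files).most_common())
```

Counting per file rather than per template occurrence keeps the distribution comparable across snapshots; whether PD tags such as PD-Art should be counted alongside CC licenses is an open question this sketch does not settle.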

Event Timeline

debt created this task. Oct 3 2017, 11:46 PM
debt renamed this task from Metrics for SDoC: future work of interest to Metrics for SDoC: future work of interest (templates and licensing). Oct 4 2017, 9:56 PM
debt updated the task description. Oct 11 2017, 9:45 PM
Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. Dec 18 2017, 3:05 PM
Restricted Application added a project: Product-Analytics. Apr 19 2018, 12:20 AM

@Ramsey-WMF we were just triaging this and were wondering what your timing was given that this was written as part of the first-round ask and you just finished a community discussion about this topic.

For reference, I did some work counting the number of Commons files with different CC licenses:

Hi!

Yes, we would love to get this work done as soon as resources are reasonably available. It's not urgent, but we'd love to have it done by end of quarter if possible.

Also, the work Neil mentioned does indeed cover a lot of the CC cases (thanks, Neil!), but we'll still need to dig deep into Public Domain template usage, the situation with files having multiple templates, and how the categorization system maps to license usage.

Neil_P._Quinn_WMF raised the priority of this task from Normal to High. Sep 27 2018, 8:24 PM
Neil_P._Quinn_WMF lowered the priority of this task from High to Normal. Sep 27 2018, 8:34 PM