
Consider using CC-0 license for all translations through Translate in Wikimedia projects
Open, Needs Triage, Public

Description

Given:

  • The Translate extension provides machine translation, which translators may use without this being recorded in the page history.
  • The Translate extension provides translation memory, so translators reuse others' work without crediting the original translator (finding the authors of a suggested memory entry is technically very complex).
  • Most translation units are very short; copyrighting a one-word translation probably does not make sense.
  • The TUX interface does not provide edit notices that could point out a particular license (e.g. the Help namespace on MediaWiki.org is under the CC-0 license).

Could the Legal and Language teams

  • evaluate the actual copyright status of all existing translations?
  • consider automatically placing all future translations under CC-0?

Event Timeline

As the WMF-Legal project tag was added to this task, some general information to avoid wrong expectations:
Please note that public tasks in Wikimedia Phabricator are generally not a place to expect feedback from the Legal Team of the Wikimedia Foundation, due to the scope of the team and/or the nature of legal topics. See the project tag description.
Please see https://meta.wikimedia.org/wiki/Legal for when and how to contact the Legal Team. Thanks!

  • When someone translates a whole page with the extension, they press the save button more often, but otherwise it’s not really different from translating the whole page without the extension, and human translations are generally copyrightable.
  • Translations are often not that short. One or two words may not be copyrightable, but whole sentences or paragraphs probably are.
  • There are existing translations, which cannot be retroactively released under CC0. If someone edits an existing translation, the modified translation cannot be released under CC0 either.
  • There are existing translations yet to be migrated to the extension. When they are migrated, they are technically new translations, but actually not, so they cannot be released under CC0 either.
  • Since the translated pages mirror the translation template from the potentially non-CC0/PD* source page, they potentially cannot be released under CC0 either. This means that the two technical edits initiated by the same logical edit (the edit to the translation unit page and the edit to the translated page) would be licensed differently.
  • Actually, not even the translation unit edit could be released under CC0 in most cases, since as a translation, its license cannot be more permissive than that of the original – which was done on a potentially non-CC0/PD page and is thus potentially not CC0/PD.

All in all, I don’t think it’s a good idea to release translations under CC0. Instead, the support for non-CC0 licenses should be improved:

  • Display a copyright notice in the Tux edit window.
  • Attribute authors when reusing human translations. I don’t think it’s technically really complex; it’s just not displayed in the UI. In fact, if the translation appears in more than one place, they can be displayed in the UI, so I assume the information is also available when it appears in only one place.
  • (Machine translation is probably not copyrightable, so there are no attribution issues.)

*Only potentially non-CC0/PD due to the mediawiki.org Help namespace.

Brian Choo from Legal here. I'm so sorry for the delay in replying. In consultation with our legal counsel, we are hesitant to release translations under CC0. Our reasoning is in line with that of @Tacsipacsi above. We are concerned there may be confusion with the licensing of the final output, which should remain under a BY-SA license after translation.

My apologies again for the delay here, and I hope this is helpful.

@bchoo Thanks for the feedback! Could you maybe confirm (or disconfirm) my assumption?

  • (Machine translation is probably not copyrightable, so there are no attribution issues.)

Isn’t there a way to relax translation work copyright, while preserving the content license?

Otherwise, I think the best workaround would be to make Tux automatically cite the translation source when a suggestion is used, via the edit summary or, better, in an additional content slot.
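The edit-summary variant of this workaround could look roughly like the sketch below. It is only an illustration under stated assumptions and not how the Translate extension works: the `Suggestion` structure, its field names, and the summary wording are all hypothetical, and the sketch assumes the editor knows which suggestion (if any) the translator accepted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suggestion:
    """A suggestion offered to the translator (hypothetical structure)."""
    text: str
    kind: str                            # "memory" (human translation) or "machine"
    source_title: Optional[str] = None   # page the memory entry came from, if known
    service: Optional[str] = None        # machine translation service name, if any


def build_edit_summary(suggestion: Optional[Suggestion]) -> str:
    """Return an edit summary citing where a reused suggestion came from."""
    if suggestion is None:
        # The translator wrote the translation without accepting a suggestion.
        return ""
    if suggestion.kind == "memory" and suggestion.source_title:
        # Linking the source page lets readers find its authors via the page history.
        return f"Reused translation from [[{suggestion.source_title}]]"
    if suggestion.kind == "machine":
        return f"Based on machine translation ({suggestion.service or 'unknown service'})"
    return "Reused a translation suggestion (source unknown)"
```

Citing the source page rather than listing individual authors keeps the summary short while still leaving a trail to the original translators; a dedicated content slot could store the same information in a structured form.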

Hi all, bchoo asked me to take a look at this one. I think Tacsipacsi's analysis is broadly accurate, including on the specific question about machine translation. The machine translation question is subject to some uncertainty right now because of the ongoing legal developments around AI/ML. Based on the most recent copyright law office analysis, there would be two theories under which machine translation is uncopyrightable, at roughly opposite ends of the spectrum. On one hand, if the machine is doing something very formulaic, predictable, and uncreative, with no human creativity involved, then the result is not creative enough for copyright. On the opposite end, if the machine is too random and disconnected from the human input, it appears that this also means there is no human author and thus no copyright. If it falls in between, where the machine functions as a complex tool to channel human creativity, then the result might be copyrightable.