
Provide Legal comment on [[c:Commons:Deletion requests/Data talk:Kuala Lumpur Districts.map]] concerning acceptability of derivatives from OSM in the Data namespace
Closed, Resolved · Public

Description

There is an ongoing deletion request at c:Commons:Deletion requests/Data talk:Kuala Lumpur Districts.map, where it is unclear which data derived from OSM (ODbL-licensed), and to what extent, is allowed in the Data namespace (CC0-licensed).

Legal team comment would be much appreciated.

Event Timeline

Restricted Application added a subscriber: Aklapper.
debt triaged this task as High priority. Oct 16 2017, 3:30 PM
debt added a project: Discovery-ARCHIVED.
debt added subscribers: dr0ptp4kt, JKatzWMF, Slaporte.

Hello @Base - I've set up a meeting tomorrow to talk with our Legal team about this issue. I hope we can have a response soon.

Hey @Base, @debt and I spoke this afternoon. I can share more about why we only accept CC0-released data in the Data namespace at this point. It will take me a few days to prepare a public response from the legal team, but I wanted to let you know that we received the question.

@Slaporte, thank you. It would be interesting to know why only CC-0 data are allowed in the Data namespace. However, the most pressing question for editors is how to accommodate OpenStreetMap data on Commons. If they are completely excluded from the Data namespace, the whole thing becomes largely useless.

According to the Wikimedia License Policy, the Wikimedia projects may only accept material that meets the definition of Free Cultural Works. This includes material that is protected by copyright but released under a compatible free license, or work that is in the public domain because it is not protected or restricted by copyright law.

Currently, the tabular data and map data features support a CC0 dedication.

CC0 is suitable for data because:

  • CC0 does not add extraneous restrictions on factual data points that are not protected by copyright law or other rights.
  • CC0 is widely used and recommended for sharing datasets.
  • CC0 is easily compatible with other open licenses, so datasets can be combined or remixed without compliance concerns (other data-focused licenses, such as the Open Database License, raise more complex compatibility questions).
  • CC0 promotes legal predictability and certainty. More restrictive database licenses are tied to the existence of copyrights, neighboring rights, or sui generis database rights that vary from country to country.

There are many data sources that use licenses other than CC0. Popular examples include Creative Commons Attribution-ShareAlike (CC BY-SA) 3.0 material, used on Wikipedia, and Open Database License (ODbL) material, used by OpenStreetMap. As a general rule, people must use material in compliance with these license terms and release their work under a compatible license. However, there are circumstances where material from a source licensed under CC BY-SA or ODbL may be used in a dataset released under CC0.

Can you put CC-licensed material in a CC0-dedicated dataset?

Certain elements of a CC BY-SA licensed work may contain public domain facts or data, including portions or excerpts that do not meet the threshold of originality when taken alone. Public domain material can be used in a CC0 dataset.

CC BY-SA 3.0 provides:

"Nothing in this License is intended to reduce, limit, or restrict any uses free from copyright…"

Can you put ODbL-licensed material in a CC0-dedicated dataset?

In addition to copyright law, the ODbL provides a license to use data that may be protected by database rights in certain countries. These rights differ based on the country, and may provide protection when all or a "substantial portion" of a database is copied. Material that is not restricted by any rights such as copyright or database rights is also not restricted by the ODbL and may be used in a CC0 dataset.

Specifically, ODbL 1.0 provides:

"This License does not affect any rights of lawful users to Extract and Re-utilise insubstantial parts of the Contents, evaluated quantitatively or qualitatively, for any purposes whatsoever, including creating a Derivative Database…"

The OpenStreetMap Foundation provides additional guidance on what they define as an insubstantial portion.

Note: the ODbL only covers the rights in the database, and the individual contents of the database may be subject to another license (if they are not in the public domain). See this FAQ on Database and Content licenses under the ODbL.

Will the tabular and map data features support non-CC0 datasets?

Currently, the tabular and map data features require a license field that supports SPDX codes to identify the dataset's license. The feature currently supports CC0. In the future, it may support additional Free Licenses, including CC BY-SA or ODbL.
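As a rough illustration of the license field described above, the following Python sketch checks whether a dataset page declares an accepted SPDX code. It is only a sketch under stated assumptions: the function `has_accepted_license`, the set `ACCEPTED_LICENSES`, and the example page contents are hypothetical, and the only detail taken from this discussion is that the page carries a license field holding an SPDX code and that CC0 is the only accepted value for now.

```python
import json

# Assumption: CC0 is the only accepted license, as stated in this discussion.
# "CC0-1.0" is the SPDX identifier for the CC0 1.0 dedication.
ACCEPTED_LICENSES = {"CC0-1.0"}


def has_accepted_license(page_json: str) -> bool:
    """Return True if the page's "license" field holds an accepted SPDX code."""
    page = json.loads(page_json)
    return page.get("license") in ACCEPTED_LICENSES


# Hypothetical page contents; every field other than "license" is illustrative.
cc0_page = json.dumps({
    "license": "CC0-1.0",
    "description": {"en": "Hypothetical example dataset"},
})
odbl_page = json.dumps({"license": "ODbL-1.0"})

print(has_accepted_license(cc0_page))   # accepted under the current rule
print(has_accepted_license(odbl_page))  # not accepted until ODbL is supported
```

Supporting further licenses in this sketch would only mean adding their SPDX codes (for example `ODbL-1.0` or `CC-BY-SA-3.0`) to the set; the attribution and guideline questions raised in this thread are outside what such a check can cover.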

Before additional licenses can be allowed, the Wikimedia projects should (1) support attribution and other obligations contained in the license (such as when displayed in the Graph extension and other consumers of tabular and map data), and (2) provide users with appropriate community guidelines on what material and license is acceptable. This support may require additional feature development that is not currently planned, but is open to future open source contributions.

@Slaporte thanks for a thorough post. WRT attribution, graphs can already have attributions - they just need to be added by the developer of the graph template. Would a social contract (e.g. "all graphs that use external data must include licensing terms of that data") be enough, or is a technical means of enforcing this required? So far the wiki movement has mostly relied on social contracts for rule enforcement, so I am a bit reluctant to introduce a complex system to automatically add licensing terms when it is easy enough for the template/graph authors to include them at the bottom of their template, while having full control of the placement and styling of that text. If an author forgets to add them, another editor can easily modify the template to fix the issue.

debt lowered the priority of this task from High to Medium. Nov 7 2017, 1:48 AM
debt added a project: Maps-Sprint.
debt moved this task from Backlog to Stalled/Waiting on the Maps-Sprint board.

If attribution for the data is required, that can be provided as a reference in the caption. Saying where the data comes from is already required per [[WP:V]].

Non-CC0 datasets can now be licensed correctly on Commons. Does this resolve the issue?

I think so... Plan to check it out in more detail soon.

Closing per last comment