
Register license on GitHub mirror (use LICENSE and not only COPYING file) so source code can be queried with BigQuery
Open, Needs Triage · Public

Description

MediaWiki appears to have an open source license, but the license is not explicitly registered on the GitHub mirror. See https://help.github.com/en/articles/licensing-a-repository

If the license is set explicitly, the repository will show up in the Google BigQuery data collection of GitHub data.
Then, interested researchers can query the history of MediaWiki development using BigQuery.
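As a rough sketch of the kind of check this would enable: the snippet below builds a query that looks up which license is recorded for a repository. It assumes the public `bigquery-public-data.github_repos.licenses` table and its `repo_name` and `license` columns; the helper name and the `@repo_name` parameter are made up for illustration, and the query is only constructed here, not executed.

```python
# Assumption: the public GitHub dataset exposes a `licenses` table with
# `repo_name` and `license` columns. Verify against the current schema.
LICENSES_TABLE = "bigquery-public-data.github_repos.licenses"


def build_license_lookup_query(table: str = LICENSES_TABLE) -> str:
    """Return SQL that looks up the recorded license of one repository.

    The repository name is left as the named query parameter @repo_name
    (a hypothetical name) so it can be bound safely when the query is run.
    """
    return (
        "SELECT repo_name, license\n"
        f"FROM `{table}`\n"
        "WHERE repo_name = @repo_name"
    )


query = build_license_lookup_query()
print(query)
```

If the assumption in the description holds, binding `@repo_name` to `wikimedia/mediawiki` would return no rows until the license is detected.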

Event Timeline

Restricted Application added a subscriber: Aklapper. · Oct 16 2019, 9:05 AM
revi added a subscriber: revi. · Oct 16 2019, 11:23 AM

The license is provided in a COPYING file, not in the LICENSE file that GitHub expects.

Aklapper renamed this task from "Register open source license on GitHub mirror so that source code can be queried with BigQuery" to "Register license on GitHub mirror (use LICENSE and not only COPYING file) so source code can be queried with BigQuery". · Oct 17 2019, 4:23 PM
Reedy updated the task description. · Oct 17 2019, 4:25 PM
Reedy added a subscriber: Reedy. · Oct 17 2019, 4:27 PM

This is kind of odd.

https://github.com/wikimedia/mediawiki has a "View license" link that points to https://github.com/wikimedia/mediawiki/blob/master/COPYING. Why can't Google just deal with both cases?

Krinkle added a subscriber: Krinkle. · Edited · Oct 17 2019, 4:35 PM

GitHub uses this library to parse the repo: https://github.com/licensee/licensee. It supports both COPYING and LICENSE files equally.

The problem isn't with the file name, but with the contents. The label "View license" is what they show when the license is unknown. If the license were recognised, it would say "MIT", "GPL" or "Apache-2" etc. there instead. See https://github.com/wikimedia/oojs or https://github.com/cssjanus/php-cssjanus for examples.

The unknown / "View license" fallback happens when the library is unable to match the file contents to a known open-source license. This is most likely because of all the non-standard formatting and preamble we added to the file.
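To illustrate why extra preamble can push a file below the match threshold, here is a minimal sketch of content-based matching using a word-overlap (Sørensen-Dice) score. The scoring method, the threshold, and the sample texts are assumptions chosen for illustration; they are not taken from licensee's source and may differ from what GitHub actually runs.

```python
import re

# Hypothetical threshold for this sketch; the real library's cut-off may differ.
THRESHOLD = 0.98


def word_set(text: str) -> set[str]:
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def dice_similarity(candidate: str, template: str) -> float:
    """Sørensen-Dice coefficient over the two texts' word sets (0.0 to 1.0)."""
    a, b = word_set(candidate), word_set(template)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def is_recognised(candidate: str, template: str, threshold: float = THRESHOLD) -> bool:
    """Treat the candidate as a known license only if it scores above the threshold."""
    return dice_similarity(candidate, template) >= threshold


# Toy stand-ins: a short license-like template and a made-up project preamble.
TEMPLATE = (
    "Permission is hereby granted, free of charge, to any person "
    "obtaining a copy of this software"
)
PREAMBLE = (
    "== Project licensing notes ==\n"
    "The following applies to most files; see individual headers for exceptions.\n\n"
)

clean_score = dice_similarity(TEMPLATE, TEMPLATE)
padded_score = dice_similarity(PREAMBLE + TEMPLATE, TEMPLATE)

print(f"clean file:    {clean_score:.2f} recognised={is_recognised(TEMPLATE, TEMPLATE)}")
print(f"with preamble: {padded_score:.2f} recognised={is_recognised(PREAMBLE + TEMPLATE, TEMPLATE)}")
```

In this toy case the unmodified text matches exactly, while the same text with a preamble scores well below the threshold and would fall back to "unknown". Under that assumption, moving the extra notes out of the license file would be one way to restore the match.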