Evaluate Potential Amazon Data Donation
Closed, Resolved, Public

Description

Amazon wants to begin creating training data for us starting Oct 1.

They need the following:

  • The labels + schema for the draftquality training data
  • The labels + schema for the article quality training data
  • The labels + schema for the edit quality training data

Event Timeline

ACraze added a subscriber: calbon.

I've got everything noted in an etherpad for now: https://etherpad.wikimedia.org/p/labels

@calbon -- is there anything else we should include?

Thanks @ACraze. I am talking to legal about it. They'll have the final say.

calbon renamed this task from "Document labels & schema for model repos" to "Evaluate Potential Amazon Data Donation". Oct 5 2020, 5:31 PM

Based on feedback from Research and Legal, our requirements for this are:

  • The data contains no PII
  • The data is publicly available (i.e. the public can download it)
  • The data uses a free license (MIT, etc.)
  • The code that produced the data is publicly available
  • The code that produced the data uses a free license (MIT, etc.)