Access to Wikidata query logs that were used for recent research
Closed, Resolved · Public

Description

To complete our analysis for T143762, we used data collected from the Wikidata query logs and posted the write-up of the report here.

We recently received a request from [user] asking whether we can make the collected data/logs publicly available. We generally do not make data used for reports available unless that data is aggregated and all possible personally identifiable information (PII) has been removed.

We reached out to Legal (and cc'd @leila) to find out whether we can release this information and received the following response:

We, the WMF, have signed an MOU and NDA with [user], and his NDA allows him to work with raw webrequest logs related to WDQS. It would be great if the data from the report is shared with him; this way, we can increase collaboration across these research efforts and also make sure we don't end up working on very similar problems in parallel.

We will share the aggregated data files with [user] via a shared server directory (probably on stat1002), since we cannot guarantee anonymization of the data.

Event Timeline

debt created this task. Oct 11 2016, 7:00 PM
Restricted Application added a subscriber: Aklapper. Oct 11 2016, 7:00 PM
debt triaged this task as Normal priority. Oct 11 2016, 7:00 PM

Just sent an email to the requester with some instructions. :)

debt closed this task as Resolved. Oct 20 2016, 7:56 PM