
Reading Data Map
Closed, Resolved · Public · 1 Story Point

Description

  • Map out all the datasets used by the Reading team including datasets "in transit" (e.g. data in Kafka)
  • Draft data access guidelines

Hi!

Staff from various teams have asked for a clear set of rules for handling personally identifiable information, so Legal & Security want to harmonize access policies and practices among WMF staff as much as possible. We have a data retention policy and an access policy for community members, and we ask people from outside WMF to sign an NDA, but our staff access practices vary across teams, and we don't have a comprehensive picture of where data sits or how it flows through WMF.

First, the data map: We'd like to know what personal data WMF collects and uses. So we're asking teams to fill out a spreadsheet about the data sets that they use and have control over. To give a sense of the level of granularity, here are examples that Editing and Discovery are currently filling out. The idea is to take the info from each team to create a single data map that all staff can easily refer to.

Second, a staff access policy: Last year, Discovery drafted a Data Access Guide. We'd like to use that as a starting point for other teams to tailor to their own needs, since some teams handle a lot more personal data than others. But having too many people work off one Google Doc is messy. So please make your comments, edits, and suggestions to the draft provided here, and we'll consolidate them. Eventually, each team can decide whether it's better to adopt a general guide, or to draft one more specific to their needs.

Finally, as to timing, the idea is to have all teams take a first pass at filling out the spreadsheets by the end of May.

Links & Documentation:

Event Timeline

Restricted Application added a subscriber: Aklapper. · Apr 20 2016, 11:34 PM
kevinator renamed this task from Reading to Reading Data Map. · Apr 20 2016, 11:43 PM
atgo updated the task description. · Apr 25 2016, 9:04 PM
SPong updated the task description. · Apr 29 2016, 6:30 PM
SPong added a subscriber: SPong.
Tbayer added a subscriber: Tbayer.

@SPong: Can you confirm that https://docs.google.com/a/wikimedia.org/document/d/1LeBWhzXVK5TdFHnIJ5tq6HUM5XixAJKaq6FBE36OoaM/edit?usp=sharing (the link given in the task description) is still the guidelines draft we should be working on? I left some edits and comments there earlier, but I seem to recall someone suggesting a switch to a different location shortly afterwards.

(Update: Steven confirmed that notwithstanding those other suggestions, the guidelines draft link is still current for the purposes of this task.)

Tbayer updated the task description. · Oct 18 2016, 8:29 PM
Tbayer added subscribers: JKatzWMF, Nuria.

The dataset maps were completed some months ago. Regarding the access guidelines, there was no follow-up, and it looks like the proposed changes got lost in a process mix-up. I'm inquiring about this separately.

Tbayer changed the task status from Open to Stalled. · Dec 8 2016, 2:54 AM
Tbayer claimed this task.
Tbayer updated the task description.

(Input on data access guidelines has been provided separately since.)

Tbayer closed this task as Resolved. · Dec 8 2016, 5:02 AM