
MediaWiki and GDPR
Open, Needs Triage · Public

Description

On May 25th, 2018, the new European data protection regulation (GDPR) takes effect. It sets out several requirements for software that collects any kind of personal data. MediaWiki collects this kind of data in user tables, edit history and action logs.

In this workshop, we will go through the requirements of GDPR and see if and how MediaWiki complies with them. The requirements are:

  • Right to access and be informed
  • Right to rectification
  • Right to be forgotten
  • Right to data portability
  • Right to restrict / object to processing
  • Breach notification

The desired outcome is documentation of the workshop.

Event Timeline

Mglaser created this task. May 17 2018, 7:56 PM
Restricted Application added a subscriber: Aklapper. May 17 2018, 7:56 PM
Joris added a subscriber: Joris. May 18 2018, 7:52 AM
RHeigl added a subscriber: RHeigl. May 18 2018, 8:31 AM
Tgr added a subscriber: Tgr. May 18 2018, 9:09 AM

Hello!
This session is in a room with video recording capabilities. Would you like this session recorded for YouTube / Commons?
Pinging @bcampbell as he will be helping with this.

TK-999 added a subscriber: TK-999. May 18 2018, 11:54 AM

I would very much like to have the session recorded and/or broadcast live so people who can't attend in person can view the presentation.

You got it.

bcampbell updated the task description. May 18 2018, 7:30 PM
  • Article 6: Is IP address attribution for anons "necessary"?
  • Article 15: Right of access by the data subject. Do we need to e.g. give people access to their own checkuser data, and is it a security problem to make that accessible without a human in the loop?
  • Article 17: Right to erasure. How do we respond to requests for erasure of a username or anon IP? Do we need to provide a way to fully erase content via the web UI?
  • Article 20: Data portability. A personal view of "right to fork". Any additional tools needed? Preferences, watchlist export/import? Contributions mode of Special:Export? Uploaded files?
  • Article 34: Communication of a personal data breach to the data subject. Currently there is no mass mailing extension. For WMF board elections we write a custom maintenance script each time.
Mglaser updated the task description. May 19 2018, 10:54 AM
Mglaser updated the task description. May 19 2018, 2:33 PM
freephile updated the task description. May 19 2018, 3:14 PM
Masti added a subscriber: Masti. May 20 2018, 4:00 AM
Stryn added a subscriber: Stryn. May 23 2018, 3:00 PM

Thank you for hosting your session at WMHack! If you have any notes or slides please add them to the task and then make sure to close the task when there are no more actions. :)

Tgr added a comment. May 24 2018, 10:18 AM
  • Article 15: Right of access by the data subject. Do we need to e.g. give people access to their own checkuser data, and is it a security problem to make that accessible without a human in the loop?

At the very least it would seriously reduce its effectiveness if vandals could just check exactly what the tool reveals about them.

  • Article 17: Right to erasure. How do we respond to requests for erasure of a username or anon IP? Do we need to provide a way to fully erase content via the web UI?

WRT usernames, the right of erasure only exists when the processing is not necessary for some legitimate interest of the data controller. It could be argued that the ability to preserve provenance information (and thus the ability to identify large-scale manipulation of content) is a legitimate interest - users can ask to be renamed to something nondescript but the ability to inspect the full set of their contributions is still necessary.
WRT content, it could be argued that most open licenses are irrevocable contracts between the data subject and the data controller, and contracts are a lawful basis for data processing according to the GDPR, even in the absence of consent. The irrevocability of open licenses is probably on shaky legal grounds though.

  • Article 20: Data portability. A personal view of "right to fork". Any additional tools needed? Preferences, watchlist export/import? Contributions mode of Special:Export? Uploaded files?

"the data subject shall have the right to have the personal data transmitted directly from one controller to another, where technically feasible" - that seems to require some kind of wiki-to-wiki user data transfer feature. Arguably content (with the exception of the user namespace), the watchlist and most preferences are not personal data though; there is very little in MediaWiki that this would apply to. There should probably be some sort of framework, such as a personal data import/export hook, for extensions which do store personal data (e.g. SocialProfile or third-party authentication extensions).
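To make the hook idea concrete, here is a minimal sketch of what such a framework could look like. It is written in Python purely for illustration and is not MediaWiki code: `PersonalDataRegistry`, `register`, `export_user` and `import_user` are hypothetical names, and the JSON bundle layout is an assumption, not an existing format. Each extension registers one exporter and one importer for the personal data it owns; the core only assembles and dispatches the bundle.

```python
import json
from typing import Callable, Dict, List

# Hypothetical callback types: an exporter returns the personal data an
# extension holds for a user; an importer stores such data on the target wiki.
Exporter = Callable[[str], dict]
Importer = Callable[[str, dict], None]


class PersonalDataRegistry:
    """Illustrative stand-in for a personal data import/export hook."""

    def __init__(self) -> None:
        self._exporters: Dict[str, Exporter] = {}
        self._importers: Dict[str, Importer] = {}

    def register(self, section: str, exporter: Exporter, importer: Importer) -> None:
        # One section per extension, so ownership of the data stays unambiguous.
        if section in self._exporters:
            raise ValueError(f"section already registered: {section}")
        self._exporters[section] = exporter
        self._importers[section] = importer

    def export_user(self, username: str) -> str:
        # Ask every registered extension for its data; omit empty sections.
        sections = {}
        for name in sorted(self._exporters):
            data = self._exporters[name](username)
            if data:
                sections[name] = data
        return json.dumps({"user": username, "sections": sections}, sort_keys=True)

    def import_user(self, payload: str) -> List[str]:
        # Dispatch each section to its importer and report the sections that
        # no installed extension on this wiki knows how to handle.
        bundle = json.loads(payload)
        skipped = []
        for name, data in bundle["sections"].items():
            importer = self._importers.get(name)
            if importer is None:
                skipped.append(name)
            else:
                importer(bundle["user"], data)
        return skipped
```

Returning the skipped sections rather than failing reflects the "where technically feasible" wording: a target wiki without the matching extension can still accept the rest of the bundle.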

  • Article 34: Communication of a personal data breach to the data subject. Currently there is no mass mailing extension. For WMF board elections we write a custom maintenance script each time.

There is sendBulkEmails.php. It's currently in WikimediaMaintenance though.

Lofhi added a subscriber: Lofhi. May 28 2018, 6:04 AM
Vvjjkkii renamed this task from MediaWiki and GDPR to etcaaaaaaa. Jul 1 2018, 1:09 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii removed Mglaser as the assignee of this task.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot assigned this task to Mglaser.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot renamed this task from etcaaaaaaa to MediaWiki and GDPR.
CommunityTechBot added a subscriber: Aklapper.
corey added a subscriber: corey. Aug 9 2018, 2:27 AM

@Mglaser: Is there anything left in this task? Or should this task be resolved, with potential work continuing at https://www.mediawiki.org/wiki/GDPR_(General_Data_Protection_Regulation)_and_MediaWiki_software ?

@Mglaser: Is there anything left in this task?

I don't think comment T194901#4228182 by @Tgr has been addressed, at least not as far as data portability is concerned.