User Details
- User Since
- Aug 10 2018, 4:17 PM (242 w, 5 h)
- Availability
- Available
- LDAP User
- Unknown
- MediaWiki User
- Samuel (WMF)
Yesterday
My recommendation from a data privacy perspective is to show aggregated data only and keep the PII in the back end for 90 days, during which time participants can update their answers, after which time we anonymize the PII data and keep only the aggregated data.
For aggregated data, I recommend reporting a [sub]category only when it contains more than x persons; below that, we could either not report it at all or fold it into a compilation, for example "other <x".
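The thresholding described above could be sketched in Python roughly as follows. This is only an illustration, not the tool's implementation: the function name `aggregate_with_threshold`, the default threshold of 10, the inclusive boundary, and the `"other"` label are all assumptions made for the example.

```python
from collections import Counter

OTHER_LABEL = "other"  # hypothetical label for the combined small categories


def aggregate_with_threshold(answers, min_count=10):
    """Count answers per category, never reporting small categories alone.

    Categories with fewer than `min_count` respondents are folded into a
    single combined bucket. That bucket is itself reported only if its
    total reaches `min_count`; otherwise it is withheld entirely.
    """
    counts = Counter(answers)
    reported = {}
    small_total = 0
    for category, n in counts.items():
        if n >= min_count:
            reported[category] = n
        else:
            small_total += n
    if small_total >= min_count:
        reported[OTHER_LABEL] = small_total
    return reported


sample = ["A"] * 12 + ["B"] * 4 + ["C"] * 7 + ["D"] * 3
print(aggregate_with_threshold(sample))
```

With the sample above, only category A is large enough to be shown on its own, while B, C and D are reported together. If the combined bucket were also below the threshold, nothing would be reported for those respondents.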
Tue, Mar 28
Usually, with PII data on persons we set a minimum for calculating averages so the data cannot be disambiguated and persons identified.
Essentially, such data should not be disaggregated at small numbers. There may be a standard of 10, though 20 is more usual in order to have enough power to detect differences. So, for example, if you have <10 persons in a [sub]category, then you don't report out on that [sub]category.
Security and GDI teams may be able to provide additional insights and feedback.
Hey @Iflorez, how will that work if event organizers are able to view data at an individual level anyway (F35861130)? Will the detailed view of the Participants tab be unavailable once data is aggregated, after the 90-day window?
Tue, Mar 7
Thanks for the ping @sbassett. We could borrow some ideas from the generic message currently displayed when logged-in users visit external links, and from a privacy notice (T65598#6914486) which was provided by WMF-Legal. Privacy best practices encourage both brevity and clarity in notices. So, a more privacy-conscious message could be something along these lines:
Mon, Mar 6
Hey everyone, I agree that having the security team review every single gadget and user script would not be scalable or even realistic.
Tagging Privacy Engineering for an opinion/risk rating about the following. I'm not certain there's precedent for this on Wikimedia production or that wmcs would completely satisfy any privacy concerns for proposed, embedded content like this.
Fri, Mar 3
Hello @Skizzerz, is there a publicly accessible repository for the source of https://owidm.wmcloud.org?
Feb 10 2023
Nov 28 2022
@jrbs was added as a maintainer of the NDA bot (see https://toolsadmin.wikimedia.org/tools/id/tsbot). Also, the code was moved to Wikimedia's GitLab instance: https://gitlab.wikimedia.org/repos/security/tsbot-nda
The repository was imported to Wikimedia's GitLab instance: https://gitlab.wikimedia.org/repos/security/tsbot-nda
Nov 24 2022
I examined the proposed API through common privacy risk categories:
Nov 8 2022
Hey @ldelench_wmf, I have no objections to closing this one, thanks.
Nov 1 2022
Hello, some quick updates.
Oct 25 2022
@ifried, the Security-Team hasn't gotten the chance to discuss the mitigating options surfaced in the Google Docs conversation. Meanwhile, I would like to keep the ticket open and update it once we've made some progress.
Oct 20 2022
Hey @ifried, the Privacy Engineering review is complete. Could you take a look at our conclusions and address any potential misunderstanding there? https://docs.google.com/document/d/1lFeq7jtUCmXdwoKwIfqgO-74ccTU0kBtX7zkJkeMByw/edit#?
Oct 14 2022
Hello @ifried, Privacy Engineering will start looking into this as part of our current sprint. On a side note, I am aware that the previous features have been looked at by WMF-Legal. For this additional feature, are you having any conversation with Legal in parallel?
Oct 12 2022
Aug 22 2022
Jul 12 2022
@Aklapper, sure thing. It used to be a private repo of which I was the sole maintainer when I was still in Trust-and-Safety. Moving it to GitLab makes sense, but I am not sure which project would be suited for it. I don't currently see anything related to Trust & Safety. Any suggestions?
Merged and deployed, lemme know if that fixed it. Sorry, I didn't get the chance to test it live as I no longer have a TS account. Let me know if you have a tool account and I'll add you as a deployer. Once the PR is merged, deploying is usually just ssh'ing in and running a shell script that does all the git and Kubernetes steps.
Hey @jrbs, I think the issue might linger somewhere between these lines: https://github.com/samuelguebo/tsbot-nda/blob/6c05f844039d9713f3328d1e59c34aa29c80fa3b/routes/nda.py#L233-L284
Mar 8 2022
Thank you both.
Mar 7 2022
Hey @sbassett and @JFishback_WMF , do you have any strong objections to making this task public? Its content may inform the ongoing discussion around third-party resources in T296847.
Mar 4 2022
My current plan for rollout is as follows:
- Informal feedback round (in progress)
- Update policy based on feedback.
- A new, more formal feedback round from WMF staff (e.g. "please respond by X date").
- Update policy based on feedback.
- A formal round of feedback from the community.
- We'll update the interface to provide a notice on pages where JS can be added that links to the policy:
<div class="mw-message-box mw-message-box-notice">All code written here is expected to <a href="#">adhere to the gadget policy</a>.</div>
Feb 16 2022
Following T262493#7584789 I've begun drafting a policy and collating feedback on the talk page:
https://www.mediawiki.org/wiki/User:Jdlrobson/Extension:Gadget/Policy
Perhaps we could combine efforts here?
Feb 15 2022
As noted above, WordPress-powered websites such as Diff are used by the Foundation for public-facing initiatives. For instance, blog posts published on Diff feature the names of their authors and, in most cases, their titles within the organization. Although the REST API allows people to retrieve the list of user accounts of the website, it generates a list of already-public users, in JSON format that still needs to be parsed and processed. Therefore, the API is not disclosing any information that was not already public, nor is it increasing the visibility of information that was already public by making it easier to retrieve.
Dec 8 2021
Dec 6 2021
Dec 2 2021
Dec 1 2021
Oct 29 2021
Hey @Addshore, on behalf of Security-Team, I reviewed the privacy risks inherent to the proposed usage reporting feature for mwcli. I’ll share my conclusions below.
Oct 26 2021
@sguebo_WMF Is this data visible on the wikis?
For example, after a user has gone through a whole setup and played around with the environment a bit, we might get something like this (at the highest detail level planned) the next time mwcli attempts to send data back.
- 4x docker mediawiki create
- 2x docker mysql create
- 2x docker mediawiki install --dbtype=mysql --dbname=default
- 2x docker mediawiki install --dbtype=sqlite1 --dbname=CUSTOM
- 22x docker mediawiki exec
- 4x codesearch search --output=ack
- 1x codesearch search --output=table
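A report of this shape could be produced along the following lines. This is a minimal Python sketch for discussion, not mwcli's actual code: the `KNOWN_COMMANDS` and `SAFE_VALUES` allowlists, the function names, and the choice to drop positional arguments and unknown flags are all assumptions. The only behaviour taken from the list above is that non-default values appear as `CUSTOM` and that commands are reported as counts.

```python
from collections import Counter

# Hypothetical allowlist of command paths that may be reported.
KNOWN_COMMANDS = {
    "docker mediawiki create",
    "docker mediawiki install",
    "docker mediawiki exec",
    "docker mysql create",
    "codesearch search",
}

# Hypothetical allowlist of flag values that are safe to report verbatim.
SAFE_VALUES = {
    "--dbtype": {"mysql", "sqlite"},
    "--dbname": {"default"},
    "--output": {"ack", "table"},
}


def redact(command):
    """Reduce a raw command line to its command path plus allowlisted flags."""
    tokens = command.split()
    # Find the longest leading run of tokens that is a known command path.
    for end in range(len(tokens), 0, -1):
        path = " ".join(tokens[:end])
        if path in KNOWN_COMMANDS:
            break
    else:
        return "UNKNOWN"
    kept = [path]
    for token in tokens[end:]:
        # Positional arguments and unknown flags are dropped entirely.
        if token.startswith("--") and "=" in token:
            flag, value = token.split("=", 1)
            if flag in SAFE_VALUES:
                if value not in SAFE_VALUES[flag]:
                    value = "CUSTOM"
                kept.append(f"{flag}={value}")
    return " ".join(kept)


def summarize(commands):
    """Return a count of each redacted command."""
    return Counter(redact(command) for command in commands)
```

The point of the allowlist approach is that user-supplied strings (database names, search terms, file paths) never reach the report unless they were explicitly listed as safe.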
Oct 25 2021
Hey @Addshore -- thanks for bringing that to the Privacy Engineering team's attention. My understanding is that the metric tool would work as below:
Oct 21 2021
localuser:
  source:
    - localuser
    - globaluser
  view: >
    select lu_wiki, lu_name, lu_attached_timestamp, lu_attached_method, lu_local_id, lu_global_id
  where: lu_global_id = gu_id AND gu_hidden=''
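To illustrate the effect of the `gu_hidden=''` condition in the definition above, the following Python sketch recreates the join in an in-memory SQLite database. The simplified table schemas, the view name `localuser_public`, and the sample rows (including the `'suppressed'` value) are assumptions for the example; the real replicas run on a different database engine and the views are generated by tooling not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified, assumed schemas: only the columns used by the view.
    CREATE TABLE globaluser (
        gu_id INTEGER PRIMARY KEY,
        gu_name TEXT,
        gu_hidden TEXT NOT NULL DEFAULT ''
    );
    CREATE TABLE localuser (
        lu_wiki TEXT,
        lu_name TEXT,
        lu_attached_timestamp TEXT,
        lu_attached_method TEXT,
        lu_local_id INTEGER,
        lu_global_id INTEGER
    );

    -- Made-up sample data: one visible account, one hidden account.
    INSERT INTO globaluser VALUES (1, 'Alice', ''), (2, 'Bob', 'suppressed');
    INSERT INTO localuser VALUES
        ('enwiki', 'Alice', '20210101000000', 'new', 10, 1),
        ('enwiki', 'Bob', '20210102000000', 'new', 11, 2);

    -- Same column list and filter as the definition above.
    CREATE VIEW localuser_public AS
        SELECT lu_wiki, lu_name, lu_attached_timestamp,
               lu_attached_method, lu_local_id, lu_global_id
        FROM localuser, globaluser
        WHERE lu_global_id = gu_id AND gu_hidden = '';
    """
)

visible_names = [row[0] for row in conn.execute("SELECT lu_name FROM localuser_public")]
print(visible_names)
```

Only the row attached to the non-hidden global account is returned, so local accounts tied to hidden global users do not appear in the view.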
Oct 20 2021
Noted, thanks for the additional context @GeneralNotability. The description was updated accordingly.
Oct 19 2021
Hey @GeneralNotability and @Urbanecm. As mentioned earlier, your question will be brought to WMF-Legal's attention. Feel free to rename it or adjust the description if I missed some aspects.
Oct 18 2021
Oct 8 2021
Something more robust would be needed if this is to become any sort of Wikimedia- or WMF-wide standard. A longer-term fix would be to integrate such indications into the software, but development on Gadgets 2.0 has been stalled for a LONG time.
That being said, I don't think "PRIVACY" alone is very useful there; from a UX perspective that seems confusing. Did you notice there is already a lengthy hover text on the "E"xternal indicator? It looks like this:
Also keep in mind that there are actually much larger risks than "privacy" when loading third-party scripts, such as account hijacking, surreptitious actions, etc.
Oct 1 2021
Sep 20 2021
Thanks for handling that, @thcipriani
Sep 17 2021
On behalf of Security-Team, I reviewed the privacy risks that exposing the Translation extension’s tables in replicas may bring about. I’ll share my conclusions below.
Sep 16 2021
Hey @Nikerabbit and thanks for providing some background on these tables.
Sep 15 2021
Sep 13 2021
Sep 10 2021
Sep 9 2021
Thank you for your answer. I have now wrapped up the privacy review and would like to share the conclusion. The analysis focused on the sys database of the db2083 server, which, as of writing, contains 88 tables encompassing performance statistics.
Thanks for pasting the output of sys’ tables, @Marostegui. That’s really helpful. As I am wrapping up my analysis, I’d like to ask a quick question just to make sure that I understand things correctly. The host_summary table seems to contain IP addresses in the 10.192.** range. Judging by the range, my understanding is that these IPs come from Wikimedia load balancers and not from end users. Is my understanding accurate?
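One quick sanity check for addresses like these is whether they fall inside private (RFC 1918) address space, which end-user traffic from the public internet would not use. The Python sketch below assumes that `10.192.**` means the block 10.192.0.0/16, and the sample addresses are made up. It can only show that an address is internal, not which service it belongs to.

```python
import ipaddress

# Assumption: "10.192.**" is read as the 10.192.0.0/16 block.
ASSUMED_RANGE = ipaddress.ip_network("10.192.0.0/16")


def classify(address):
    """Report whether an address is in the assumed range and whether it is private."""
    ip = ipaddress.ip_address(address)
    return {
        "in_assumed_range": ip in ASSUMED_RANGE,
        "is_private": ip.is_private,
    }


for sample in ("10.192.16.5", "8.8.8.8"):
    print(sample, classify(sample))
```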
Sep 3 2021
Interesting - https://github.com/miraheze/RemovePII does indeed look like it could maybe get us part of the way there.
Aug 30 2021
Aug 27 2021
Hey @mforns, that confusion is totally understandable, so no worries at all. The good thing is that I have now updated my title in Phabricator.