Create API to allow retrieving detailed information about Thanks log items for my own user account
Open, Needs Triage, Public

Description

As a user, I'd like to be able to call an API to get the "Thanks" that I have received, as well as the specific revisions/page titles I was thanked for.

That would require a new APIListModule in Thanks:

  • query the log_search table for ls_field => 'thankid', and use the ls_value column to get the revisions.
  • restrict access to current logged-in user only (I can only query results for myself) for privacy purposes
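The two points above can be sketched in Python against an in-memory SQLite stand-in. This is only an illustration of the proposed lookup, not the Thanks extension's implementation (which would be a PHP API module). The simplified table layouts, the assumption that log_title holds the recipient's user name, the 'thanks' log type, and the helper names make_demo_db and thanks_received are all hypothetical. The ls_value strings are placeholders; the real value format is not assumed here.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory stand-in for the two tables (simplified, assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE logging (log_id INTEGER PRIMARY KEY, log_type TEXT, log_title TEXT);
        CREATE TABLE log_search (ls_field TEXT, ls_value TEXT, ls_log_id INTEGER);
        INSERT INTO logging VALUES
            (1, 'thanks', 'Alice'),
            (2, 'thanks', 'Bob'),
            (3, 'thanks', 'Alice');
        INSERT INTO log_search VALUES
            ('thankid', 'example-a', 1),
            ('thankid', 'example-b', 2),
            ('thankid', 'example-c', 3);
        """
    )
    return conn


def thanks_received(conn, current_user):
    """Return (log_id, ls_value) pairs for thanks received by the caller.

    There is deliberately no "target user" parameter: the query is always
    scoped to the logged-in caller, and anonymous callers are rejected.
    """
    if current_user is None:
        raise PermissionError("must be logged in to list received thanks")
    return conn.execute(
        """
        SELECT l.log_id, s.ls_value
        FROM logging AS l
        JOIN log_search AS s ON s.ls_log_id = l.log_id
        WHERE l.log_type = 'thanks'
          AND l.log_title = ?          -- assumed: recipient's user name
          AND s.ls_field = 'thankid'
        ORDER BY l.log_id
        """,
        (current_user,),
    ).fetchall()
```

Calling thanks_received(make_demo_db(), "Alice") returns only Alice's rows. Because the caller's identity is the sole filter, there is no way to request another user's thanks through this function.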

Event Timeline

As noted in the parent task (T322166#8364935), you can get the revision or log ID via the Echo notifications query API: https://www.mediawiki.org/w/api.php?action=query&format=json&meta=notifications&formatversion=2&notsections=message&notnotifiertypes=web However, it seems to lack a way to specify, e.g., type: "edit-thank" in the parameters, to get just the Thanks-related notifications.
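Until such a parameter exists, a client could filter the response itself. The sketch below assumes the decoded JSON exposes notifications as a list under query.notifications.list, each with a "type" field. That response shape and the function name thanks_only are assumptions for illustration; only the "edit-thank" type value comes from the comment above.

```python
def thanks_only(response, thank_types=("edit-thank",)):
    """Keep only notifications whose type is in thank_types.

    `response` is the decoded JSON body of the notifications query.
    The nesting used here is an assumed shape, not a documented contract.
    """
    notifications = (
        response.get("query", {}).get("notifications", {}).get("list", [])
    )
    return [n for n in notifications if n.get("type") in thank_types]
```

Filtering on the client still transfers every notification in the requested section, so a server-side type parameter would remain the more efficient option.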

@sbassett, we'd like Security's conceptual input and sign-off on this task. We're implementing a Thanks Count dialog, and we're considering querying the log_search table for ls_field => 'thankid' and using the ls_value column to get the information we need. As stated by Kosta, for privacy purposes, we're restricting access to the currently logged-in user, so users can only query results for themselves.

Is there anything we need to be aware of from a security perspective?

This sounds good. Someone from Privacy Engineering (@JFishback_WMF etc.) should be able to follow up with you soon. Thanks.

Pinging @JFishback_WMF and Privacy Engineering, could you please let us know if you have any objections? (Or let us know when you think you might be able to let us know your assessment?)

Hello @kostajh - we'll add this to our next sprint. Part of the team will be off for the upcoming holidays but I'll see if someone can review it in the meantime.

Hello @kostajh

I examined the proposed API through common privacy risk categories:

  • PII processing and user identification: The API output is expected to include revision details such as the timestamp, the page title, and the username of the person who sent the thanks. These details will most likely not be personally identifiable, as most contributors do not use their real names as usernames. This does not prevent a malicious actor from trying to correlate the username with other datasets, but the likelihood of them using the API for that is low: the API provides limited information, and there are probably other, more efficient ways to do it.
  • Notice, consent: Contributors who use the Thanks feature are informed that their action will “publicly send thanks” and are prompted to confirm their action. It can safely be assumed that the feature isn’t conflicting with the initial purpose for which consent was granted.
  • Data correction: AFAIK, once you've thanked someone, you can’t “unthank” them. While this means users have no way to exercise control over their information by withdrawing their action, it is beyond the scope of the API, which only retrieves data rather than creating it.
  • Data security: By restricting access to logged-in users and filtering the API results to the currently logged-in user, the proposal ensures confidentiality. It also prevents unauthorized secondary uses such as profiling users based on whom they thank.
  • Data sharing with external parties: The API would be restricted to authenticated users and would only display Thanks related to the logged-in user. This would prevent sharing with third parties altogether.

Overall, the risk would be LOW from a privacy standpoint. This is mostly explained by the limited data processing and the built-in mitigations: access control and data minimization.