
Security Request For Service - Push Notifications
Closed, Resolved, Public

Description

Basic Information Section

Brief description

In fiscal Q4 2019-2020 the Product Infrastructure team plans to build the basic infrastructure for push notifications for the Wikimedia product platforms (iOS, Android, and web). The initial focus of the project will be on providing push notifications for the apps.

This project is to implement a basic push notification infrastructure for the Wikimedia products (web and apps). This infrastructure will consist of two software components: (1) a set of updates to the Echo extension to handle push; and (2) a new Node.js service (mediawiki/services/push-notifications). We are requesting input on the security and privacy aspects of the system design, as well as a pre-deployment Security Readiness Review.

Do you have a project/product/program plan or documentation?

Primary Contacts

What Security Team services do you anticipate needing?

  • Threat Modeling
  • Security Concept Review
  • Security Readiness Review

What is the 'go live' date for deployment of this project?

June 30, 2020


Privacy Information Section

Will any sensitive data be collected, stored or exposed?

  • The push notifications service will manage the storage of platform-issued subscriber tokens (for apps) or subscription data blobs (for web) provided by users. These will in turn be sent along to platform-specific push services in order to identify intended message recipients.
  • Message content may expose a user's identity or interests

Technical Information Section

Do related discussions exist in Phab, on wiki, or in an RFC?

T249065: RFC: Wikimedia Push Notification Service

Technology Stack

  • Extension: MediaWiki, PHP, MySQL
  • Service: Node.js, Cassandra

Security Readiness Review Section

An anticipatory Security Readiness Review request has already been filed: T246712: Security Readiness Review for push notifications infrastructure

Code

Post-deployment

  • Maintainers: Product Infrastructure

Working test environment

  • TBD

Details

Author Affiliation
WMF Product

Event Timeline

chasemp mentioned this in Unknown Object (Task). Apr 29 2020, 2:44 PM

Following up on some questions here as discussed by email.

I see T249065 which has since been created. Does this mean the TechCom process is on the menu now?

Yes, we decided to submit our plans for comment via the TechCom RFC process after all. I believe we're at or near the end of that process. It's been quiet, although I see we had some late-breaking comments come in last night.

T249065 outlines end game outcomes for Q4 (so July 31st of this year). Is this timeline still active given the COVID-19 availability changes?

Yes, our plan is to stick to that timeline, with the initial launch narrowly scoped to supporting only the apps (not web), and supporting only existing Echo notification types. (On a side note, I think end of Q4 is June 30, not July 31?)

Cassandra seems to be the data store of choice in the design phase. Have you spoken with the data persistence team about this? My understanding is that Cassandra currently has limited use cases and is not fully supported by SRE. That would seem to mean either that the current stewards need to be on board with supporting this use case, or that the product team needs to maintain its own Cassandra infrastructure. This may have changed in the last 18 months. Depending on who stewards the underlying data store, there are security implications inherent to support and ownership.

We've been in contact with @Eevans about prospectively having the service component of this infrastructure manage the storage of push subscriptions in Cassandra. My understanding in general is that it would be a technically feasible choice, and that there's plenty of unused capacity in the Cassandra cluster currently used for session storage.

We've been relying on Cassandra in the Foundation's services platform for a few years now, so I had assumed that there was at least some baseline guarantee of SRE support for it. That said, I don't know what the current service level agreement (if any) is between SRE and CPT with respect to Cassandra operations. We'll follow up on that.

The document is unclear to me as far as the July 31st aim of integration with Echo. Will Echo rely on Cassandra after this timeline? I'm assuming not, given the creation of an extension to leverage Echo itself, but the intended relationship between these two extensions at each implementation phase is unclear to me.

Sorry, it sounds like the diagram or description may be misleading on this point. As currently proposed, any data storage in Cassandra would be managed by the Node.js service. Any data storage needs of the PushNotifications extension will be handled in the existing MediaWiki MySQL cluster. The plan is for the service to manage device push subscription storage (in Cassandra), and for the extension to manage a (MySQL) table mapping global wiki user IDs to device push subscriptions. It will be the client's responsibility to add or remove associations between push subscriptions registered with the service and wiki user accounts; the intention is that subscriptions managed by the service be relatively long-lived (and support anonymous users), and that the mappings managed by the extension between these subscriptions and wiki user accounts be shorter-lived, i.e., created on login and deleted on logout.

Does this answer your question? Sorry, I'm not quite clear on how the timeline element plays into it. This won't create any direct dependency on Cassandra from Echo (or the PushNotifications extension), if that's what you're asking.
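
For readers following the thread, the split described above (long-lived subscriptions held by the service, shorter-lived user mappings held by the extension) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, method names, and the `sub-N` identifier format are hypothetical, and in-memory maps stand in for the Cassandra and MySQL stores named in the proposal.

```typescript
// Illustrative sketch only. All names here are hypothetical; in-memory Maps
// stand in for the proposed Cassandra (service) and MySQL (extension) stores.
type Platform = "apns" | "fcm" | "web";

interface PushSubscription {
  id: string;        // hypothetical service-issued identifier
  platform: Platform;
  token: string;     // platform-issued subscriber token or web subscription blob
}

// Service side: long-lived device subscriptions, no user identity attached.
class SubscriptionStore {
  private subs = new Map<string, PushSubscription>();
  private nextId = 1;

  register(platform: Platform, token: string): PushSubscription {
    const sub: PushSubscription = { id: "sub-" + this.nextId++, platform, token };
    this.subs.set(sub.id, sub);
    return sub;
  }

  get(id: string): PushSubscription | undefined {
    return this.subs.get(id);
  }
}

// Extension side: shorter-lived mapping from global user ID to subscription IDs.
class UserSubscriptionMap {
  private byUser = new Map<number, Set<string>>();

  // Called by the client on login.
  associate(globalUserId: number, subscriptionId: string): void {
    const ids = this.byUser.get(globalUserId) ?? new Set<string>();
    ids.add(subscriptionId);
    this.byUser.set(globalUserId, ids);
  }

  // Called by the client on logout; the subscription itself is left in place.
  dissociate(globalUserId: number, subscriptionId: string): void {
    const ids = this.byUser.get(globalUserId);
    if (ids === undefined) {
      return;
    }
    ids.delete(subscriptionId);
    if (ids.size === 0) {
      this.byUser.delete(globalUserId);
    }
  }

  subscriptionsFor(globalUserId: number): string[] {
    const ids = this.byUser.get(globalUserId);
    return ids === undefined ? [] : Array.from(ids);
  }
}
```

In this sketch a device can register a subscription before anyone logs in, and logging out removes only the user mapping, which matches the intent that subscriptions outlive the account association.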

Can we get our arms around what kind of data can be pushed and define a policy to that effect? Leveraging the providers' native platforms for messages makes sense, but only if we have strict rules about what information can transit them. These policies should be in place before any deployment, so that data is handled safely going forward.

Created T251204 re: definition of such a policy. As noted there, one option we've been exploring is that of sending no data via the third-party push services, but rather pushing empty messages that serve only to prompt the app or browser to wake up and retrieve messages from Wikimedia's servers. This would arguably muddy up the architecture somewhat, but with the benefit of effectively eliminating an entire class of potential problems around user privacy.
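
To make the zero-content option concrete, here is a minimal sketch of the idea, assuming a hypothetical first-party inbox that the client polls after being woken. None of these names come from the RFC or from any provider SDK; the point is only that the envelope handed to the third-party push service carries a device token and nothing else.

```typescript
// Illustrative sketch only. All names are hypothetical and no real
// provider SDK is used.
interface PendingNotification {
  id: number;
  userId: number;
  text: string; // potentially sensitive; never handed to the push provider
}

// Envelope given to the third-party push service: a token and an empty payload.
interface WakeUpPush {
  token: string;
  payload: Record<string, never>;
}

function buildWakeUpPush(token: string): WakeUpPush {
  return { token, payload: {} };
}

// Hypothetical first-party store of pending notifications, keyed by user ID.
class Inbox {
  private pending = new Map<number, PendingNotification[]>();

  enqueue(n: PendingNotification): void {
    const list = this.pending.get(n.userId) ?? [];
    list.push(n);
    this.pending.set(n.userId, list);
  }

  // Return and clear everything pending for the user.
  drain(userId: number): PendingNotification[] {
    const list = this.pending.get(userId) ?? [];
    this.pending.delete(userId);
    return list;
  }
}

// Client side: on receiving the empty push, fetch content from the first party.
function onWakeUp(userId: number, inbox: Inbox): PendingNotification[] {
  return inbox.drain(userId);
}
```

The cost is the extra round trip from the client back to Wikimedia's servers; the benefit, as noted above, is that notification content never transits the provider's network.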

Related to the above, the design doc currently does not address content traversal over provider platforms or encryption. Depending on how the data is classified, we may have different problems to solve, and each provider network (Google, Apple) has a different implementation of end-to-end encryption and different privacy expectations. In general, there is no trust relationship between us and these messaging backplanes, so they should be treated as being as much of a wild west as the general internet. At-risk populations will be exposed if confidentiality is not preserved.

We have looped you into that discussion, which is happening separately. (The above comment about a potential zero-content push message model also applies here.)

JFishback_WMF moved this task from Incoming to Watching on the Privacy Engineering board.
JFishback_WMF moved this task from Incoming to Waiting on the Security-Team board.
chasemp subscribed.

Jacob & Aeryn: I'm just subscribing you here to bypass the ACL so you'll have access to all relevant context across conversations.

chasemp closed subtask Restricted Task as Resolved. May 14 2020, 2:59 PM

Adding others from PI who will be involved in implementing this.

Just to be clear: is it OK to discuss potential privacy risk mitigation strategies at this point on the public RFC? I'd presume so; the code will be open-source, after all.

Absolutely, as long as folks refrain from discussing actual vulnerable code within WMF production, which of course could not be the case here.

MSantos closed subtask Restricted Task as Resolved. Sep 15 2020, 5:12 PM
sbassett claimed this task.
sbassett moved this task from Waiting to Our Part Is Done on the Security-Team board.

Security Readiness Review was performed and completed in T246712. I'm going to mark this resolved for now, as I don't believe any further reviews are scheduled.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)". May 17 2021, 4:16 PM
sbassett changed the edit policy from "Custom Policy" to "All Users".