
Application Security Review Request : OurWorldInData
Closed, Declined; Public

Description

Project Information

Description of the tool/project:
Provides a parser tag to embed interactive charts into a page, using freely-licensed data sourced from OurWorldInData.

Description of how the tool will be used at WMF:
Basque Wikipedia would like for the extension to be enabled on their wiki.

Dependencies

List dependencies, or upstream projects that this project relies on.

For Wikimedia deployment specifically, this will depend on a mirroring service available on WMCS: https://owidm.wmcloud.org/. This site will be used as the iframe target rather than the OurWorldInData main site, both for privacy reasons (not sending visitor IPs to a site operated by a 3rd party) and because the mirror site makes some changes to the embedded display for better on-wiki appearance.

Has this project been reviewed before?

Please link to tasks or wiki pages of previous reviews.

No

Working test environment

Please link or describe setup process for setting up a test environment.

The extension is currently deployed on MDWiki: https://mdwiki.org. To create your own test environment, simply enable the extension. The Usage section on the extension's documentation page gives some examples of how to use it.

Post-deployment

Name of team responsible for tool/project after deployment and primary contact.

Not sure. Platform Engineering maybe?

Details

Risk Rating
Low

Event Timeline

sbassett changed the task status from Open to In Progress. Jan 4 2023, 5:25 PM
sbassett claimed this task.
sbassett triaged this task as Medium priority.
sbassett moved this task from Upcoming Quarter Planning Queue to In Progress on the secscrum board.

Tagging Privacy Engineering for an opinion/risk rating about the following. I'm not certain there's precedent for this on Wikimedia production or that wmcs would completely satisfy any privacy concerns for proposed, embedded content like this.

For Wikimedia deployment specifically, this will depend on a mirroring service available on WMCS: https://owidm.wmcloud.org/. This site will be used as the iframe target rather than the OurWorldInData main site, both for privacy reasons (not sending visitor IPs to a site operated by a 3rd party) and because the mirror site makes some changes to the embedded display for better on-wiki appearance.

Hello @Skizzerz, is there a publicly accessible repository for the source code of https://owidm.wmcloud.org?

Tagging Privacy Engineering for an opinion/risk rating about the following. I'm not certain there's precedent for this on Wikimedia production or that wmcs would completely satisfy any privacy concerns for proposed, embedded content like this.

There are a few privacy issues with embedding an external page in production wikis, as currently envisioned.

  • First, although I am assuming good faith here, making an HTTP request to the proxy (owidm.wmcloud.org) could allow the maintainers of the WMCS tool to collect user information. While it is not clear how the mirroring is currently performed, it would not be far-fetched to imagine IP addresses and User Agents being harvested and stored at the server level of the proxy (using server variables in PHP, for example).
  • Second, any malicious actor gaining access to the WMCS-hosted mirror of owidm could repurpose it, alter its output, or prompt users for additional information, including all sorts of sensitive PII such as ethnicity or financial details.
  • Third, threat actors could infer more details about the identity of users by combining the IP addresses and User Agents with details about the on-wiki pages where the iframes are embedded (referrer URL). In most cases, this would not reveal accurately which IP and UA belong to which users. However, if the data visualization is embedded in a low-visibility page such as a user talk page, threat actors could deduce additional PII about a very specific group of users.
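
To make the three points above concrete, here is a minimal Python sketch of the request metadata that any server behind an iframe target can read. It is not the mirror's actual code, which has not been shared in this task. The function name `observable_metadata`, the sample values, and the use of a WSGI-style `environ` dictionary are all assumptions made for illustration.

```python
def observable_metadata(environ):
    """Return the visitor details a server could read from one request.

    `environ` is a WSGI-style dictionary of request variables. This is an
    illustrative sketch, not the code running on owidm.wmcloud.org.
    """
    return {
        "ip": environ.get("REMOTE_ADDR"),
        "user_agent": environ.get("HTTP_USER_AGENT"),
        "referrer": environ.get("HTTP_REFERER"),
    }


# Hypothetical request for a chart embedded on a user talk page.
sample_environ = {
    "REMOTE_ADDR": "203.0.113.7",  # address from a documentation-only range
    "HTTP_USER_AGENT": "ExampleBrowser/1.0",
    "HTTP_REFERER": "https://eu.wikipedia.org/wiki/User_talk:Example",
}

print(observable_metadata(sample_environ))
```

Whether the full referring URL actually reaches the server depends on the browser's referrer policy, so the `referrer` field may be shortened or absent in practice.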

A series of mitigating strategies could be considered to reduce risks:

  • Refrain from storing user information that is sent to the proxy. This mitigation is limited since a maintainer or any malicious actor with access to the WMCS instance could alter the code and have it do whatever they would like. Also, I assume WMCS administrators have access to web logs anyway.
  • Use a publicly available code repository for the WMCS-hosted proxy. This would bring a bit more transparency into what the proxy is expected to do. This mitigation suffers from the same limitation as above, since there is little control over the codebase behind the proxy; for example, unlike production code, it isn’t strictly controlled by a CI/CD pipeline, and the proxy codebase could be changed without most people knowing it.
  • Ensure strict access control for the proxy instance in WMCS. This is already partly in place since WMCS requires developer access. However, because WMCS is not as secure as Production, the proxy website could be compromised more easily.
  • Properly inform users of the data storage and sharing risks. This would help end-users make informed decisions about the use of that feature. That being said, in case of PII misuse, having informed the users in the first place will do little to reduce the impact once the privacy violation has occurred.

With the above in mind, our privacy threat modeling has rated the initial risk level as HIGH. Because the mitigations above have significant limitations, applying them would only bring the overall risk level down to MEDIUM.

sbassett changed the task status from In Progress to Open. Mar 6 2023, 3:50 PM
sbassett raised the priority of this task from Medium to Needs Triage.
sbassett moved this task from In Progress to Waiting on the secscrum board.

Thanks, @sguebo_WMF, for the privacy review. I'm going to block this review for now until the above issues are acknowledged and either accepted by the correct level of WMF staff per our current risk management framework or mitigated in some other fashion to reduce the assessed risk level.

Name of team responsible for tool/project after deployment and primary contact.

Not sure. Platform Engineering maybe?

This is a concern both for security (making sure that there are folks ready to do security bug remediation work) and also for the communities that adopt the tool. My concerns from T301044#7747899 are unresolved. Putting an extension with no long-term maintenance plan, backed by a service that also has no long-term maintenance plan, into production is not a sustainable practice.

sbassett moved this task from Waiting to Back Orders on the secscrum board.
sbassett subscribed.
sbassett removed projects: secscrum, Security.

Can be re-opened once this system has been appropriately re-architected.

A series of mitigating strategies could be considered to reduce risks:

BTW, I think the simplest mitigation strategy was missed in this list: bring the proxy into WMF production and under WMF control?

That could be part of the solution, for sure, though I'm not sure I'd agree it'd be the simplest. Any code that would live in Wikimedia production would have to be brought up to those standards (quite different from WMCS, etc.), and getting dedicated Wikimedia production hardware typically requires some fairly rigorous justification IME, more so than just "a bunch of people would really like this thing".