Page MenuHomePhabricator

Image Browsing: Self-service security review checklist
Closed, ResolvedPublic5 Estimated Story Points

Description

Overview

This checklist provides actionable steps to verify the security of the ImageBrowsing feature before deployment to production Wikipedia for A/B testing.


1. XSS Protection - v-html Usage Audit
Caption HTML Sanitization
  • Confirm that all caption sources (label, figcaption, alt text) are coming from MW-parsed content which has been is sanitized
  • Confirm that all uses of v-html in the feature will only receive sanitized content; no user-controllable content should be able to reach these areas

After reviewing, I feel like our usage of v-html should be considered safe.

Usage locations:

  • VisualTableOfContentsItem.vue:25
  • DetailViewCaption.vue:11

Security rationale:
We are re-displaying content that has already gone through MW's parser without changing it in any way.

We have also take care to ensure that plain text content (like Wikidata labels, which may not be sanitized in the same way) are never put into v-html contexts.

Trust model:
We trust MediaWiki's parser output because:

  • The parser uses a comprehensive Sanitizer class to strip dangerous HTML
  • Script tags, event handlers, and javascript: URLs are removed
  • This is the same HTML already displayed on the Wikipedia page
  • Any XSS vulnerability here would affect Wikipedia itself, not just our extension
2. Third-Party JavaScript Library Security
Audit NPM packages
Audit Foreign Resources

Review resources/lib/foreign-resources.yaml:

  • fast-average-color (v9.5.0)
    • Current version: 9.5.0
    • License: MIT
    • Usage: Color extraction from images
    • Action: Check for known vulnerabilities: Search CVE databases and npm audit
    • Action: Verify integrity hash matches: sha512-nC6x2YIlJ9xxgkMFMd1BNoM1ctMjNoRKfRliPmiEWW3S6rLTHiQcy9g3pt/xiKv/D0NAAkhb9VyV+WJFvTqMGg==
    • Action: Review source code at https://github.com/fast-average-color/fast-average-color for suspicious patterns
    • Action: Check latest version available and changelog for security fixes
  • smartcrop (v2.0.5)
    • Current version: 2.0.5
    • License: MIT
    • Usage: Image cropping intelligence
    • Action: Check for known vulnerabilities: Search CVE databases and npm audit
    • Action: Verify integrity hash matches: sha384-MFx/krFqqAwdVQcbFsG7NZ6io1/k+p/zRG1aJxYtjjnzaiwRUxVslq915QQ/c3Cl
    • Action: Review source code at https://github.com/jwagner/smartcrop.js for suspicious patterns
    • Action: Check latest version available and changelog for security fixes

Foreign resources have been manually reviewed. Interestingly it looks like the Smartcrop library was re-formatted locally at some point (probably unintentionally). This patch restores all project foreign resources to their pristine versions to ensure that integrity is maintained: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ReaderExperiments/+/1192958/1

3. API Security & Data Flow
MediaWiki API Usage
  • Review API interactions in useMwApi.js
    • Verify mw.ForeignApi is initialized with { anonymous: true } (line 11)
    • Confirm no authentication tokens or sensitive data in API calls
    • Action: Verify all API calls are read-only (no POST/write operations)
  • Audit API endpoints called
    • useEntityId.js: Wikibase entity lookups
    • useMediaSearchResults.js: Commons media search
    • useExternalImages.js: External wiki image queries
    • Action: Verify all queries use proper parameter escaping

Posted a small patch to throw errors on bad entity IDs (which are used as URL params in useMediaSearchResults

External Wiki Configuration
  • Review external wiki allowlist in extension.json:28-34
    • Verify only trusted wikis: wikipedia.org and wikidata.org
    • Action: Confirm no arbitrary external URLs can be injected
    • Action: Test that user input cannot override allowed wikis list

These config variables are set by administrators and should be considered safe; if we are dealing with a malicious admin we have bigger problems.

4. Input Validation & Sanitization
Image URL Handling
  • Review URL parsing in thumbExtractor.js:72-113
    • Uses mw.util.parseImageUrl() - MediaWiki core utility
    • Verify URLs are validated before use
    • Action: Test with malformed/malicious image URLs
    • Action: Verify crossorigin="anonymous" attribute on all image urls
Alt Text Escaping
  • Verify HTML escaping of alt text
    • Confirmed: mw.html.escape() used anywhere alt text is displayed
    • Action: Test with alt text containing HTML entities and script tags
    • Action: Verify escaping is consistent across all components

Posted a patch here that ensures we handle HTML content (fig caption, paragraph text) and plain text (alt text, label) in different ways, ensuring that plain text content is never passed into v-html.

5. Client-Side Security
Local Storage / Caching
  • Review useBackgroundColor.js
    • Uses in-memory Map for caching (lines 45, 66)
    • No localStorage or persistent storage used
    • Action: Verify no sensitive data stored client-side
Router Security
  • Verify router usage (dependency: mediawiki.router)
    • Check for open redirect vulnerabilities
    • Verify URL parameters are properly validated
    • Action: Test with malicious URL fragments and query parameters
6. Server-Side Security
PHP Hook Review
  • Audit Hooks.php
    • Action: Confirm no user input reflected in server-side output without sanitization
7. Manual Security Review
  • Code review checklist
    • All v-html uses documented and justified
    • No eval(), Function(), or innerHTML assignments from user input
    • All external data properly validated
    • No sensitive data in client-side code or console logs
8.. Pre-Deployment Final Checks
  • Run full test suite including security tests
  • Run npm audit and resolve any high/critical vulnerabilities
  • Review all foreign resource versions for known CVEs
  • Conduct manual security review

Event Timeline

ovasileva set the point value for this task to 8.
ovasileva changed the point value for this task from 8 to 5.

Snyk found one issue in project NPM dependencies:

Testing /Users/eric/Dev/wmf/mw-homebrew/mediawiki/extensions/ReaderExperiments...

Tested 839 dependencies for known issues, found 1 issue, 4 vulnerable paths.


Issues to fix by upgrading:

  Upgrade grunt-eslint@24.3.0 to grunt-eslint@25.0.0 to fix
  ✗ Missing Release of Resource after Effective Lifetime [Medium Severity][https://security.snyk.io/vuln/SNYK-JS-INFLIGHT-6095116] in inflight@1.0.6
    introduced by grunt@1.6.1 > glob@7.1.7 > inflight@1.0.6 and 3 other path(s)

This seems like a relatively minor vulnerability which could lead to memory leaks and program crashes (as opposed to data leakage / remote execution / etc).

The inflight NPM package is no longer supported or maintained; it winds up as a transitive dependency in Grunt. Unlike inflight, Grunt is still maintained (at least in theory – the last release came out in 2023). Someone has already filed an issue on Github highlighting this dependency (back in August – see here) but there has been no response from the maintainers as of October 1.

Grunt uses an old version of the package Glob which is no longer supported (more recent versions of Glob, v9.0 and later, are still supported).

As a work-around here, I'm going to post a patch that overrides our project's version of the Glob package from v7.x (unsupported) to v10.x (supported). There seems to be no change in behavior that impacts us, and scripts like npm run test (which rely on Grunt) continue to work fine locally.

No issues with Composer deps found.

Change #1192912 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[mediawiki/extensions/ReaderExperiments@master] Dependencies: Add snyk, override Glob to use supported version

https://gerrit.wikimedia.org/r/1192912

Change #1192958 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[mediawiki/extensions/ReaderExperiments@master] Dependencies: Ensure local foreign resources are unmodified

https://gerrit.wikimedia.org/r/1192958

Change #1192969 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[mediawiki/extensions/ReaderExperiments@master] ImageBrowsing: Reject bad entityIds in useMediaSearchResults

https://gerrit.wikimedia.org/r/1192969

Change #1192912 merged by jenkins-bot:

[mediawiki/extensions/ReaderExperiments@master] Dependencies: Override Glob to use supported version

https://gerrit.wikimedia.org/r/1192912

Change #1192958 merged by jenkins-bot:

[mediawiki/extensions/ReaderExperiments@master] Dependencies: Ensure local foreign resources are unmodified

https://gerrit.wikimedia.org/r/1192958

Change #1193267 had a related patch set uploaded (by Eric Gardner; author: Eric Gardner):

[mediawiki/extensions/ReaderExperiments@master] ImageBrowsing: Never pass plain text through v-html

https://gerrit.wikimedia.org/r/1193267

Change #1192969 merged by jenkins-bot:

[mediawiki/extensions/ReaderExperiments@master] ImageBrowsing: Reject bad entityIds in useMediaSearchResults

https://gerrit.wikimedia.org/r/1192969

Change #1193267 merged by jenkins-bot:

[mediawiki/extensions/ReaderExperiments@master] ImageBrowsing: Never pass plain text through v-html

https://gerrit.wikimedia.org/r/1193267

@NBaca-WMF I believe this is ready for sign off now – let me know if you have any questions about my findings or the work that was done to address them.

Actually, it sounds like manager-level approval is sufficient here. In that case @HSwan-WMF can sign-off when ready.

Thanks for your thorough review, @egardner ! I feel comfortable signing off on this. You can consider it approved.