Overview
This checklist provides actionable steps to verify the security of the ImageBrowsing feature before deployment to production Wikipedia for A/B testing.
1. XSS Protection - v-html Usage Audit
Caption HTML Sanitization
- Confirm that all caption sources (label, figcaption, alt text) are coming from MW-parsed content which has been is sanitized
- Confirm that all uses of v-html in the feature will only receive sanitized content; no user-controllable content should be able to reach these areas
After reviewing, I feel like our usage of v-html should be considered safe.
Usage locations:
- VisualTableOfContentsItem.vue:25
- DetailViewCaption.vue:11
Security rationale:
We are re-displaying content that has already gone through MW's parser without changing it in any way.
We have also take care to ensure that plain text content (like Wikidata labels, which may not be sanitized in the same way) are never put into v-html contexts.
Trust model:
We trust MediaWiki's parser output because:
- The parser uses a comprehensive Sanitizer class to strip dangerous HTML
- Script tags, event handlers, and javascript: URLs are removed
- This is the same HTML already displayed on the Wikipedia page
- Any XSS vulnerability here would affect Wikipedia itself, not just our extension
2. Third-Party JavaScript Library Security
Audit NPM packages
- NPM packages (all dev dependencies in this project) have been audited with a vulnerability scanning tool like Snyk and any discovered issues have been patched.
- Patch to update Glob to a supported version is here: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ReaderExperiments/+/1192912
Audit Foreign Resources
Review resources/lib/foreign-resources.yaml:
- fast-average-color (v9.5.0)
- Current version: 9.5.0
- License: MIT
- Usage: Color extraction from images
- Action: Check for known vulnerabilities: Search CVE databases and npm audit
- Action: Verify integrity hash matches: sha512-nC6x2YIlJ9xxgkMFMd1BNoM1ctMjNoRKfRliPmiEWW3S6rLTHiQcy9g3pt/xiKv/D0NAAkhb9VyV+WJFvTqMGg==
- Action: Review source code at https://github.com/fast-average-color/fast-average-color for suspicious patterns
- Action: Check latest version available and changelog for security fixes
- smartcrop (v2.0.5)
- Current version: 2.0.5
- License: MIT
- Usage: Image cropping intelligence
- Action: Check for known vulnerabilities: Search CVE databases and npm audit
- Action: Verify integrity hash matches: sha384-MFx/krFqqAwdVQcbFsG7NZ6io1/k+p/zRG1aJxYtjjnzaiwRUxVslq915QQ/c3Cl
- Action: Review source code at https://github.com/jwagner/smartcrop.js for suspicious patterns
- Action: Check latest version available and changelog for security fixes
Foreign resources have been manually reviewed. Interestingly it looks like the Smartcrop library was re-formatted locally at some point (probably unintentionally). This patch restores all project foreign resources to their pristine versions to ensure that integrity is maintained: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ReaderExperiments/+/1192958/1
3. API Security & Data Flow
MediaWiki API Usage
- Review API interactions in useMwApi.js
- Verify mw.ForeignApi is initialized with { anonymous: true } (line 11)
- Confirm no authentication tokens or sensitive data in API calls
- Action: Verify all API calls are read-only (no POST/write operations)
- Audit API endpoints called
- useEntityId.js: Wikibase entity lookups
- useMediaSearchResults.js: Commons media search
- useExternalImages.js: External wiki image queries
- Action: Verify all queries use proper parameter escaping
Posted a small patch to throw errors on bad entity IDs (which are used as URL params in useMediaSearchResults
External Wiki Configuration
- Review external wiki allowlist in extension.json:28-34
- Verify only trusted wikis: wikipedia.org and wikidata.org
- Action: Confirm no arbitrary external URLs can be injected
- Action: Test that user input cannot override allowed wikis list
These config variables are set by administrators and should be considered safe; if we are dealing with a malicious admin we have bigger problems.
4. Input Validation & Sanitization
Image URL Handling
- Review URL parsing in thumbExtractor.js:72-113
- Uses mw.util.parseImageUrl() - MediaWiki core utility
- Verify URLs are validated before use
- Action: Test with malformed/malicious image URLs
- Action: Verify crossorigin="anonymous" attribute on all image urls
Alt Text Escaping
- Verify HTML escaping of alt text
- Confirmed: mw.html.escape() used anywhere alt text is displayed
- Action: Test with alt text containing HTML entities and script tags
- Action: Verify escaping is consistent across all components
Posted a patch here that ensures we handle HTML content (fig caption, paragraph text) and plain text (alt text, label) in different ways, ensuring that plain text content is never passed into v-html.
5. Client-Side Security
Local Storage / Caching
- Review useBackgroundColor.js
- Uses in-memory Map for caching (lines 45, 66)
- No localStorage or persistent storage used
- Action: Verify no sensitive data stored client-side
Router Security
- Verify router usage (dependency: mediawiki.router)
- Check for open redirect vulnerabilities
- Verify URL parameters are properly validated
- Action: Test with malicious URL fragments and query parameters
6. Server-Side Security
PHP Hook Review
- Audit Hooks.php
- Action: Confirm no user input reflected in server-side output without sanitization
7. Manual Security Review
- Code review checklist
- All v-html uses documented and justified
- No eval(), Function(), or innerHTML assignments from user input
- All external data properly validated
- No sensitive data in client-side code or console logs
8.. Pre-Deployment Final Checks
- Run full test suite including security tests
- Run npm audit and resolve any high/critical vulnerabilities
- Review all foreign resource versions for known CVEs
- Conduct manual security review