
Security team input on Wikimedia Developer Portal static site
Closed, ResolvedPublic

Description

Basic Information Section

Brief description

  • The Wikimedia-Developer-Portal project aims to deploy a new static website in April-June 2022 to help folks navigate developer-facing documentation related to Wikimedia projects. As a static site it seems unlikely that a full Application Security Review will be done, but we would like to give the Security-Team an opportunity to highlight any concerns or best practices they recommend prior to launch.

Do you have a project/product/program plan or documentation?

Primary Contacts

What Security Team services do you anticipate needing?

  • I don’t know

What is the 'go live' date for deployment of this project?

"FY 20/21 Q4"


Privacy Information Section

Will any sensitive data be collected, stored, or exposed?

  • no

Technical Information Section

Do related discussions exist in Phab, on wiki, or in an RFC?

  • no

Technology Stack

  • mkdocs static site generator

Security Readiness Review Section

Code

Post-deployment

  • Technical Engagement, @bd808

Working test environment

Details

Author Affiliation
WMF Technology Dept

Event Timeline

Hey @bd808 @Aklapper -

> As a static site it seems unlikely that a full Application Security Review will be done, but we would like to give the Security-Team an opportunity to highlight any concerns or best practices they recommend prior to launch.

Sure, thanks for letting us know. Assuming that no new code or a significant refactor of existing code is being introduced here - be it an mw extension or gadget or something - I think we're fine to resolve this and rate it low risk.

Well, maybe I spoke too soon after having a glance at https://gerrit.wikimedia.org/g/wikimedia/developer-portal/+/refs/heads/main. Is the general workflow that this tool is used to generate the static site - that is, purely static assets like html and css files - which are then deployed to the developer portal site? Via a gerrit change set which would go through standard code review prior to deployment? That still sounds fine to me, but please let me know if I'm misunderstanding some things here.

> Is the general workflow that this tool is used to generate the static site - that is, purely static assets like html and css files - which are then deployed to the developer portal site? Via a gerrit change set which would go through standard code review prior to deployment? That still sounds fine to me, but please let me know if I'm misunderstanding some things here.

Yes, that sounds basically like our current implementation plan. We will be using https://www.mkdocs.org/ as a static site generator to composite data from YAML and Markdown files into HTML templates, resulting in static HTML with CSS, JS, and other media assets. Content changes to the YAML and Markdown files will go through normal code review. Localized strings will also come into the repo via typical translatewiki.net integration mechanisms (PO files exported from their message space for the project). JavaScript is currently planned for progressive enhancement features only, although one of those "enhancements" will be a search system built using https://lunrjs.com/ (currently handled by the mkdocs theme we have chosen).
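To make the "data plus templates in, static HTML out" idea concrete, here is a minimal sketch using only the Python standard library. It is not how mkdocs works internally and it does not use any mkdocs API. The template, the `render_page` and `build_site` names, and the in-memory dict standing in for parsed YAML are all hypothetical, chosen only for illustration.

```python
import html
from pathlib import Path
from string import Template

# Hypothetical page template; the real portal uses the mkdocs theme's templates.
PAGE_TEMPLATE = Template(
    "<!DOCTYPE html>\n"
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1><p>$summary</p></body></html>\n"
)


def render_page(data: dict) -> str:
    """Fill the template with HTML-escaped values from one data record."""
    escaped = {key: html.escape(str(value)) for key, value in data.items()}
    return PAGE_TEMPLATE.substitute(escaped)


def build_site(pages: dict, out_dir: Path) -> list:
    """Write one static HTML file per record and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, data in pages.items():
        target = out_dir / f"{slug}.html"
        target.write_text(render_page(data), encoding="utf-8")
        written.append(target)
    return written
```

The point of the sketch is that everything dynamic happens at build time: the output directory holds only files, so the web server that later serves it runs no application code.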

Our plan is for the mkdocs build step to run as a PipelineLib managed job in Wikimedia's CI service resulting in a Docker container holding the generated static assets and an nginx instance configured to serve them. This will then be deployed to a Kubernetes cluster or similar container runtime and exposed via an ingress of some sort to the general internet. In the current tech stack for production Kubernetes the ingress would very likely be a pybal route to the k8s service mounted behind the text CDN edge.

@bd808 - ok, thanks for all of that information; it's very helpful. I think at the bare minimum, the Security-Team will want to have a quick look at mkdocs and lunrjs as well as https://gerrit.wikimedia.org/r/admin/repos/wikimedia/developer-portal. The latter looked fine to me after a quick glance, but we should likely spend a bit more time running both our automated and manual checks against the code.

One quick question - did you have a more specific deployment date in mind than FY 20/21 Q4? I assume this means "at some point during the first half of 2022", correct?

> One quick question - did you have a more specific deployment date in mind than FY 20/21 Q4? I assume this means "at some point during the first half of 2022", correct?

At the moment it is a very wide target of deploying in April-June 2022. The gerrit repo is in its very early stages of moving from a proof of concept hack towards the real implementation. We still need to implement T297168: Write localization/internationalization mkdocs plugin at the very least, and I have really just started on that task this week.

Security Review Summary - T297167 - 2022-03-14
Last commit reviewed: f6c0c04

Summary

Low risk confirmed after review. Keeping dependencies up to date would be ideal.

Vulnerable Packages - Production
None found (risk: low)
Ran python safety, python bandit

Vulnerable Packages - Development
None found (risk: low)
Ran python safety, python bandit

Outdated Packages
As reported via poetry show --outdated:
(no explicit vulnerabilities reported, simply noting for completeness' sake.)

| Package | Current | Wanted | Description |
| --- | --- | --- | --- |
| attrs | 21.2.0 | 21.4.0 | Classes Without Boilerplate |
| black | 21.12b0 | 22.1.0 | The uncompromising code formatter |
| click | 8.0.3 | 8.0.4 | Composable command line interface toolkit |
| entrypoints | 0.3 | 0.4 | Discover and load entry points from installed packages |
| flake8 | 3.9.2 | 4.0.1 | the modular source code checker pep8 pyflakes and co |
| flake8-bugbear | 21.11.29 | 22.1.11 | A plugin for flake8 finding likely bugs and design problems |
| flake8-comprehensions | 3.7.0 | 3.8.0 | A flake8 plugin to help you write better comprehensions |
| importlib-metadata | 4.8.2 | 4.11.3 | Read metadata from Python packages |
| markupsafe | 2.0.1 | 2.1.0 | Safely add untrusted strings to markup. |
| mccabe | 0.6.1 | 0.7.0 | McCabe checker, plugin for flake8 |
| mdpo | 0.3.81 | 0.3.85 | Markdown files translation using PO files. |
| mkdocs-macros-plugin | 0.6.3 | 0.6.4 | Unleash the power of MkDocs with macros and variables |
| mkdocs-material | 7.3.6 | 8.2.5 | A Material Design theme for MkDocs |
| platformdirs | 2.4.0 | 2.5.1 | A small Python module for determining appropriate platform-specific |
| pycodestyle | 2.7.0 | 2.8.0 | Python style guide checker |
| pyflakes | 2.3.1 | 2.4.0 | passive checker of Python programs |
| pygments | 2.10.0 | 2.11.2 | Pygments is a syntax highlighting package written in Python. |
| pymdown-extensions | 9.1 | 9.2 | Extension pack for Python Markdown. |
| pyparsing | 3.0.6 | 3.0.7 | Python parsing module |
| tomli | 1.2.3 | 2.0.1 | A lil' TOML parser |
| typing-extensions | 4.0.1 | 4.1.1 | Backported and Experimental Type Hints for Python 3.6+ |
| urllib3 | 1.26.7 | 1.26.8 | HTTP library with thread-safe connection pooling, file post, and more |
| zipp | 3.6.0 | 3.7.0 | Backport of pathlib-compatible object wrapper for zip files |
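For anyone who wants to triage a list like this programmatically, here is a rough sketch that parses such a report and flags entries whose leading version component changes. It assumes each line is whitespace-separated as name, current version, newer version, then a free-text description. That layout is an assumption about the report text, not a documented interface, and the `OutdatedPackage`, `parse_outdated`, and `major_bumps` names are hypothetical.

```python
from typing import NamedTuple


class OutdatedPackage(NamedTuple):
    name: str
    current: str
    latest: str
    description: str


def parse_outdated(report: str) -> list:
    """Split each line into name, current, latest, and description."""
    packages = []
    for line in report.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, current, latest = fields[:3]
        description = fields[3] if len(fields) > 3 else ""
        packages.append(OutdatedPackage(name, current, latest, description))
    return packages


def major_bumps(packages: list) -> list:
    """Return packages whose first version component differs."""
    return [
        p for p in packages
        if p.current.split(".")[0] != p.latest.split(".")[0]
    ]


# Sample rows copied from the report above (assumed column layout).
SAMPLE = """\
click 8.0.3 8.0.4 Composable command line interface toolkit
mkdocs-material 7.3.6 8.2.5 A Material Design theme for MkDocs
tomli 1.2.3 2.0.1 A lil' TOML parser
"""
```

Calling `major_bumps(parse_outdated(SAMPLE))` picks out mkdocs-material and tomli. Note that calendar-versioned packages such as black and flake8-bugbear would also be flagged by this simple first-component comparison, so the result is a prompt for a closer look, not a verdict.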

Other Vulnerable Code
Nothing found from snyk, semgrep (risk: low)

Miscellaneous Issues/Points of Discussion/Nits

  1. Updating the out-of-date Python packages would be nice, but is not required
  2. Ensure that the development server is not run in production, since it has a known path traversal issue
  3. Lunr.js appears to have no security issues
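As background on point 2, path traversal generally means a request path such as `/../../etc/passwd` resolving to a file outside the directory being served. The sketch below shows one common guard in plain Python. It is not the development server's code or its fix, and the `resolve_static_path` helper and its arguments are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_static_path(site_root: Path, request_path: str) -> Optional[Path]:
    """Map a URL path to a file under site_root, or None if it escapes it."""
    root = site_root.resolve()
    # Resolve ".." segments and symlinks before checking containment.
    candidate = (root / request_path.lstrip("/")).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        return None  # the resolved path lies outside the served directory
    return candidate
```

The sketch assumes `request_path` has already been percent-decoded by the caller. In the planned deployment, nginx serves the generated files, so this kind of check is the web server's job and not something the project needs to implement itself.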
sbassett triaged this task as Medium priority.
sbassett moved this task from In Progress to Our Part Is Done on the secscrum board.
Mstyles changed the visibility from "Custom Policy" to "All Users".
Mstyles changed the edit policy from "Custom Policy" to "All Users".
sbassett changed the visibility from "All Users" to "Public (No Login Required)". Apr 5 2022, 4:52 PM