CSP violations with known domains in the blocked-uri are not collected by csp-report
Closed, Resolved (Public)

Description

Yesterday, while investigating T422829: Toolforge HTML head links sometimes are issued as http://<tool>.toolforge:443, I learned about csp-report, which is awesome. My browser sent the following report, but I could not find it on https://csp-report.toolforge.org/search?ft=sal

{
  "blocked-uri": "http://sal.toolforge.org:443/assets/main.css",
  "disposition": "report",
  "document-uri": "https://sal.toolforge.org/admin?p=0&q=&d=2026-03-13",
  "effective-directive": "style-src-elem",
  "original-policy": "default-src 'self' 'unsafe-eval' 'unsafe-inline' blob: data: filesystem: mediastream: *.toolforge.org wikibooks.org *.wikibooks.org wikidata.org *.wikidata.org wikimedia.org *.wikimedia.org wikinews.org *.wikinews.org wikipedia.org *.wikipedia.org wikiquote.org *.wikiquote.org wikisource.org *.wikisource.org wikiversity.org *.wikiversity.org wikivoyage.org *.wikivoyage.org wiktionary.org *.wiktionary.org *.wmcloud.org *.wmflabs.org wikimediafoundation.org mediawiki.org *.mediawiki.org wss://sal.toolforge.org; report-uri https://csp-report.toolforge.org/collect;",
  "referrer": "https://sal.toolforge.org/admin",
  "script-sample": "",
  "status-code": 200,
  "violated-directive": "style-src-elem"
}

I don't know whether things are working as intended or not, but I thought I'd report (hah!) it.

Event Timeline

I think this rule in the report processing logic may have discarded your report @fgiunchedi:

csp/__init__.py
# Apparently some User Agents don't follow the spec and ignore wildcards
# in the directive. Filter out false positives that come from that.
if RE_ALLOWED_DOMAINS.match(blocked_host):
    logger.debug("False report for %s: %s", tool, report["blocked-uri"])
    return resp

I have confirmed that toolforge.org is in the allowed_domains config list that is used to create the RE_ALLOWED_DOMAINS pattern at runtime.
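
To illustrate why the report above was dropped, here is a minimal sketch of how a pattern like RE_ALLOWED_DOMAINS could be compiled from a list of domains. The ALLOWED_DOMAINS list, the build_allowed_domains_re helper, and the exact regex shape are assumptions for illustration only; the real csp-report code may build its pattern differently.

```python
import re
from urllib.parse import urlsplit

# Hypothetical subset of the allowed_domains config list.
ALLOWED_DOMAINS = ["toolforge.org", "wikimedia.org", "wikipedia.org"]


def build_allowed_domains_re(domains):
    """Compile a pattern matching each domain and any of its subdomains."""
    alternatives = "|".join(re.escape(d) for d in domains)
    return re.compile(
        r"^(?:[A-Za-z0-9-]+\.)*(?:" + alternatives + r")$", re.IGNORECASE
    )


RE_ALLOWED_DOMAINS = build_allowed_domains_re(ALLOWED_DOMAINS)

blocked_uri = "http://sal.toolforge.org:443/assets/main.css"
# Only the host is extracted; the scheme and port are ignored.
blocked_host = urlsplit(blocked_uri).hostname

print(blocked_host)
print(bool(RE_ALLOWED_DOMAINS.match(blocked_host)))
```

Under these assumptions the check only ever sees the host, so a blocked-uri with the wrong scheme or port on an allowed domain looks identical to a spec-violating false positive and is discarded.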

bd808 renamed this task from "sal csp violation not showing on csp-report" to "CSP violations with known domains in the blocked-uri are not collected by csp-report". Apr 13 2026, 3:15 PM

I guess the question to ask now is whether the rule that past me added, to filter out some obviously false-positive reports from what I assume were older User-Agents, is a good thing to retain.

I would argue that keeping the dashboards from reporting confusing things (like the CSP policy blocking sal.toolforge.org) helps end users understand a potentially nuanced data set. I think we should look for other mechanisms to discover implementation problems, such as whatever led a User-Agent to attempt HTTP over port 443, which is apparently what triggered the missing report.
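
One such mechanism could be a separate check for scheme and port mismatches in the blocked-uri. The sketch below is purely illustrative: has_scheme_port_mismatch and DEFAULT_PORTS are hypothetical names, not part of csp-report.

```python
from urllib.parse import urlsplit

# Default port for each scheme. A URI that uses the other scheme's
# default port (http on 443, https on 80) is treated as suspicious.
DEFAULT_PORTS = {"http": 80, "https": 443}


def has_scheme_port_mismatch(uri):
    """Return True when a URI pairs http with port 443 or https with port 80."""
    parts = urlsplit(uri)
    try:
        port = parts.port
    except ValueError:
        # Non-numeric or out-of-range port; not the mismatch we look for.
        return False
    if port is None or parts.scheme not in DEFAULT_PORTS:
        return False
    other_defaults = {p for s, p in DEFAULT_PORTS.items() if s != parts.scheme}
    return port in other_defaults


print(has_scheme_port_mismatch("http://sal.toolforge.org:443/assets/main.css"))
```

A check like this could be logged or counted before the allowed-domains filter runs, so the operational signal survives even when the report itself is hidden from maintainers.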

T422829: Toolforge HTML head links sometimes are issued as http://<tool>.toolforge:443 seems to have been the problem behind the scenes here that led to the discarded report.

That's fair re: user confusion concerns. From my SRE POV, I was surprised to find that the CSP report URL we announce filters out reports that are legitimate, albeit confusing to tool maintainers. I am thinking of a middle ground where we collect all reports and present the unfiltered firehose only on demand. Retention for the known-domain reports can of course be short, as we only care about them for operational problems. What do you think?

I'm not sure where in the current pipeline I would store reports with a different retention pattern, or where I would insert output filtering to screen that noise from the maintainer-facing interface. I have no objection to someone figuring those things out and implementing them if it would add value for administrative investigations.
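
As a rough sketch of the middle ground discussed here, reports could be tagged at collection time instead of dropped, with the filtering applied only at display time. Everything below is an assumption for illustration: the known_domain flag, annotate_report, visible_reports, and the simplified pattern are hypothetical and do not reflect the actual csp-report pipeline or storage.

```python
import re
from urllib.parse import urlsplit

# Simplified stand-in for the real RE_ALLOWED_DOMAINS pattern.
RE_ALLOWED_DOMAINS = re.compile(r"^(?:[a-z0-9-]+\.)*toolforge\.org$", re.IGNORECASE)


def annotate_report(report):
    """Return a copy of the report with a hypothetical known_domain flag."""
    host = urlsplit(report.get("blocked-uri", "")).hostname or ""
    annotated = dict(report)
    annotated["known_domain"] = bool(RE_ALLOWED_DOMAINS.match(host))
    return annotated


def visible_reports(reports, firehose=False):
    """Hide known-domain reports unless the full firehose is requested."""
    return [r for r in reports if firehose or not r["known_domain"]]


reports = [
    annotate_report({"blocked-uri": "http://sal.toolforge.org:443/assets/main.css"}),
    annotate_report({"blocked-uri": "https://cdn.example.com/lib.js"}),
]

print(len(visible_reports(reports)))
print(len(visible_reports(reports, firehose=True)))
```

With a flag like this, a shorter retention period for known-domain reports would become a cleanup job keyed on the flag rather than a decision made at collection time.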

Ok! That's now T423847: Store and optionally show the full firehose in csp-report. Resolving this task.

fgiunchedi claimed this task.