
See if we can borrow parts of the wikiprod WAF for Toolforge
Open, High · Public

Description

We should see if we can re-use parts of the wikiprod WAF stack to reduce the impact of scrapers on Toolforge tools. In particular, the tools-infrastructure-team meeting pointed out that the requestctl interface for managing rules as well as the traffic (cloud) source classification data could be useful.

Event Timeline

Restricted Application added a subscriber: Aklapper.

One of the big things that makes hiddenparma useful on the prod edge is the webrequest-live traffic analysis dashboard on superset.wikimedia.org. I'm sure that we can benefit from the etcd-managed rule sets without that traffic visibility, but having it is what really makes the whole prod edge management system work.

taavi triaged this task as High priority. Nov 12 2025, 3:04 PM

the requestctl interface for managing rules as well as the traffic (cloud) source classification data could be useful

This was discussed during the recent WE5+6 offsite, and IIUC @Joe thinks that requestctl and hiddenparma are too tailored to production use cases to be easily reusable, but other parts of the production tool set might be easier to reuse (namely the Lua rules used to filter traffic in HAProxy).

the webrequest-live traffic analysis dashboard on superset.wikimedia.org

This was also discussed, and it should be possible to replicate the Kafka-based log ingestion in cloud to get a similar level of observability for cloud traffic. A more lightweight alternative could be https://goaccess.io/
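For a rough sense of the kind of per-client visibility being discussed, here is a minimal sketch that counts requests per (IP, user agent) pair from access log lines. It assumes the proxy writes Combined Log Format, which nothing in this task confirms. The `top_clients` helper and the sample lines are made up for illustration. This is not how webrequest-live or goaccess work internally, only a toy version of the same idea.

```python
import re
from collections import Counter

# Assumed format (Combined Log Format):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def top_clients(lines, n=10):
    """Count requests per (ip, user-agent) pair and return the n busiest."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        counts[(match["ip"], match["agent"])] += 1
    return counts.most_common(n)


# Hypothetical sample lines, made up for illustration only.
SAMPLE = [
    '192.0.2.10 - - [12/Nov/2025:15:04:00 +0000] "GET /tool-a/ HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [12/Nov/2025:15:04:01 +0000] "GET /tool-a/x HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [12/Nov/2025:15:04:02 +0000] "GET /tool-b/ HTTP/1.1" 200 128 "-" "Mozilla/5.0"',
    'not a log line',
]

if __name__ == "__main__":
    for (ip, agent), hits in top_clients(SAMPLE):
        print(f"{hits:6d}  {ip}  {agent}")
```

Something like this would only show the loudest clients in a single log file; it has no notion of time windows, cloud-provider classification, or rule management.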