At a high level, this will proceed in two phases: identification of known-client requests in HAProxy and rate-limiting of identified requests in Varnish, each described in detail below.
Status: As of 2025-11-25, identification is live and enabled globally in cache-text and cache-upload, and rate-limiting is ready to be enabled in cache-text (cache-upload is not yet supported, pending further discussion of an appropriate default limit).
Phase 1: Identification
In this phase, we introduce the known-client identity object and, for requests that identify as the configured User-Agent(s), the ability to:
- apply an X-Provenance label-value pair and set X-Trusted-Request to "B" for requests that originate from IP blocks authorized to do so
- deny (403) those that do not originate from those IP blocks
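The per-identity behavior above can be sketched as an HAProxy fragment. This is a hypothetical, simplified rendering for a single identity (names, header values, and file paths are illustrative, not what HIDDENPARMA actually generates); in particular, the real rendering defers denial until all identities have been evaluated, whereas this single-identity sketch denies inline:

```haproxy
# Hypothetical fragment for one known-client identity, "examplebot".
acl is_examplebot_ua  req.hdr(User-Agent) -m sub ExampleBot
acl is_examplebot_src src -f /etc/haproxy/known_clients/examplebot.ips

# Identified: UA matches and source IP is in the authorized blocks.
http-request set-header X-Provenance examplebot if is_examplebot_ua is_examplebot_src
http-request set-header X-Trusted-Request B    if is_examplebot_ua is_examplebot_src

# Impersonation: UA matches but source IP is not authorized.
http-request deny deny_status 403 if is_examplebot_ua !is_examplebot_src
```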
This includes the basic CRUD support in HIDDENPARMA for managing these identities, as well as translation to haproxy DSL fragments and integration with our haproxy config confd template.
Once this is complete, requests from known clients can be identified as such (via X-Provenance and X-Trusted-Request) and requests attempting to impersonate them can be denied, but no specialized rate limits are applied - i.e., they're treated like any other request.
Logging these classification or denial decisions (or would-be decisions, when the functionality is disabled) will be a key part of gaining confidence that we're ready to move on to the next phase.
There should be a clear, deterministic outcome in either of the following cases of multiple-match:
- A request that matches the identification or impersonation-detection behaviors of multiple known-client identities (e.g., first match wins, with identities ordered lexicographically).
- A request that is classified as impersonating one client, but successfully identifies as another (e.g., an identified request is never denied).
One could imagine the latter scenario occurring if there exist multiple similarly named clients with distinct source IPs owned by the same organization (e.g., if overlapping User-Agent patterns are incorrectly configured).
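The two resolution rules above (lexicographic first-match, and identification superseding impersonation-denial) can be expressed compactly. A minimal sketch, not the HIDDENPARMA implementation, with predicate pairs standing in for the UA and source-IP matchers:

```python
def classify(request, identities):
    """Return ("identify", name), ("deny", None), or ("pass", None).

    identities: dict mapping identity name -> (ua_match, src_match),
    each a predicate over the request.
    """
    impersonation = False
    for name in sorted(identities):          # lexicographic order
        ua_match, src_match = identities[name]
        if ua_match(request):
            if src_match(request):
                return ("identify", name)    # first match wins
            impersonation = True             # denial deferred until all rules run
    # Only deny if no identity successfully identified the request.
    return ("deny", None) if impersonation else ("pass", None)
```

Note that denial is decided only after the loop, so a request flagged as impersonating one (lexicographically earlier) identity is still identified by a later one.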
TODO:
- Priorities: Currently, known client rules are ordered lexicographically by name - i.e., there's no way for the user to control the order in which they're applied. This is different from action-type entities, which have a priority that determines sort order. While I anticipate that the behavior described above (denial is deferred until all rules have been applied, and is superseded by identification) will obviate certain use cases for fine control over rule ordering, it probably won't cover everything. Thus, we should consider introducing a similar priority mechanism, with name functioning only as a tie-breaker.
- Selectors: In their current form, known-client objects technically support selectors (e.g., site scopes), but we do not support them in the UI. The original plan was to facilitate incremental rollout of potentially risky rules. We should decide whether that actually makes sense in practice, and if so, make it so.
- Superset: The logging-mode tags (yellow) on known-client objects should become superset links for the respective x-requestctl action.
Phase 2: Rate limiting
In this phase, we introduce the ability to side-step "default" requestctl rules for requests originating from identified known clients, enrolling them in a different set of defaults and supporting per-client (and cache cluster) overrides.
There are a couple of details to sort out here while we're working on the first phase, including selection of the known-client default rate limits and deciding which layers of request processing in the CDN support per-client rate limit overrides (i.e., all of haproxy, varnish hits, and varnish misses, or only a subset).
Regardless, it is likely at this stage that per-client rate limit overrides will be configured via "normal" action and haproxy_action objects with the scope:identified selector set, rather than being tightly integrated with the known-client identity object itself.
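For illustration only, such an override might look something like the following. This is a purely hypothetical shape - not the actual HIDDENPARMA schema - intended just to show where the scope:identified selector and the per-client limit would live:

```yaml
# Hypothetical per-client override object (illustrative field names only).
haproxy_action:
  name: examplebot-ratelimit
  selector: scope:identified          # applies only to identified known clients
  expression: provenance = examplebot # match on the identity assigned in phase 1
  rate_limit: 1000/10s                # per-client override of the default
```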
Update (2025-11-25) - The proposal from 2025-10-22 below is now implemented, and in the process of being enabled in cache-text as part of T406545: FY 25/26 WE 5.4.5: Enforce global rate-limits.
Update (2025-10-22) - After additional discussion, things have evolved a bit:
- Per-client rate limit overrides must support limits that are less restrictive than the default from day 1. In short, that's impossible to support with "normal" scope:identified action objects as they exist today - i.e., admission by a given matching rule has no way to prevent rejection by a subsequent one (e.g., the default limit).
- Rather than trying to add that kind of "accept on admit" option to actions in general (due to complexity and usability concerns), the simplest / fastest solution points back toward managing the limits directly in the known_client object, but in a way that avoids introducing a new VCL rendering path for known_clients in HIDDENPARMA (since that too carries some complexity, which we may in fact throw away if actions become more powerful later on).
- The absolute simplest option (h/t to @Joe) would be to side-step VCL rendering in HIDDENPARMA, and instead produce a simple VCL if / elseif chain directly from (enabled) known_client objects in a confd template. The closest precedent for something like this is ipblock-to-HAProxy-map rendering, which is similarly out-of-band from DSL rendering.
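The rendered output of such a confd template might look roughly like this. Client names, limits, and the sub name are illustrative assumptions; vsthrottle is the standard Varnish throttling vmod:

```vcl
import vsthrottle;

# Hypothetical confd-rendered chain, one branch per enabled known_client,
# keyed on the X-Provenance header set by HAProxy in phase 1.
# Assumed to be called from vcl_recv.
sub known_client_ratelimit {
    if (req.http.X-Provenance == "examplebot") {
        if (vsthrottle.is_denied("examplebot", 1000, 10s)) {
            return (synth(429, "Too Many Requests"));
        }
        return;  # admitted here: skip the default rate-limit rules
    } elseif (req.http.X-Provenance == "otherbot") {
        if (vsthrottle.is_denied("otherbot", 200, 10s)) {
            return (synth(429, "Too Many Requests"));
        }
        return;
    }
    # No match: fall through to the default limits.
}
```

The early `return;` on admission is what gives known clients "accept on admit" semantics without changes to general action objects.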
- Like ipblocks, renaming carries some pitfalls - i.e., during the delay between a known_client rename and the subsequent commit (while HAProxy identification rules and the match expressions on Varnish-side rate limits are out of sync), requests from that client could transiently fall through to the default limit or have no limit at all, depending on how the VCL is structured.
- On balance, that's strictly more graceful than the ipblock rename case, where a delayed commit can result in requests incorrectly classified as impersonation.