Page MenuHomePhabricator

HAProxy log format doesn't support "invalid" request path
Closed, DeclinedPublic

Description

Checking messages sent to the DLQ by Benthos in ulsfo, we discovered some parsing error due to some unexpected missing field in the HAProxy log, the request path.

Errors emitted by Benthos looked like

{
  "error": "processors failed: no pattern matches found",
  "hostname": "localhost",
  "message": "260764 <REDACTED>:30365 8281383 [15/May/2024:13:56:54.703] tls~ tls/backend_server 0/0/0/0/0 200 20099 - - ---- 7140/7131/55/55/0 0/0 {www.wikipedia.org|||||*/*|vers=TLSv1.3;keyx=UNKNOWN;auth=ECDSA;ciph=AES-256-GCM-SHA384;prot=h1;sess=new} {hit-front|text/html|https=1;client_port=30365;nocookies=1|cp4037 hit, cp4037 hit/546860|ATS/9.1.4} GET  ",
  "priority": 134,
  "severity": 6,
  "timestamp": "2024-05-15T13:56:54Z"
}

The missing parts after the GET verb are the ones that makes Benthos processor fails. This kind of errors are infrequent, if compared to the host request volume, and always targeted at https://www.wikipedia.org, without UA or other significant headers. They seems to be originated from public cloud networks (mainly AWS).

Analyzing the same request as seen by Varnish (with Varnishlog) we noticed that the ReqURL was set to https://www.wikipedia.org. While this is sanitized in Varnish (https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/varnish/templates/wikimedia-frontend.vcl.erb#272), for HAProxy it's clearly not loggable with our current log-format that uses %HPO (HTTP path only (without host nor query string)).

So, to recap, the problems here are:

  • Benthos fails to parse these requests because HAProxy doesn't log them
  • We do already have these requests logged by Varnish so it's mandatory to have them also with this new pipeline

Using another HAProxy log-format variable (like %HP) seems the most feasible way to avoid this but we should carefully check that this doesn't change the current DE ingestion pipeline format.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Change #1032410 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] cache:haproxy: %HP variable in log-format to log also invalid uri

https://gerrit.wikimedia.org/r/1032410

Change #1032410 merged by Fabfur:

[operations/puppet@production] cache:haproxy: use %HP in log-format to log absolute-form reqs

https://gerrit.wikimedia.org/r/1032410

Some other information about this:

  • In HAProxy replacing %HPO with %HP logs the whole uri (without the query string part), including host part, but only on HTTP/2 requests. HTTP/1.1 requests are still logged "the old way" (just the path). This obviously breaks compatibility with our assumptions and pipelines, introducing a new log format.
  • If we keep %HPO we had similar issues: requests without the path are not logged (the host part) by HAProxy and this breaks Benthos formatting that expects something in the path field (our starting issue)

My proposal would be keep the %HPO log format variable instead of %HP and configure Benthos to 1) accept logs without the path field 2) replace missing path with / (other paths are correctly logged by HAProxy and correctl processed by Benthos).

Also note this issue: https://github.com/haproxy/haproxy/issues/1388

Edit: another (viable?) option could be use this kind of configuration:

...
http-request set-var(txn.path) path
log-format "%rt %Tr %Tw %Tc %ST {%[capture.req.hdr(0)]} {%[capture.res.hdr(0)]} %ts newlog %pid %ci:%cp %ID [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %HM %[var(txn.path)] %HQ"
...

Has the ability to log "missing" paths with a -. Don't know if this is something we want instead of blindly replacing missing path with a /

@gmodena WDYT about this?

Change #1031818 had a related patch set uploaded (by Fabfur; author: Fabfur):

[operations/puppet@production] benthos:cache: better parsing for path and query string

https://gerrit.wikimedia.org/r/1031818

Change #1031818 merged by Fabfur:

[operations/puppet@production] benthos:cache: better parsing for path and query string

https://gerrit.wikimedia.org/r/1031818

Update: opened this issue upstream to have another opinion about this and, in case, fix this behavior