I've tried to deploy https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/585721 and then executed `echo 'https://en.wikipedia.org/static/images/project-logos/cswiki.png' | mwscript purgeList.php`, as instructed in https://wikitech.wikimedia.org/wiki/SWAT_deploys/Deployers#Purging. However, the frontend varnish cache doesn't seem to be invalidated: trying to curl the logo directly from an application server works correctly. It seems HTCP purge doesn't work anymore?

So, the backend purging queues in esams are way behind. On the one node I'm staring at the most, there are currently about 87 million backlogged purge requests, which is probably somewhere in the ballpark of 10 hours of lag time. The backlog is in the local daemon on the esams hosts themselves (so this isn't a network issue with delivering the purges to the hosts over the WAN); the likely culprit is the ATS daemon consuming them slowly.

This is text@eqiad GETs vs PURGEs over the past week. You can see GETs have the usual organic pattern, and PURGEs are fairly spiky, as we normally see. Whereas with text@esams, we see a curious pattern to the PURGE traffic: it seems to be load-limited and somewhat recovering when organic traffic is low overnight.

<edited here>: Those purge rates are from the frontend varnish, but the frontend varnish PURGE queue doesn't receive entries until they've traversed the backend one, which is backlogged, so that's why we still see the effect there.
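The ~10-hour figure can be sanity-checked with back-of-envelope arithmetic; a minimal sketch, assuming the backend drains purges at a roughly constant rate (the ~2,400 req/s drain rate is inferred from the two numbers in the comment, not a measured value):

```python
def estimated_lag_hours(backlog: int, drain_rate_per_s: float) -> float:
    """Hours of purge lag implied by a backlog drained at a fixed rate."""
    return backlog / drain_rate_per_s / 3600

# ~87M backlogged purges consumed at an assumed ~2,400 purges/s
print(round(estimated_lag_hours(87_000_000, 2_400), 1))  # → 10.1
```

The real drain rate varies with organic load (hence the overnight recovery mentioned above), so this only gives an order-of-magnitude estimate.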
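One way to tell "frontend still serving the stale copy" apart from "origin not updated" is to compare the caching headers from a curl against the edge with those from an app server. A sketch of that interpretation step, assuming headers like the `x-cache` and `age` fields the edge normally emits (treating a cache hit with nonzero Age after the PURGE as stale is this sketch's own heuristic):

```python
def looks_stale(headers: dict) -> bool:
    """True if the response appears to come from a cached copy that
    survived the PURGE: some cache layer reports a hit and the object
    has a nonzero Age."""
    cache = headers.get("x-cache", "").lower()
    age = int(headers.get("age", "0"))
    return "hit" in cache and age > 0

# Headers as captured with `curl -sI` after issuing the purge:
print(looks_stale({"x-cache": "cp3050 hit, cp3062 hit/7", "age": "3600"}))  # → True
print(looks_stale({"x-cache": "cp3050 miss, cp3062 pass", "age": "0"}))     # → False
```

A direct request to an application server (bypassing the edge) returning the updated content, while the edge still looks stale by this check, points at the purge pipeline rather than at the deploy.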