
Swift-recon -d overstates disk capacity and usage
Closed, Resolved · Public

Description

We use "servers_per_port" i.e. we run multiple swift daemons on the same host (cf T222366). Currently swift-recon -d asks each daemon for all the swift disks on its hosts. The effect of this is that disks get multi-counted, and so swift-recon -d says there is 50PB of storage in the ms cluster!

Reported upstream as https://bugs.launchpad.net/swift/+bug/1947852

I have patches for the stretch and bullseye versions of recon.py that resolve this (by keeping track of which hosts we have already had a response from, and only counting the first response from each host); submitting upstream is waiting on CLA approval.

Event Timeline

Stick the patch here just in case...

diff --git a/swift/cli/recon.py b/swift/cli/recon.py
index cd0952875..304a75a90 100644
--- a/swift/cli/recon.py
+++ b/swift/cli/recon.py
@@ -895,6 +895,7 @@ class SwiftRecon(object):
         percents = {}
         top_percents = [(None, 0)] * top
         low_percents = [(None, 100)] * lowest
+        hosts_checked = []
         recon = Scout("diskusage", self.verbose, self.suppress_errors,
                       self.timeout)
         print("[%s] Checking disk usage now" % self._ptime())
@@ -902,6 +903,10 @@ class SwiftRecon(object):
                 recon.scout, hosts):
             if status == 200:
                 hostusage = []
+                host = urlparse(url).netloc.split(':')[0]
+                if host in hosts_checked:
+                    continue
+                hosts_checked.append(host)
                 for entry in response:
                     if not isinstance(entry['mounted'], bool):
                         print("-> %s/%s: Error: %s" % (url, entry['device'],

A revised version is now merged upstream. Probably best to just wait until this gets into Debian, but it is a client-side patch, so we could deploy a patched client at reasonably low risk.

[NB - if we do deploy a patched client, take the patch that was accepted upstream, not the one in this task]

Resolved by deploying swift 2.26.0-10+deb11u1+wmf1 fleet-wide.