
hw troubleshooting: Move dbproxy1019 from C5 to B6
Closed, Resolved · Public · Request

Description

  • - Provide FQDN of system.

dbproxy1019.eqiad.wmnet

  • - If other than a hard drive issue, please depool the machine (and confirm that it’s been depooled) for us to work on it. If not, please provide a time frame for us to take the machine down.

Please coordinate with WMCS to migrate this machine. It'll need to be depooled before the operation, which can occur when the work is scheduled.

  • - Put system into a failed state in Netbox.
  • - Provide urgency of request, along with justification (redundancy, dependencies, etc)

In light of outage, T313382, dbproxy1018 and dbproxy1019 shouldn't be in the same rack.

dbproxy1018 and dbproxy1019 are the 2 hardware proxies for wikireplicas. The HA setup currently has a single physical point of failure.

  • - Assign correct project tag and appropriate owner (based on above). Also, please ensure the service owners of the host(s) are added as subscribers to provide any additional input.

Please move this machine into a new rack/row. I've suggested B6, but any non-C rack should be sufficient. WMCS: see https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Runbooks/Depool_wikireplicas#Hardware_proxies for taking it offline before the operation.

Event Timeline

@nskaggs I would like to schedule this to be completed on Monday around 1600UTC. Does that work for you?

@nskaggs is out for several weeks, so this should wait until late August unless someone else appears who wants to coordinate on this.

See also T304478: Move wikireplicas dbproxy haproxy config to etcd

As noted, I won't be available to coordinate, but someone else is welcome to do the depooling step in my absence (I don't have any special knowledge here and lack access to these machines). As I understand it, after switching in Puppet, you'll want to wait for all connections to the old machine to cease before taking it offline. So doing the switch several hours in advance is a good idea for whoever makes the change. Thanks!
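
The "switch, then wait for connections to cease" step can be sketched as a small drain-wait loop. This is a sketch only, not the actual procedure used: on the real host the count would come from `netstat`, stubbed here with a fake counter so the sketch is self-contained.

```shell
#!/bin/sh
# Drain-wait sketch for the depool step described above.
# On the real proxy the count would be something like:
#   netstat -tp 2>/dev/null | grep haproxy | grep -c ESTABLISHED
# FAKE_CONNS stands in for that so this runs anywhere.

count_established() {
  echo "$FAKE_CONNS"
}

wait_for_drain() {
  tries=0
  while [ "$(count_established)" -gt 0 ] && [ "$tries" -lt 60 ]; do
    tries=$((tries + 1))
    # sleep 60                      # on the real host, poll once a minute
    FAKE_CONNS=$((FAKE_CONNS - 1))  # simulate connections closing over time
  done
  echo "drained after $tries checks"
}

FAKE_CONNS=3
wait_for_drain  # prints: drained after 3 checks
```

Once the loop reports zero established connections, it should be safe to power the proxy down for the move.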

Yes, feel free to coordinate with @fnegri for the depooling portion. Thanks!

@nskaggs can this be led by your team, since these proxies belong to your service? :-)

@Marostegui yes. Sorry, my comment about coordination was directed towards @Cmjohnson. We need to pick a convenient time for DC Ops and WMCS.

Jclark-ctr added a subscriber: Cmjohnson.

I am taking over this ticket. @nskaggs, what day of the week works best for you to do this move?

@fnegri ^^

@Jclark-ctr Monday 31st would be fine, or any other day except Tuesday.

If I understand correctly, we need to depool dbproxy1019 and wait for all connections to stop so the server can be moved. I can depool it in the morning (EU time) so @Jclark-ctr can move the server a few hours later (morning US time), and I can repool it afterwards, before the end of my working day.

This page says that "connections should not be longer than an hour or so", so hopefully depooling will only take about an hour from the moment I change the Puppet config.

@Jclark-ctr Thursday works for me! I will aim to get the server depooled by 11:00 UTC on Thursday, and will post an update in this task when that's done and the server is ready to be moved.

Note that {T316195} needs to happen. Perhaps both can be accomplished at the same time?

@nskaggs yes, we'll get dbproxy1019 for free, and after that I can depool dbproxy1018 and reboot that one as well.

@fnegri I have not gotten confirmation that dbproxy1019 is depooled yet. Any update?

@Jclark-ctr sorry for the late update, I depooled it about 2 hours ago (11:30 UTC), by editing Hiera in Horizon and pointing everything to use dbproxy1018 instead of 1019:

Screenshot 2022-11-03 at 12.28.13.png (988×1 px, 129 KB)

I'm still seeing active connections. I'm trying to understand why.

# netstat -tp |grep haproxy |wc -l
62

root@dbproxy1019:~# netstat -tp |grep haproxy
tcp        0      0 dbproxy1019.eqiad:57170 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3315 vl1119-ens1f0np0.:56972 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3313 instance-clouddb-:23652 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:53856 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3315 instance-clouddb-w:5266 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3313 instance-clouddb-:10890 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:54274 clouddb1016.eqiad.:3318 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:36196 clouddb1016.eqiad.:3315 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:55970 clouddb1015.eqiad.:3314 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:55602 clouddb1014.eqiad.:3317 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:57180 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3313 instance-clouddb-:15178 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3316 vl1119-ens1f0np0.:39364 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:38934 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:49584 clouddb1014.eqiad.:3317 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3311 vl1119-ens1f1np1.:47728 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:52942 clouddb1016.eqiad.:3315 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:60740 clouddb1013.eqiad.:3311 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:52082 clouddb1015.eqiad.:3316 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3313 vl1119-ens1f1np1.:39916 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3315 instance-clouddb-:11453 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:40220 clouddb1016.eqiad.:3315 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3313 vl1119-ens1f0np0.:49736 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:40222 clouddb1016.eqiad.:3315 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:47488 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3318 vl1119-ens1f1np1.:33380 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3312 vl1119-ens1f1np1.:33232 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:44500 clouddb1016.eqiad.:3315 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:36336 clouddb1014.eqiad.:3312 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3313 instance-clouddb-:54866 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:59802 clouddb1015.eqiad.:3316 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3318 instance-clouddb-:50563 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:51522 clouddb1016.eqiad.:3318 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:49952 clouddb1014.eqiad.:3317 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:55804 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:52150 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3317 vl1119-ens1f1np1.:38668 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:56244 clouddb1016.eqiad.:3318 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3311 instance-clouddb-:48975 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:36478 clouddb1014.eqiad.:3312 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3315 vl1119-ens1f1np1.:46666 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3317 instance-clouddb-:61569 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3314 vl1119-ens1f0np0.:49516 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3317 vl1119-ens1f0np0.:32820 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:56274 clouddb1016.eqiad.:3315 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3314 vl1119-ens1f1np1.:58924 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:49772 clouddb1016.eqiad.:3318 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:60724 clouddb1013.eqiad.:3311 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3313 instance-clouddb-:39594 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3311 vl1119-ens1f0np0.:36544 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3316 vl1119-ens1f1np1.:50164 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3313 instance-clouddb-:35023 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:47970 clouddb1013.eqiad.:3311 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:46132 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3318 instance-clouddb-:10017 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3318 vl1119-ens1f0np0.:54268 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3315 instance-clouddb-:40211 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad.:3312 vl1119-ens1f0np0.:34004 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:46600 clouddb1015.eqiad.:3314 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3315 instance-clouddb-:10486 ESTABLISHED 10105/haproxy
tcp        0      0 wikireplicas-b.wik:3313 instance-clouddb-:35755 ESTABLISHED 10105/haproxy
tcp        0      0 dbproxy1019.eqiad:34036 clouddb1013.eqiad.:3313 ESTABLISHED 10105/haproxy

It looks like they're all old connections that are probably stuck or unused, and will never terminate. There's no open connection that was created today, and only one from yesterday. I think it's safe to shut down the instance and move it.

# lsof -i tcp |grep haproxy| awk '{print $2,$4}' | tr -d 'u' | sort -u | while read pid fd; do stat --printf "%z %N\n" /proc/$pid/fd/$fd ; done |sort

[...]

2022-10-21 02:43:04.537453772 +0000 '/proc/10105/fd/115' -> 'socket:[1144913610]'
2022-10-21 17:56:05.572767152 +0000 '/proc/10105/fd/120' -> 'socket:[1102862042]'
2022-11-02 13:29:05.582038387 +0000 '/proc/10105/fd/143' -> 'socket:[1145171211]'
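
For reference, the one-liner above works because the change time of each `/proc/<pid>/fd/<fd>` entry approximates when the process opened that socket, so old timestamps indicate long-idle connections. A commented, self-contained version (using the current shell's own fds as a stand-in, since haproxy isn't running here):

```shell
# List a process's fds with their status-change times, oldest first.
# %z = status-change time; %N = quoted name plus symlink target
# (e.g. 'socket:[1144913610]' for sockets, as in the output above).
pid=$$
for fd in /proc/"$pid"/fd/*; do
  [ -e "$fd" ] && stat --printf '%z %N\n' "$fd"
done | sort
```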

Mentioned in SAL (#wikimedia-operations) [2022-11-03T13:52:06Z] <fnegri@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on dbproxy1019.eqiad.wmnet with reason: T313445

Mentioned in SAL (#wikimedia-operations) [2022-11-03T13:52:19Z] <fnegri@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on dbproxy1019.eqiad.wmnet with reason: T313445

Mentioned in SAL (#wikimedia-cloud) [2022-11-03T13:58:13Z] <dhinus> shutting down dbproxy1019 T313445

Mentioned in SAL (#wikimedia-cloud) [2022-11-03T14:39:32Z] <dhinus> depooling dbproxy1019 'confctl select "service=wikireplicas-b,name=dbproxy1019" set/pooled=no' T313445

The first attempt at depooling didn't work because I didn't specify a full hostname. This one worked:

$ sudo confctl select "service=wikireplicas-b,name=dbproxy1019.eqiad.wmnet" set/pooled=no
eqiad/wikireplicas-b/wikireplicas-b/dbproxy1019.eqiad.wmnet: pooled changed yes => no
WARNING:conftool.announce:conftool action : set/pooled=no; selector: service=wikireplicas-b,name=dbproxy1019.eqiad.wmnet

When the server was in row C it was in VLAN 1119 (cloud-support1-c-eqiad). In row B, where the server is now, we have no cloud-support1-b-eqiad, so which VLAN are we putting this server in? Sorry if this was already mentioned in the task, I didn't read the whole task.

@Papaul it wasn't mentioned in the task, checking.

I discussed this with @aborrero and we think "VLAN private1-b-eqiad (1018)" will work.

The new IP address for the host is:

```
10.64.16.14/22
```

The switch is also set up:

[edit interfaces ge-6/0/28]
-   description DISABLED;
+   description "dbproxy1019 {#3284}";
-   disable;
[edit interfaces ge-6/0/28]
+    unit 0 {
+        family ethernet-switching {
+            interface-mode access;
+            vlan {
+                members private1-b-eqiad;
+            }
+        }
+    }
papaul@asw2-b-eqiad> show interfaces ge-6/0/28 descriptions
Interface       Admin Link Description
ge-6/0/28       up    down dbproxy1019 {#3284}

Cookbook cookbooks.sre.hosts.reimage was started by fnegri@cumin1001 for host dbproxy1019.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by fnegri@cumin1001 for host dbproxy1019.eqiad.wmnet with OS bullseye completed:

  • dbproxy1019 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"dbproxy1019.eqiad.wmnet": {"weight": 0, "pooled": "inactive"}, "tags": "dc=eqiad,cluster=wikireplicas-a,service=wikireplicas-a"}
{"dbproxy1019.eqiad.wmnet": {"weight": 0, "pooled": "inactive"}, "tags": "dc=eqiad,cluster=wikireplicas-b,service=wikireplicas-b"}

  • Unable to disable Puppet, the host may have been unreachable
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh bullseye OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga/Alertmanager
  • Removed previous downtime on Alertmanager (old OS)
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202211031602_fnegri_2562506_dbproxy1019.out
  • Checked BIOS boot parameters are back to normal
  • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • No changes in confctl are needed to restore the previous state.
  • Updated Netbox data from PuppetDB

I repooled the server using conftool:

sudo confctl select "name=dbproxy1019.eqiad.wmnet,service=wikireplicas-b" set/pooled=yes

fnegri@cumin1001:~$ confctl select "name=dbproxy1018.eqiad.wmnet" get
{"dbproxy1018.eqiad.wmnet": {"weight": 0, "pooled": "inactive"}, "tags": "dc=eqiad,cluster=wikireplicas-b,service=wikireplicas-b"}
{"dbproxy1018.eqiad.wmnet": {"weight": 0, "pooled": "yes"}, "tags": "dc=eqiad,cluster=wikireplicas-a,service=wikireplicas-a"}

fnegri@cumin1001:~$ confctl select "name=dbproxy1019.eqiad.wmnet" get
{"dbproxy1019.eqiad.wmnet": {"weight": 0, "pooled": "inactive"}, "tags": "dc=eqiad,cluster=wikireplicas-a,service=wikireplicas-a"}
{"dbproxy1019.eqiad.wmnet": {"weight": 0, "pooled": "yes"}, "tags": "dc=eqiad,cluster=wikireplicas-b,service=wikireplicas-b"}

I undid the config change in Cloud-Puppet, then restarted HAProxy on dbproxy1019, and it's not happy:

Nov  3 16:27:24 dbproxy1019 haproxy[37971]: [WARNING] 306/162724 (37971) : Health check for server mariadb-s8/clouddb1016.eqiad.wmnet failed, reason: Layer4 timeout, check duration: 3000ms, status: 0/99999999 DOWN.
Nov  3 16:27:24 dbproxy1019 haproxy[37971]: [WARNING] 306/162724 (37971) : Server mariadb-s8/clouddb1016.eqiad.wmnet is DOWN. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.
Nov  3 16:27:24 dbproxy1019 haproxy[37971]: [ALERT] 306/162724 (37971) : proxy 'mariadb-s8' has no server available!
Nov  3 16:27:24 dbproxy1019 haproxy[37971]: Server mariadb-s8/clouddb1016.eqiad.wmnet is DOWN. 0 active and 0 backup servers left. 0 sessions active, 0 requeued, 0 remaining in queue.

I can connect to backend servers using nc though:

# nc clouddb1016.eqiad.wmnet 3318
Y
5.5.5-10.4.22-MariaDB

This is caused by the host changing its IP. We need to update the grants on the clouddb* hosts for the new IP. I just applied the fix to clouddb1016:3318 and it worked:

Nov 03 18:10:34 dbproxy1019 haproxy[21020]: [WARNING] 306/181034 (21020) : Health check for server mariadb-s8/clouddb1016.eqiad.wmnet succeeded, reason: Layer7 check passed, code: 0, info: "5.5.5-10.4.22-MariaDB", check duration: 0ms, st>
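
The grants fix would look roughly like the following on each clouddb* backend. This is a sketch: the actual user names and grants aren't shown in this task, `haproxy_check` is a hypothetical user, and the script only prints the SQL it would run rather than applying it.

```shell
# Dry run: print the SQL needed to move MariaDB host-based grants from the
# proxy's old row-C address to its new row-B address (both IPs from this task).
OLD_IP="10.64.37.28"
NEW_IP="10.64.16.14"
CHECK_USER="haproxy_check"   # hypothetical; the real user isn't named in the task

cat <<SQL
RENAME USER '${CHECK_USER}'@'${OLD_IP}' TO '${CHECK_USER}'@'${NEW_IP}';
SQL
# Once verified, this could be piped into mysql on each clouddb* host.
```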

Can I have the old IP too so I can clean those up?

10.64.37.28/24
2620:0:861:119:10:64:37:28/64

I have fixed the rest and they are all now up.
@fnegri can I still have the old IP so I can clean up the leftovers?

root@dbproxy1019:~# echo "show stat" | socat /run/haproxy/haproxy.sock stdio
# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,agent_status,agent_code,agent_duration,check_desc,agent_desc,check_rise,check_fall,check_health,agent_rise,agent_fall,agent_health,addr,cookie,mode,algo,conn_rate,conn_rate_max,conn_tot,intercepted,dcon,dses,wrew,connect,reuse,cache_lookups,cache_hits,srv_icur,src_ilim,qtime_max,ctime_max,rtime_max,ttime_max,eint,idle_conn_cur,safe_conn_cur,used_conn_cur,need_conn_est,
mariadb-s1,FRONTEND,,,4,14,5000,271,261418,265327,0,0,0,,,,,OPEN,,,,,,,,,1,2,0,,,,0,6,0,18,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,6,18,271,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s1,clouddb1013.eqiad.wmnet,0,0,4,14,,271,261418,265327,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,2,1,,271,,2,6,,18,L7OK,0,0,,,,,,,,,,,0,18,,,,,0,,,0,1,0,318,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,271,0,,,0,,0,17,0,6332,0,0,0,4,14,
mariadb-s1,clouddb1017.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,2,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s1,BACKEND,0,0,4,14,500,271,261418,265327,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,2,0,,271,,1,6,,18,,,,,,,,,,,,,,0,18,0,0,0,0,0,,,0,1,0,318,,,,,,,,,,,,,,tcp,,,,,,,,0,271,0,,,,,0,17,0,6332,0,,,,,
mariadb-s2,FRONTEND,,,4,6,5000,63,3701416,11727839,0,0,0,,,,,OPEN,,,,,,,,,1,3,0,,,,0,5,0,6,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,5,6,63,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s2,clouddb1014.eqiad.wmnet,0,0,4,6,,63,3701416,11727839,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,3,1,,63,,2,5,,6,L7OK,0,1,,,,,,,,,,,0,18,,,,,0,,,0,1,0,1332,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,63,0,,,0,,0,15,0,12915,0,0,0,4,6,
mariadb-s2,clouddb1018.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,3,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s2,BACKEND,0,0,4,6,500,63,3701416,11727839,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,3,0,,63,,1,5,,6,,,,,,,,,,,,,,0,18,0,0,0,0,0,,,0,1,0,1332,,,,,,,,,,,,,,tcp,,,,,,,,0,63,0,,,,,0,15,0,12915,0,,,,,
mariadb-s3,FRONTEND,,,4,7,5000,79,173678,1400955,0,0,0,,,,,OPEN,,,,,,,,,1,4,0,,,,0,1,0,7,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,1,7,79,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s3,clouddb1013.eqiad.wmnet,0,0,4,6,,79,173678,1400955,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,4,1,,79,,2,1,,7,L7OK,0,0,,,,,,,,,,,0,18,,,,,1,,,0,1,0,1343,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,79,0,,,0,,0,1,0,9415,0,0,0,4,6,
mariadb-s3,clouddb1017.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,4,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s3,BACKEND,0,0,4,6,500,79,173678,1400955,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,4,0,,79,,1,1,,7,,,,,,,,,,,,,,0,18,0,0,0,0,1,,,0,1,0,1343,,,,,,,,,,,,,,tcp,,,,,,,,0,79,0,,,,,0,1,0,9415,0,,,,,
mariadb-s4,FRONTEND,,,3,4,5000,39,6712,70302,0,0,0,,,,,OPEN,,,,,,,,,1,5,0,,,,0,1,0,4,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,1,4,39,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s4,clouddb1015.eqiad.wmnet,0,0,3,4,,39,6712,70302,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,5,1,,39,,2,1,,4,L7OK,0,0,,,,,,,,,,,0,18,,,,,1,,,0,0,0,1411,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,39,0,,,0,,0,0,0,3003,0,0,0,3,4,
mariadb-s4,clouddb1019.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,5,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s4,BACKEND,0,0,3,4,500,39,6712,70302,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,5,0,,39,,1,1,,4,,,,,,,,,,,,,,0,18,0,0,0,0,1,,,0,0,0,1411,,,,,,,,,,,,,,tcp,,,,,,,,0,39,0,,,,,0,0,0,3003,0,,,,,
mariadb-s5,FRONTEND,,,2,6,5000,39,10396,14818,0,0,0,,,,,OPEN,,,,,,,,,1,6,0,,,,0,1,0,5,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,1,5,39,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s5,clouddb1016.eqiad.wmnet,0,0,2,4,,39,10396,14818,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,6,1,,39,,2,1,,5,L7OK,0,0,,,,,,,,,,,0,18,,,,,1,,,0,0,0,1468,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,39,0,,,0,,0,0,0,3004,0,0,0,2,4,
mariadb-s5,clouddb1020.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,6,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s5,BACKEND,0,0,2,4,500,39,10396,14818,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,6,0,,39,,1,1,,5,,,,,,,,,,,,,,0,18,0,0,0,0,1,,,0,0,0,1468,,,,,,,,,,,,,,tcp,,,,,,,,0,39,0,,,,,0,0,0,3004,0,,,,,
mariadb-s6,FRONTEND,,,2,6,5000,61,60387,2891290,0,0,0,,,,,OPEN,,,,,,,,,1,7,0,,,,0,1,0,27,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,1,27,61,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s6,clouddb1015.eqiad.wmnet,0,0,2,5,,61,60387,2891290,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,7,1,,61,,2,1,,27,L7OK,0,0,,,,,,,,,,,0,18,,,,,1,,,0,0,0,1424,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,61,0,,,0,,0,0,0,25573,0,0,0,2,5,
mariadb-s6,clouddb1019.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,7,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s6,BACKEND,0,0,2,5,500,61,60387,2891290,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,7,0,,61,,1,1,,27,,,,,,,,,,,,,,0,18,0,0,0,0,1,,,0,0,0,1424,,,,,,,,,,,,,,tcp,,,,,,,,0,61,0,,,,,0,0,0,25573,0,,,,,
mariadb-s7,FRONTEND,,,4,5,5000,53,37114,3741917,0,0,0,,,,,OPEN,,,,,,,,,1,8,0,,,,0,1,0,5,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,1,5,53,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s7,clouddb1014.eqiad.wmnet,0,0,4,5,,53,37114,3741917,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,8,1,,53,,2,1,,5,L7OK,0,1,,,,,,,,,,,0,18,,,,,0,,,0,1,0,1565,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,53,0,,,0,,0,1,0,15814,0,0,0,4,5,
mariadb-s7,clouddb1018.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,8,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s7,BACKEND,0,0,4,5,500,53,37114,3741917,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,8,0,,53,,1,1,,5,,,,,,,,,,,,,,0,18,0,0,0,0,0,,,0,1,0,1565,,,,,,,,,,,,,,tcp,,,,,,,,0,53,0,,,,,0,1,0,15814,0,,,,,
mariadb-s8,FRONTEND,,,3,5,5000,59,33877,545850,0,0,0,,,,,OPEN,,,,,,,,,1,9,0,,,,0,2,0,6,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,2,6,59,,0,0,0,,,,,,,,,,,0,,,,,
mariadb-s8,clouddb1016.eqiad.wmnet,0,0,3,5,,59,33877,545850,,0,,0,18,0,0,UP,1,1,0,0,0,31,0,,1,9,1,,59,,2,2,,6,L7OK,0,0,,,,,,,,,,,0,18,,,,,0,,,0,0,0,1330,,,,Layer7 check passed,,99999999,20,100000018,,,,,,tcp,,,,,,,,0,59,0,,,0,,0,0,0,20688,0,0,0,3,5,
mariadb-s8,clouddb1020.eqiad.wmnet,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,0,1,0,0,31,0,,1,9,2,,0,,2,0,,0,L7OK,0,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer7 check passed,,2,3,4,,,,,,tcp,,,,,,,,0,0,0,,,0,,0,0,0,0,0,0,0,0,0,
mariadb-s8,BACKEND,0,0,3,5,500,59,33877,545850,0,0,,0,18,0,0,UP,1,1,1,,0,31,0,,1,9,0,,59,,1,2,,6,,,,,,,,,,,,,,0,18,0,0,0,0,0,,,0,0,0,1330,,,,,,,,,,,,,,tcp,,,,,,,,0,59,0,,,,,0,0,0,20688,0,,,,,

@Marostegui see Papaul's comment above for the old IP :)

I have cleaned up the old IP and changed the report users script.

Thanks @Marostegui!

@Jclark-ctr I think this task can be resolved.