
swift hosts (thanos-fe1001, ms-be2012) with failed prometheus-ipmi-exporter services
Open, Medium, Public

Description

Found a couple of swift hosts (thanos-fe1001, ms-be2012) with failed prometheus-ipmi-exporter services today. Looks like a race with swift-proxy, which eventually binds 9290 as the source port of an outgoing connection, so the exporter can no longer listen on that port.

Jun 23 15:59:08 ms-fe2012 prometheus-ipmi-exporter[971859]: time="2022-06-23T15:59:08Z" level=fatal msg="listen tcp :9290: bind: address already in use" source="main.go:150"
ms-fe2012:~# lsof -i | grep 9290
swift-pro    866                      swift   66u  IPv4 1836545862      0t0  TCP ms-fe2012.codfw.wmnet:9290->ms-fe2009.codfw.wmnet:11211 (ESTABLISHED)
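
The same check can be done without lsof. Below is a minimal sketch, assuming a Linux host, that scans text in the /proc/net/tcp format for IPv4 sockets whose local port matches the exporter's listen port. The function name and the sample rows are illustrative and not taken from the affected hosts.

```python
def sockets_using_local_port(proc_net_tcp_text, port):
    """Return (remote_port, state_hex) for each IPv4 TCP socket bound to `port` locally."""
    matches = []
    # The first line of /proc/net/tcp is a column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        # Addresses are written as HEXIP:HEXPORT; only the port part is needed here.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if local_port == port:
            matches.append((remote_port, fields[3]))
    return matches


# Illustrative sample only: local port 0x244A is 9290, remote port 0x2BCB is 11211.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:244A 0200007F:2BCB 01 00000000:00000000 00:00000000 00000000     0        0 1 1
   1: 0100007F:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 2 1
"""

if __name__ == "__main__":
    # State 01 is ESTABLISHED and 0A is LISTEN in the kernel's numbering.
    print(sockets_using_local_port(SAMPLE, 9290))
```

On a real host the text would come from reading /proc/net/tcp. A match in state 01 rather than 0A means the port is held by an outgoing connection, not a listener, which is the situation in the lsof output above.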

Event Timeline

herron triaged this task as Medium priority. Thu, Jun 23, 5:18 PM
herron created this task.

Looks like we customize the ephemeral port range on the swift hosts to 1024-65535. Maybe we can raise the lower bound of the range swift-proxy chooses source ports from to help prevent this?
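
To make the overlap concrete, here is a small sketch that checks whether a listen port falls inside an ephemeral port range. It assumes the range is given in the two-number format that Linux exposes in /proc/sys/net/ipv4/ip_local_port_range. The helper names are hypothetical, and the second range (10240-65535) is the raised range proposed in the patch for this task.

```python
def parse_port_range(text):
    """Parse 'LOW<whitespace>HIGH' as found in ip_local_port_range."""
    low, high = (int(value) for value in text.split())
    return low, high


def in_ephemeral_range(port, port_range):
    """True if the kernel could hand out `port` as a source port for this range."""
    low, high = port_range
    return low <= port <= high


EXPORTER_PORT = 9290  # prometheus-ipmi-exporter listen port from the log above

current_range = parse_port_range("1024\t65535\n")    # range currently set on the swift hosts
proposed_range = parse_port_range("10240\t65535\n")  # raised lower bound

if __name__ == "__main__":
    print(in_ephemeral_range(EXPORTER_PORT, current_range))   # True: collision possible
    print(in_ephemeral_range(EXPORTER_PORT, proposed_range))  # False: 9290 is excluded
```

With the lower bound at 10240, port 9290 is no longer a candidate for automatically assigned source ports. A process that explicitly binds 9290 could still take it.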

Change 808040 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] swift: update ephemeral port range from 1024-65535 to 10240-65535

https://gerrit.wikimedia.org/r/808040