
Probe for ipv6 host reachability
Closed, Resolved; Public

Description

Yesterday during row D maintenance we ran into an issue similar to T133387, where hosts lacked IPv6 connectivity while IPv4 was working correctly. Icinga didn't alert on the missing IPv6 connectivity, but IMO it should have, at least for hosts that have both A and AAAA records.

Event Timeline

Dzahn moved this task from Inbox to Up next on the observability board.
Dzahn removed Dzahn as the assignee of this task. Jan 8 2019, 11:57 PM
Dzahn subscribed.

I noticed I had this comment that I started typing but never saved:

I'm currently not going to work on this, and we should probably wait a bit and closely watch performance metrics before we add so many extra checks.

Meanwhile, there have also been efforts to add many more AAAA records to DNS and to add the mapped v6 address.

ayounsi raised the priority of this task from Medium to High. Nov 19 2021, 8:48 AM
ayounsi subscribed.

Raising the priority to bring attention to this task, feel free to re-triage accordingly.

Yesterday's short outage could probably have been avoided if we had IPv6 checks on hosts.

Note for later, reworked for an Alertmanager/Prometheus world: we should extend netops::prometheus::hosts to also probe over IPv6, so that the smoke probes test v6 connectivity as well.
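
To illustrate the idea of probing each host once per address family it actually has, here is a minimal Python sketch. It is not the Puppet code in netops::prometheus::hosts; the function name `build_probe_targets`, the target dictionary layout, and the module names `icmp_ip4` / `icmp_ip6` are all assumptions made for this example.

```python
import ipaddress

# Hypothetical probe module names, one per IP version (assumption for this sketch).
MODULE_BY_VERSION = {4: "icmp_ip4", 6: "icmp_ip6"}


def build_probe_targets(hosts):
    """Return one probe target per (host, address family) pair.

    `hosts` maps a hostname to the list of IP addresses known for it,
    for example the contents of its A and AAAA records.
    """
    targets = []
    for hostname, addresses in sorted(hosts.items()):
        seen_versions = set()
        for raw in addresses:
            version = ipaddress.ip_address(raw).version
            if version in seen_versions:
                # One probe per address family is enough for a smoke test.
                continue
            seen_versions.add(version)
            targets.append(
                {
                    "instance": hostname,
                    "address": raw,
                    "module": MODULE_BY_VERSION[version],
                }
            )
    return targets
```

A host with only an IPv4 address yields a single target, while a dual-stack host yields two, so a v6-only failure would show up as its own failing probe rather than being hidden by working v4.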

fgiunchedi renamed this task from Icinga check for ipv6 host reachability to Probe for ipv6 host reachability. Dec 7 2023, 3:55 PM

Change 981358 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] netops: prometheus::hosts: also probe ipv6 if available

https://gerrit.wikimedia.org/r/981358

Change 981358 merged by Majavah:

[operations/puppet@production] netops: prometheus::hosts: also probe ipv6 if available

https://gerrit.wikimedia.org/r/981358

fgiunchedi claimed this task.
fgiunchedi added a subscriber: taavi.

Thank you @taavi! The check is working as expected now, and it uncovered T353254: prometheus5002 unable to ping ipv6 ganeti500[74] eqsin!

I'm resolving, though feel free to reopen.