We've been seeing occasional flaps of the "LVS HTTPS IPv6 on mobile-lb.eqiad" alert, which has even led some opsens to ignore it completely. We should fix this ASAP, as a) it's highly probable that it's a real problem, and b) it conditions us to ignore pages.
I realized that despite this happening often, I have never been awake and/or present when it happened. This made me go look at my IRC logs, which show this:
--- Day changed Tue Sep 08 2015
04:06 < icinga-wm> PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
04:10 < icinga-wm> RECOVERY - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 301 TLS Redirect - 505 bytes in 0.010 second response time
--- Day changed Wed Sep 09 2015
--- Day changed Thu Sep 10 2015
--- Day changed Fri Sep 11 2015
05:07 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
05:09 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10770 bytes in 0.137 second response time
--- Day changed Sat Sep 12 2015
--- Day changed Sun Sep 13 2015
--- Day changed Mon Sep 14 2015
00:08 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
00:10 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10772 bytes in 0.334 second response time
--- Day changed Tue Sep 15 2015
--- Day changed Wed Sep 16 2015
04:38 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
04:40 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10693 bytes in 0.114 second response time
04:48 < icinga-wm> PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
04:55 < icinga-wm> RECOVERY - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 301 TLS Redirect - 505 bytes in 1.008 second response time
05:16 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
05:19 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10693 bytes in 0.103 second response time
--- Day changed Thu Sep 17 2015
05:28 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
05:30 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10693 bytes in 1.079 second response time
05:50 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
05:51 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10693 bytes in 0.079 second response time
--- Day changed Fri Sep 18 2015
04:45 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
04:46 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10514 bytes in 1.105 second response time
06:04 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
06:06 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10514 bytes in 0.096 second response time
--- Day changed Sat Sep 19 2015
04:02 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
04:04 < icinga-wm> RECOVERY - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 200 OK - 10512 bytes in 0.121 second response time
04:19 < icinga-wm> PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
04:20 < icinga-wm> RECOVERY - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 301 TLS Redirect - 503 bytes in 1.003 second response time
04:42 < icinga-wm> PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out
(hours are UTC+3)
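There appears to be some correlation with time of day: apart from one flap at 00:08 on Sep 14, every PROBLEM in the log falls between 04:00 and 06:10 UTC+3 (roughly 01:00 to 03:10 UTC). This could be related to traffic levels or to some periodic task (IPsec session renewal?).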
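To check the time-of-day correlation against a longer log, the PROBLEM lines could be bucketed by UTC hour. The following is only a rough sketch: it assumes the irssi line format shown in the log above and the UTC+3 offset noted there, and the names `PROBLEM_RE` and `problems_per_utc_hour` are made up for illustration.

```python
import re
from collections import Counter

# Matches irssi-style lines such as:
#   04:06 < icinga-wm> PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad...
# Only the hour is captured; RECOVERY and "Day changed" lines do not match.
PROBLEM_RE = re.compile(r"^(\d{2}):\d{2} < icinga-wm> PROBLEM - ")


def problems_per_utc_hour(lines, utc_offset_hours=3):
    """Count PROBLEM events per UTC hour.

    Assumes log timestamps are in UTC+utc_offset_hours (UTC+3 here).
    """
    counts = Counter()
    for line in lines:
        match = PROBLEM_RE.match(line.strip())
        if match:
            local_hour = int(match.group(1))
            counts[(local_hour - utc_offset_hours) % 24] += 1
    return counts


# Three lines copied from the log above, one per line.
sample = [
    "04:06 < icinga-wm> PROBLEM - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out",
    "04:10 < icinga-wm> RECOVERY - LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is OK: HTTP OK: HTTP/1.1 301 TLS Redirect - 505 bytes in 0.010 second response time",
    "05:07 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out",
    "00:08 < icinga-wm> PROBLEM - LVS HTTPS IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6 is CRITICAL: Connection timed out",
]

if __name__ == "__main__":
    for hour, count in sorted(problems_per_utc_hour(sample).items()):
        print(f"{hour:02d}:00 UTC  {count}")
```

Running this over a full log file (one entry per line) and comparing the histogram with traffic graphs or cron/IPsec rekey schedules might help confirm or rule out the periodic-task theory.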