
High Prometheus TCP retransmits
Closed, Resolved · Public · 0 Story Points

Description

Why does the eqiad Prometheus cluster have such a higher-than-normal percentage of TCP retransmits?
See https://grafana.wikimedia.org/d/000000366/network-performances-global?panelId=18&fullscreen&orgId=1&from=now-24h&to=now

It seems like Prometheus is trying to query endpoints such as analytics1049.eqiad.wmnet:51010 which resolves to both a v4 and v6 address.
But the services listening on those ports are only listening on IPv4, e.g.:
tcp 0 0 10.64.21.108:51010 0.0.0.0:* LISTEN 8303/java
Prometheus tries to establish a TCP session over IPv6 first, then retries a couple of times before giving up and successfully trying IPv4.
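This is easy to reproduce by hand, since curl follows the same address-selection order. A sketch (host and port from above; exact output elided):

prometheus1004:~$ curl -6 --connect-timeout 3 analytics1049.eqiad.wmnet:51010/metrics   # hangs and times out: the v6 SYNs go unanswered
prometheus1004:~$ curl -4 analytics1049.eqiad.wmnet:51010/metrics                       # succeeds: the exporter listens on IPv4 only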

The same seems to go for other services, such as:
kafka-jumbo1004:7800

Another curious one is analytics1047, which sets up a tcp6 socket but binds it to a v4 IP:
tcp6 0 0 10.64.21.106:8141 :::* LISTEN 1667/java
(the JVM opens AF_INET6 sockets unless preferIPv4Stack is set, so netstat reports the v4-mapped bind under tcp6)

Unrelated: the cloudvirt hosts (some of them?) have the prometheus rsyslog exporter listening on port 9105:
tcp6 0 0 :::9105 :::* LISTEN 33652/prometheus-rs
but it can't be queried from prometheus1004, e.g.:
prometheus1004:~$ curl -v cloudvirt1015.eqiad.wmnet:9105/metrics   # hangs
while the other exporter, listening on 9100, replies fine.
As Prometheus is configured to query that endpoint, it tries, retries, and fails.

Event Timeline

ayounsi triaged this task as Normal priority. · Jun 7 2019, 11:00 AM
ayounsi created this task.
Restricted Application added a subscriber: Aklapper. · Jun 7 2019, 11:00 AM
fdans moved this task from Incoming to Radar on the Analytics board. · Jun 10 2019, 3:45 PM
elukey added a comment. · Jul 1 2019, 6:55 AM

The main issue is that originally I set up the Yarn (8141) and HDFS (51010) daemons to bind to %{::ipaddress}:port in puppet, and the IPv6 addresses were only added to the hosts later. For example, from hiera:

yarn_nodemanager_opts: "-Xms4096m -javaagent:/usr/share/java/prometheus/jmx_prometheus_javaagent.jar=%{::ipaddress}:8141:/etc/prometheus/yarn_nodemanager_jmx_exporter.yaml"

The %{::ipaddress} variable is the IPv4 address, which explains part of the netstat result above. To fix the problem it is sufficient to replace %{::ipaddress} with [::].
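Applying that substitution, the hiera value becomes (same agent jar and exporter config as above, only the bind address changes):

yarn_nodemanager_opts: "-Xms4096m -javaagent:/usr/share/java/prometheus/jmx_prometheus_javaagent.jar=[::]:8141:/etc/prometheus/yarn_nodemanager_jmx_exporter.yaml"

On a dual-stack host, binding to [::] makes the exporter accept connections over both IPv4 and IPv6.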

Yarn (port 8141) is different from port 51010, since the latter has -Djava.net.preferIPv4Stack=true among its default JVM parameters (just found it now), which leads to the following failure with the above fix:

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:386)
        at sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:401)
Caused by: java.net.SocketException: Protocol family unavailable
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at sun.net.httpserver.ServerImpl.bind(ServerImpl.java:133)
        at sun.net.httpserver.HttpServerImpl.bind(HttpServerImpl.java:54)
        at io.prometheus.jmx.shaded.io.prometheus.client.exporter.HTTPServer.<init>(HTTPServer.java:145)
        at io.prometheus.jmx.shaded.io.prometheus.jmx.JavaAgent.premain(JavaAgent.java:49)
        ... 6 more
FATAL ERROR in native method: processing of -javaagent failed

The main issue is that the parameter is not set by us but is buried in the init scripts shipped with the Hadoop packages. I think that a long time ago HDFS did not deal well with IPv6 addresses, but support should be complete by now.
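Since the flag is injected by the packaged scripts rather than by our puppet code, a quick way to audit which running JVMs still carry it (an illustrative one-liner, not from the original thread):

elukey@analytics1031:~$ ps -eo args | grep -o 'java.net.preferIPv4Stack=[a-z]*' | sort | uniq -c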

Change 519957 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Replace Yarn nodemanager Prometheus exp's host:port to allow IPv6

https://gerrit.wikimedia.org/r/519957

elukey added a comment. · Jul 1 2019, 7:07 AM

Precisely:

elukey@analytics1031:~$ grep -rni prefer /usr/lib/hadoop/ -B 1
/usr/lib/hadoop/libexec/hadoop-config.sh-245-# Disable ipv6 as it can cause issues
/usr/lib/hadoop/libexec/hadoop-config.sh:246:HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

Change 519957 merged by Elukey:
[operations/puppet@production] Replace Yarn nodemanager Prometheus exp's host:port to allow IPv6

https://gerrit.wikimedia.org/r/519957

Mentioned in SAL (#wikimedia-operations) [2019-07-01T08:39:36Z] <elukey> restart hadoop-yarn-nodemanager on all hadoop workers to pick up new jvm settings - T225296

Change 519979 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Disable java.net.preferIPv4Stack on the Hadoop testing cluster

https://gerrit.wikimedia.org/r/519979

Change 519979 merged by Elukey:
[operations/puppet@production] Disable java.net.preferIPv4Stack on the Hadoop testing cluster

https://gerrit.wikimedia.org/r/519979

Change 519980 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Bind to IPv6 for Hadoop HDFS daemons on the testing cluster

https://gerrit.wikimedia.org/r/519980

Change 519980 merged by Elukey:
[operations/puppet@production] Bind to IPv6 for Hadoop HDFS daemons on the testing cluster

https://gerrit.wikimedia.org/r/519980

The whole Hadoop testing cluster is now running with IPv6 addresses bound and it looks good (nothing failing so far). I'll wait a couple of days and then slowly roll the change out to the Analytics Hadoop cluster.
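A spot-check on a converted worker would look roughly like this (ports from this task; output sketched, PID elided):

$ sudo netstat -lntp | grep -E ':(8141|51010) '
tcp6       0      0 :::8141                 :::*                    LISTEN      <pid>/java
tcp6       0      0 :::51010                :::*                    LISTEN      <pid>/java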

Change 520767 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/dns@master] Add IPv6 PTR/AAAA records for an-worker*

https://gerrit.wikimedia.org/r/520767

elukey added a comment. · Jul 4 2019, 2:50 PM

I filed a patch to add the missing PTR/AAAA records for the an-coord and an-worker* hosts. After it is reviewed and merged, I'll start rolling out the preferIPv4Stack=false change in the Hadoop Analytics cluster.
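Once the DNS patch is merged, the new records can be verified with dig (hostname below is hypothetical, for illustration):

$ dig +short AAAA an-worker1078.eqiad.wmnet   # should return the host's IPv6 address
$ dig +short -x <returned IPv6 address>       # the PTR should map back to the hostname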

elukey moved this task from Backlog to In Progress on the User-Elukey board. · Jul 4 2019, 2:50 PM
fgiunchedi moved this task from Backlog to Radar on the observability board. · Jul 8 2019, 1:07 PM

Change 520767 merged by Elukey:
[operations/dns@master] Add Ipv6 PTR/AAAA records for an-worker* and an-coord1001

https://gerrit.wikimedia.org/r/520767

Change 523229 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Allow the use of Ipv6 in the Hadoop Analytics cluster

https://gerrit.wikimedia.org/r/523229

fgiunchedi moved this task from Backlog to Radar on the User-fgiunchedi board. · Jul 16 2019, 10:31 AM

https://phabricator.wikimedia.org/T153468 is relevant for this task, since the last patch also needs to allow IPv6 addresses for the Prometheus masters.

Note: $ferm_srange = "(@resolve((${prometheus_ferm_nodes})) @resolve((${prometheus_ferm_nodes}), AAAA))"
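For context, that srange renders into a ferm rule along these lines (a sketch: the port and host list are illustrative, only prometheus1004 is named in this task):

proto tcp dport 8141 saddr (@resolve((prometheus1004.eqiad.wmnet)) @resolve((prometheus1004.eqiad.wmnet), AAAA)) ACCEPT;

Without the second @resolve(..., AAAA) term, SYNs from the Prometheus hosts' IPv6 addresses are dropped, which is exactly the retransmit pattern described in this task.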

Change 525309 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::prometheus::jmx_exporter: allow IPv6 polling

https://gerrit.wikimedia.org/r/525309

Change 525309 merged by Elukey:
[operations/puppet@production] profile::prometheus::jmx_exporter: allow IPv6 polling

https://gerrit.wikimedia.org/r/525309

Change 523229 merged by Elukey:
[operations/puppet@production] Allow the use of Ipv6 in the Hadoop Analytics cluster

https://gerrit.wikimedia.org/r/523229

elukey closed this task as Resolved. · Tue, Jul 30, 4:12 PM
ayounsi reopened this task as Open. · Tue, Jul 30, 4:37 PM

Thanks for tackling the analytics part; the Cloud one is still an issue:

Unrelated: the cloudvirt hosts (some of them?) have the prometheus rsyslog exporter listening on port 9105:
tcp6 0 0 :::9105 :::* LISTEN 33652/prometheus-rs
but it can't be queried from prometheus1004, e.g.:
prometheus1004:~$ curl -v cloudvirt1015.eqiad.wmnet:9105/metrics   # hangs
while the other exporter, listening on 9100, replies fine.
As Prometheus is configured to query that endpoint, it tries, retries, and fails.
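The hang itself is a useful signal: a closed port would answer with connection refused immediately, while a hang means the SYNs are dropped somewhere in transit. A comparison sketch from the Prometheus host (flags and expected codes are illustrative):

prometheus1004:~$ curl -s --connect-timeout 5 -o /dev/null -w '%{http_code}\n' cloudvirt1015.eqiad.wmnet:9100/metrics   # 200: reachable
prometheus1004:~$ curl -s --connect-timeout 5 -o /dev/null -w '%{http_code}\n' cloudvirt1015.eqiad.wmnet:9105/metrics   # 000: timed out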

Mentioned in SAL (#wikimedia-operations) [2019-07-30T16:46:51Z] <XioNoX> adding port 9105 to term prometheus in filter labs-in4 - T225296

ayounsi closed this task as Resolved. · Tue, Jul 30, 4:51 PM
ayounsi claimed this task.

Looking at it again, it was a missing port in the router's firewall terms.
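With the port added to the filter term, the same scrape should now succeed (verification sketch):

prometheus1004:~$ curl -s cloudvirt1015.eqiad.wmnet:9105/metrics | head -n 3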