
Deploy or consciously decide not to deploy metrics-server in toolforge kubernetes
Closed, ResolvedPublic


We have noticed that some basic tooling to see what mem/CPU a pod is consuming is not available in our current build. It requires configuring metrics-server (the successor to the now-retired Heapster). This is normally deployed via a bunch of YAML, but we can take a look at it and see what makes sense for us.
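For context, the "bunch of YAML" is just the upstream manifests applied to the cluster; a hedged sketch (the URL and version here are illustrative, not necessarily what Toolforge pinned):

```shell
# Apply the upstream metrics-server manifests (version/URL illustrative):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml

# metrics-server registers itself as an aggregated API; check it came up:
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl -n kube-system get pods -l k8s-app=metrics-server
```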

Event Timeline

My first impulse was to look at instead, but I'm not sure at this point if they provide the same or are competing options, or what.

Change 556340 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: include metrics-server

Change 556363 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: introduce admin script to add x509 certs to k8s secrets

Change 556363 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: introduce admin script to add x509 certs to k8s secrets

Change 556340 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: include metrics-server
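The actual admin script lives in operations/puppet, but in essence "add x509 certs to k8s secrets" boils down to something like this (secret name and paths are hypothetical):

```shell
# Load an x509 cert/key pair into a TLS secret so the in-cluster service
# can mount it. --dry-run + apply makes this idempotent for renewals.
kubectl create secret tls metrics-server-certs \
  --cert=/etc/kubernetes/pki/metrics-server.crt \
  --key=/etc/kubernetes/pki/metrics-server.key \
  -n kube-system --dry-run -o yaml | kubectl apply -f -
```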

I think this is what we are looking for in this ticket:

root@toolsbeta-test-k8s-control-3:~# kubectl top nodes
NAME                           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
toolsbeta-test-k8s-control-1   140m         7%     1721Mi          44%       
toolsbeta-test-k8s-control-2   187m         9%     1725Mi          44%       
toolsbeta-test-k8s-control-3   161m         8%     1829Mi          47%       
toolsbeta-test-k8s-worker-1    811m         40%    3199Mi          83%       
toolsbeta-test-k8s-worker-2    251m         12%    2177Mi          56%       
toolsbeta-test-k8s-worker-3    906m         22%    1087Mi          13%       
root@toolsbeta-test-k8s-control-3:~# kubectl top pod --all-namespaces
NAMESPACE            NAME                                                   CPU(cores)   MEMORY(bytes)   
ingress-admission    ingress-admission-55fb8554b5-5sjql                     1m           4Mi             
ingress-admission    ingress-admission-55fb8554b5-dmtg9                     1m           4Mi             
ingress-nginx        nginx-ingress-5d586d964b-4pgxb                         5m           1193Mi          
ingress-nginx        nginx-ingress-5d586d964b-kzgkv                         5m           1093Mi          
ingress-nginx        nginx-ingress-5d586d964b-qdjd4                         4m           1127Mi          
kube-system          calico-kube-controllers-59f54d6bbc-dwg76               4m           13Mi            
kube-system          calico-node-2sf9d                                      36m          96Mi            
kube-system          calico-node-dfbqd                                      16m          151Mi           
kube-system          calico-node-g4hr7                                      14m          193Mi           
kube-system          calico-node-lp2c9                                      45m          42Mi            
kube-system          calico-node-q5phv                                      16m          174Mi           
kube-system          calico-node-x2n9j                                      50m          143Mi           
kube-system          coredns-5c98db65d4-5xmnt                               3m           13Mi            
kube-system          coredns-5c98db65d4-j2pxb                               3m           14Mi            
kube-system          kube-apiserver-toolsbeta-test-k8s-control-1            17m          297Mi           
kube-system          kube-apiserver-toolsbeta-test-k8s-control-2            21m          329Mi           
kube-system          kube-apiserver-toolsbeta-test-k8s-control-3            23m          354Mi           
kube-system          kube-controller-manager-toolsbeta-test-k8s-control-1   1m           12Mi            
kube-system          kube-controller-manager-toolsbeta-test-k8s-control-2   1m           15Mi            
kube-system          kube-controller-manager-toolsbeta-test-k8s-control-3   11m          105Mi           
kube-system          kube-proxy-4n59c                                       8m           19Mi            
kube-system          kube-proxy-8c4lp                                       6m           20Mi            
kube-system          kube-proxy-8hkk4                                       7m           14Mi            
kube-system          kube-proxy-frwmb                                       4m           17Mi            
kube-system          kube-proxy-jjdnj                                       1m           13Mi            
kube-system          kube-proxy-wkkdk                                       5m           15Mi            
kube-system          kube-scheduler-toolsbeta-test-k8s-control-1            1m           12Mi            
kube-system          kube-scheduler-toolsbeta-test-k8s-control-2            2m           14Mi            
kube-system          kube-scheduler-toolsbeta-test-k8s-control-3            1m           12Mi            
kube-system          metrics-server-6459f9bcc5-qm8mz                        2m           17Mi            
maintain-kubeusers   maintain-kubeusers-7b6bb8f79d-2dbbd                    7m           69Mi            
registry-admission   registry-admission-6f5f6589c5-6mmkd                    1m           4Mi             
registry-admission   registry-admission-6f5f6589c5-wxlq9                    1m           10Mi            
tool-fourohfour      fourohfour-66bf569f4f-s67pm                            1m           28Mi            
tool-test            test-85d69fb4f9-nxp8f                                  1m           22Mi       

Change 556369 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: k8s: metrics: include some hints and comments

Change 556369 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: k8s: metrics: include some hints and comments

Yup, that's the notion :)

> That makes it much easier to quickly tell what a pod needs.

Sounds like that might be a good improvement in the future, so good find there :)

I was just after kubectl top and similar metrics for now. Plus, if we adjust the script to drop certs in secrets, it might also work for renewing the webhook controller certs; I'll have to check the code for what it expects everything to be named, etc.

For some reason, the metrics-server doesn't work on the tools project cluster.

Some hints:

root@tools-k8s-control-2:~# kubectl logs metrics-server-575d5f6d95-flh8c -n kube-system
Error: Get dial tcp i/o timeout

root@tools-k8s-control-2:~# kubectl logs -n kube-system -l=component=kube-apiserver
I1211 13:46:25.032271       1 controller.go:107] OpenAPI AggregationController: Processing item
W1211 13:46:25.032424       1 handler_proxy.go:91] no RequestInfo found in the context
E1211 13:46:25.032528       1 controller.go:114] loading OpenAPI spec for "" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]

Not sure at this point if we have additional packet filtering or some kind of firewalling in the tools project, or there are other differences from toolsbeta.

That or a node is hosed... nope, they all report Ready. It's all timeouts, which sounds very firewall-ish.
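A few checks worth running when metrics-server keeps timing out (illustrative commands, not a transcript from the cluster):

```shell
# Which node is the crashing pod scheduled on?
kubectl -n kube-system get pods -o wide | grep metrics-server

# Is the aggregated API registered and available? A False/FailedDiscoveryCheck
# condition here usually matches the 503s seen in the apiserver logs.
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml

# Do all nodes report Ready, ruling out a plainly broken node?
kubectl get nodes
```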

kube-system metrics-server-575d5f6d95-flh8c 0/1 CrashLoopBackOff 50 4h21m

The logs from that pod:

Error: Get dial tcp i/o timeout

      --alsologtostderr                                         log to standard error as well as files
      --authentication-kubeconfig string                        kubeconfig file pointing at the 'core' kubernetes server with enough rights to create
      --authentication-skip-lookup                              If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the cluster.
      --authentication-token-webhook-cache-ttl duration         The duration to cache responses from the webhook token authenticator. (default 10s)
      --authentication-tolerate-lookup-failure                  If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous.
      --authorization-always-allow-paths strings                A list of HTTP paths to skip during authorization, i.e. these are authorized without contacting the 'core' kubernetes server.
      --authorization-kubeconfig string                         kubeconfig file pointing at the 'core' kubernetes server with enough rights to create
      --authorization-webhook-cache-authorized-ttl duration     The duration to cache 'authorized' responses from the webhook authorizer. (default 10s)
      --authorization-webhook-cache-unauthorized-ttl duration   The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s)
      --bind-address ip                                         The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank, all interfaces will be used ( for all IPv4 interfaces and :: for all IPv6 interfaces). (default
      --cert-dir string                                         The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored. (default "apiserver.local.config/certificates")
      --client-ca-file string                                   If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate.
      --contention-profiling                                    Enable lock contention profiling, if profiling is enabled
  -h, --help                                                    help for this command
      --http2-max-streams-per-connection int                    The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means to use golang's default.
      --kubeconfig string                                       The path to the kubeconfig used to connect to the Kubernetes API server and the Kubelets (defaults to in-cluster config)
      --kubelet-certificate-authority string                    Path to the CA to use to validate the Kubelet's serving certificates.
      --kubelet-insecure-tls                                    Do not verify CA of serving certificates presented by Kubelets.  For testing purposes only.
      --kubelet-port int                                        The port to use to connect to Kubelets. (default 10250)
      --kubelet-preferred-address-types strings                 The priority of node address types to use when determining which address to use to connect to a particular node (default [Hostname,InternalDNS,InternalIP,ExternalDNS,ExternalIP])
      --log-flush-frequency duration                            Maximum number of seconds between log flushes (default 5s)
      --log_backtrace_at traceLocation                          when logging hits line file:N, emit a stack trace (default :0)
      --log_dir string                                          If non-empty, write log files in this directory
      --log_file string                                         If non-empty, use this log file
      --logtostderr                                             log to standard error instead of files (default true)
      --metric-resolution duration                              The resolution at which metrics-server will retain metrics. (default 1m0s)
      --profiling                                               Enable profiling via web interface host:port/debug/pprof/ (default true)
      --requestheader-allowed-names strings                     List of client certificate common names to allow to provide usernames in headers specified by --requestheader-username-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed.
      --requestheader-client-ca-file string                     Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in headers specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests.
      --requestheader-extra-headers-prefix strings              List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-])
      --requestheader-group-headers strings                     List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group])
      --requestheader-username-headers strings                  List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user])
      --secure-port int                                         The port on which to serve HTTPS with authentication and authorization.If 0, don't serve HTTPS at all. (default 443)
      --skip_headers                                            If true, avoid header prefixes in the log messages
      --stderrthreshold severity                                logs at or above this threshold go to stderr
      --tls-cert-file string                                    File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
      --tls-min-version string                                  Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12
      --tls-private-key-file string                             File containing the default x509 private key matching --tls-cert-file.
      --tls-sni-cert-key namedCertKey                           A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*,". (default [])
  -v, --v Level                                                 number for the log level verbosity
      --vmodule moduleSpec                                      comma-separated list of pattern=N settings for file-filtered logging

panic: Get dial tcp i/o timeout

goroutine 1 [running]:
        /go/src/ +0x13b

It's running on a worker, not a control plane node, which could be important:
kube-system metrics-server-575d5f6d95-flh8c 0/1 CrashLoopBackOff 51 4h24m tools-k8s-worker-5 <none> <none>

Interestingly, it is the ONLY pod on worker-5. Rebooting it.
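The usual way to take a worker out for a reboot looks roughly like this (node name taken from the output above; flags match kubectl of that era):

```shell
# Evict pods and mark the node unschedulable before rebooting:
kubectl drain tools-k8s-worker-5 --ignore-daemonsets --delete-local-data

# ... reboot the VM (e.g. via OpenStack or the console) ...

# Let the scheduler use the node again:
kubectl uncordon tools-k8s-worker-5
```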

Bam, that did it. It started up on worker-3 and is happy.

root@tools-k8s-control-1:~# kubectl top pod --all-namespaces
NAMESPACE            NAME                                          CPU(cores)   MEMORY(bytes)   
ingress-admission    ingress-admission-55fb8554b5-6c98v            1m           4Mi             
ingress-admission    ingress-admission-55fb8554b5-btp7h            1m           4Mi             
ingress-nginx        nginx-ingress-5dbf7cb65c-gc7ls                3m           101Mi           
ingress-nginx        nginx-ingress-5dbf7cb65c-m4b9p                3m           101Mi           
ingress-nginx        nginx-ingress-5dbf7cb65c-zkdqk                4m           1042Mi          
kube-system          calico-kube-controllers-59f54d6bbc-rwqkg      2m           20Mi            
kube-system          calico-node-44ntg                             19m          181Mi           
kube-system          calico-node-c8rxg                             18m          186Mi           
kube-system          calico-node-dn2pk                             24m          104Mi           
kube-system          calico-node-g64gn                             16m          190Mi           
kube-system          calico-node-nrk46                             16m          181Mi           
kube-system          calico-node-pcmlx                             24m          119Mi           
kube-system          calico-node-qz4tn                             21m          191Mi           
kube-system          calico-node-snn29                             30m          67Mi            
kube-system          coredns-5c98db65d4-97hjp                      3m           14Mi            
kube-system          coredns-5c98db65d4-hq6hc                      3m           15Mi            
kube-system          kube-apiserver-tools-k8s-control-1            19m          195Mi           
kube-system          kube-apiserver-tools-k8s-control-2            37m          172Mi           
kube-system          kube-apiserver-tools-k8s-control-3            26m          201Mi           
kube-system          kube-controller-manager-tools-k8s-control-1   1m           23Mi            
kube-system          kube-controller-manager-tools-k8s-control-2   12m          49Mi            
kube-system          kube-controller-manager-tools-k8s-control-3   1m           13Mi            
kube-system          kube-proxy-77qj4                              1m           19Mi            
kube-system          kube-proxy-7ssq4                              1m           16Mi            
kube-system          kube-proxy-8fv9z                              1m           16Mi            
kube-system          kube-proxy-8sl46                              3m           16Mi            
kube-system          kube-proxy-qvfvp                              1m           19Mi            
kube-system          kube-proxy-r94mq                              1m           23Mi            
kube-system          kube-proxy-wkgzb                              2m           19Mi            
kube-system          kube-proxy-wwmjc                              3m           15Mi            
kube-system          kube-scheduler-tools-k8s-control-1            2m           12Mi            
kube-system          kube-scheduler-tools-k8s-control-2            1m           12Mi            
kube-system          kube-scheduler-tools-k8s-control-3            1m           13Mi            
kube-system          metrics-server-575d5f6d95-ff7pp               1m           12Mi            
registry-admission   registry-admission-6f5f6589c5-clj2w           1m           7Mi             
registry-admission   registry-admission-6f5f6589c5-tzzk5           1m           4Mi 

Probably because nobody is stress-testing them, the load across the ingress pods on tools is laughably uneven. :)

iptables looks good on worker-5. Presumably the next things scheduled there will confirm whether its networking is actually fine.

I think that ties this up until we decide we need a sharded kube-state-metrics setup.