Page MenuHomePhabricator

k8s liveness check(?) generating session storage log noise
Closed, ResolvedPublic

Description

The liveness check (presumably) is generating the following log output from Kask (session storage service software) on 10 second intervals.

...

2019/07/08 18:30:10 http: TLS handshake error from 10.64.0.247:35706: EOF
2019/07/08 18:30:20 http: TLS handshake error from 10.64.0.247:35784: EOF
2019/07/08 18:30:30 http: TLS handshake error from 10.64.0.247:35862: EOF
2019/07/08 18:30:40 http: TLS handshake error from 10.64.0.247:35940: EOF
2019/07/08 18:30:50 http: TLS handshake error from 10.64.0.247:36018: EOF
2019/07/08 18:31:00 http: TLS handshake error from 10.64.0.247:36096: EOF
2019/07/08 18:31:10 http: TLS handshake error from 10.64.0.247:36174: EOF
2019/07/08 18:31:20 http: TLS handshake error from 10.64.0.247:36252: EOF
...

Possible solutions

See also

https://phabricator.wikimedia.org/T209110#5251135

Event Timeline

Eevans triaged this task as Medium priority.Jul 8 2019, 6:37 PM
Eevans updated the task description. (Show Details)

Change 540481 had a related patch set uploaded (by Eevans; owner: Eevans):
[mediawiki/services/kask@master] Use Logger as Writer for log module

https://gerrit.wikimedia.org/r/540481

Change 540481 merged by jenkins-bot:
[mediawiki/services/kask@master] Use Logger as Writer for log module

https://gerrit.wikimedia.org/r/540481

Change 540660 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/deployment-charts@master] staging/sessionstore: Upgrade image to 2019-10-03-182310-production

https://gerrit.wikimedia.org/r/540660

Change 540660 merged by jenkins-bot:
[operations/deployment-charts@master] staging/sessionstore: Upgrade image to 2019-10-03-182310-production

https://gerrit.wikimedia.org/r/540660

Deployed to staging, the log output now looks like:

{"msg":"Initializing Kask dbb8ec9 (Go version: go1.11.5, Build host: 87a0b5ccf9a6, Timestamp: 2019-10-03T18:23:48+00:00)...","appname":"sessionstore","time":"2019-10-03T19:12:13Z","level":"INFO"}
{"msg":"Starting service as https://0.0.0.0:8081/sessions/v1/","appname":"sessionstore","time":"2019-10-03T19:12:15Z","level":"INFO"}
{"msg":"http: TLS handshake error from 10.64.0.247:56514: EOF","appname":"sessionstore","time":"2019-10-03T19:12:21Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:56614: EOF","appname":"sessionstore","time":"2019-10-03T19:12:31Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:56714: EOF","appname":"sessionstore","time":"2019-10-03T19:12:41Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:56814: EOF","appname":"sessionstore","time":"2019-10-03T19:12:51Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:56914: EOF","appname":"sessionstore","time":"2019-10-03T19:13:01Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:57014: EOF","appname":"sessionstore","time":"2019-10-03T19:13:11Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:57114: EOF","appname":"sessionstore","time":"2019-10-03T19:13:21Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:57214: EOF","appname":"sessionstore","time":"2019-10-03T19:13:31Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:57314: EOF","appname":"sessionstore","time":"2019-10-03T19:13:41Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:57414: EOF","appname":"sessionstore","time":"2019-10-03T19:13:51Z","level":"WARNING"}
{"msg":"http: TLS handshake error from 10.64.0.247:57514: EOF","appname":"sessionstore","time":"2019-10-03T19:14:01Z","level":"WARNING"}

Change 540671 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/deployment-charts@master] sessionstore: Upgrade to v1.0.5 release

https://gerrit.wikimedia.org/r/540671

Change 540671 merged by Eevans:
[operations/deployment-charts@master] sessionstore: Upgrade to v1.0.5 release

https://gerrit.wikimedia.org/r/540671

I think there were (implicitly) two issues related to this open task: a) a superfluous log message (aka log spam), and b) unstructured log messages. The latter is now solved, the former is not. Possible options for addressing (a):

  1. Adjust the Kask log level so that the superfluous messages aren't logged
  2. Alter the liveness check so that it does not trigger the log message
  3. Do nothing, live with the log spam
  4. ...?

#1 would mean either adjusting Kasks log level to ERROR, or lowering the default loglevel of otherwise un-leveled messages (aka those logged by components of Go's standard library using the [[ https://golang.org/pkg/log/ | log ]] module), which is currently WARNING. Neither of these seem very palatable to me, by YMMV.

akosiaris claimed this task.
akosiaris subscribed.

I 'll be bold and resolve this since we now have fixed b). We can always reopen and evaluate if a) becomes an issue (aka I am going with 3.)