
Jouncebot can die without people knowing from apparently unauthorized die command
Closed, Resolved (Public)

Description

Relevant logs:
(CDT)

[2018-04-05 17:58:18] → Zoranzoki21 joined (6df58f0f@gateway/web/freenode/ip.109.245.143.15)
[2018-04-05 18:00:04] <jouncebot> addshore, hashar, anomie, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, and zeljkof: Dear deployers, time to do the Evening SWAT (Max 8 patches) deploy. Dont look at me like that. You signed up for it. (https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20180405T2300).
[2018-04-05 18:00:05] <jouncebot> Zoranzoki21: A patch you scheduled for Evening SWAT (Max 8 patches) is about to be deployed. Please be around during the process. Note: If you break AND fix the wikis, you will be rewarded with a sticker.
[2018-04-05 18:00:05] <wikibugs> ^C10Operations^O, ^C10Analytics^O, ^C10Analytics-Data-Quality^O, ^C10Analytics-Kanban^O, and 4 others: Opera mini IP addresses reassigned - https://phabricator.wikimedia.org/T187014#4110582 (^C10Nuria^O) Varnish5 rollout might have something to do with this? https://gerrit.wikimedia.org/r/#/c/409047/ cc @ema
[2018-04-05 18:00:44] <Zoranzoki21> hi
[2018-04-05 18:03:04] <wikibugs> (^C03CR^O) ^C10^BBryanDavis^O: wiki replicas: drop views with missing tables (^C03^B1^O comment) [puppet] - ^C10https://gerrit.wikimedia.org/r/424166^O (https://phabricator.wikimedia.org/T191387) (owner: ^C10^BBryanDavis^O)
[2018-04-05 18:04:40] <Zoranzoki21> who will be swater
[2018-04-05 18:05:10] ⇐ jouncebot quit (tools.joun@wikimedia/bot/jouncebot): Quit: Killed by Zoranzoki21
[2018-04-05 18:05:26] ⇐ Zoranzoki21 quit (6df58f0f@gateway/web/freenode/ip.109.245.143.15): Quit: Page closed

(UTC)

2018-04-05T22:39:09Z JounceBot    DEBUG   : Setting deploy timer to 1255.0 for deploycal-item-20180405T2300: (2018-04-05 23:00:00+00:00 -> 2018-04-06 00:00:00+00:00) Evening SWAT (Max 8 patches); addshore, hashar, anomie, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, zeljkof for Zoranzoki21
2018-04-05T22:54:09Z JounceBot    DEBUG   : Collecting new deployment information from the server
2018-04-05T22:54:09Z py.warnings  WARNING : /mnt/nfs/labstore-secondary-tools-project/jouncebot/virtenv/local/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:132: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning

2018-04-05T22:54:10Z JounceBot    DEBUG   : Got 124 items
2018-04-05T22:54:10Z JounceBot    DEBUG   : Setting deploy timer to 354.0 for deploycal-item-20180405T2300: (2018-04-05 23:00:00+00:00 -> 2018-04-06 00:00:00+00:00) Evening SWAT (Max 8 patches); addshore, hashar, anomie, aude, MaxSem, twentyafterfour, RoanKattouw, Dereckson, thcipriani, Niharika, zeljkof for Zoranzoki21
2018-04-05T22:58:29Z JounceBot    DEBUG   : Received command from Zoranzoki21 at Zoranzoki21!6df58f0f@gateway/web/freenode/ip.109.245.143.15: next
2018-04-05T22:58:57Z JounceBot    DEBUG   : Received command from Zoranzoki21 at Zoranzoki21!6df58f0f@gateway/web/freenode/ip.109.245.143.15: now
2018-04-05T22:59:40Z JounceBot    DEBUG   : Received command from Zoranzoki21 at Zoranzoki21!6df58f0f@gateway/web/freenode/ip.109.245.143.15: now
2018-04-05T23:00:04Z JounceBot    INFO    : Deploy timer kicked. Attempting to notify.
2018-04-05T23:00:04Z JounceBot    DEBUG   : Num events: 1
2018-04-05T23:00:04Z JounceBot    DEBUG   : Setting deploy timer to 302400.0 for deploycal-item-20180409T1100: (2018-04-09 11:00:00+00:00 -> 2018-04-09 11:30:00+00:00) Wikimedia Portals Update; jan_drewniak for 
2018-04-05T23:04:52Z JounceBot    DEBUG   : Received command from Zoranzoki21 at Zoranzoki21!6df58f0f@gateway/web/freenode/ip.109.245.143.15: help
2018-04-05T23:04:59Z JounceBot    DEBUG   : Received command from Zoranzoki21 at Zoranzoki21!6df58f0f@gateway/web/freenode/ip.109.245.143.15: now
2018-04-05T23:05:06Z JounceBot    DEBUG   : Received command from Zoranzoki21 at Zoranzoki21!6df58f0f@gateway/web/freenode/ip.109.245.143.15: refresh
2018-04-05T23:05:06Z JounceBot    DEBUG   : Collecting new deployment information from the server
2018-04-05T23:05:06Z py.warnings  WARNING : /mnt/nfs/labstore-secondary-tools-project/jouncebot/virtenv/local/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:132: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning

2018-04-05T23:05:07Z JounceBot    DEBUG   : Got 124 items
2018-04-05T23:05:07Z JounceBot    DEBUG   : Setting deploy timer to 302097.0 for deploycal-item-20180409T1100: (2018-04-09 11:00:00+00:00 -> 2018-04-09 11:30:00+00:00) Wikimedia Portals Update; jan_drewniak for 
2018-04-05T23:05:10Z JounceBot    DEBUG   : Received command from Zoranzoki21 at Zoranzoki21!6df58f0f@gateway/web/freenode/ip.109.245.143.15: die
2018-04-09T18:36:59Z py.warnings  WARNING : /mnt/nfs/labstore-secondary-tools-project/jouncebot/virtenv/local/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:334: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings

Apparently nobody noticed the bot had died until 4 days later. I don't see Zoranzoki21 listed as a member of the tools.jouncebot group, so I assume they should not be able to kill the bot.
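One way to restrict the command would be an allow-list check on the sender before `die` is honored. This is only a minimal sketch: `AUTHORIZED_OPERATORS`, `nick_from_hostmask`, and `may_issue_die` are hypothetical names and do not come from the jouncebot codebase, and the listed nicks are placeholders.

```python
# Hypothetical allow-list; the names are placeholders, not real maintainers.
AUTHORIZED_OPERATORS = frozenset({"maintainer1", "maintainer2"})


def nick_from_hostmask(hostmask):
    """Return the nick portion of an IRC 'nick!user@host' mask."""
    return hostmask.split("!", 1)[0]


def may_issue_die(hostmask, authorized=AUTHORIZED_OPERATORS):
    """Return True only if the sender's nick is on the allow-list.

    The comparison is case-insensitive because IRC nicks are.
    """
    nick = nick_from_hostmask(hostmask).lower()
    return nick in {name.lower() for name in authorized}
```

A nick alone can be impersonated, so a real check would probably want to match on the cloak or services account instead; the nick is used here only to keep the sketch short.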

Event Timeline

If the bot is supposed to be able to die from anyone's commands, maybe make it able to notify / announce its death somehow, so we can mourn it? Or make use of auto-restart mechanisms like bigbrother or Kubernetes?
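As a rough illustration of the "announce its death" idea, the handler could log and broadcast a message before shutting down. This is a sketch under assumptions: `handle_die`, `announce`, and `shutdown` are hypothetical names, with the two callables standing in for whatever the bot actually uses to message a channel and to disconnect.

```python
import logging

logger = logging.getLogger("JounceBot")


def handle_die(sender, announce, shutdown):
    """Log and announce a shutdown request before acting on it.

    `announce` and `shutdown` are hypothetical callables supplied by the
    caller; each takes the message string as its only argument.
    """
    message = "Shutting down: die requested by %s" % sender
    logger.warning(message)
    try:
        announce(message)
    finally:
        # Shut down even if the announcement itself fails.
        shutdown(message)
    return message
```

An announcement only helps if someone is watching the channel, so it complements an auto-restart mechanism rather than replacing it.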

I think we can just rip this "feature" out of the codebase. I thought it was guarded by a debug mode flag, but maybe it is not (or maybe the main deployment is in debug mode).
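For reference, a debug-mode guard of the kind described might look roughly like the following. This is a hypothetical sketch, not the bot's actual code: `DEBUG_ONLY_COMMANDS` and `is_command_allowed` are made-up names, and how the real deployment sets its debug flag is not known from this task.

```python
# Commands that should only work when the bot runs in debug mode (assumed set).
DEBUG_ONLY_COMMANDS = frozenset({"die"})


def is_command_allowed(command, debug_mode=False):
    """Reject debug-only commands unless debug mode is enabled."""
    if command.strip().lower() in DEBUG_ONLY_COMMANDS:
        return debug_mode
    return True
```

With such a guard, a production deployment accidentally left in debug mode would still accept `die`, which is one argument for removing the command outright.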

> I think we can just rip this "feature" out of the codebase.

+1. I can do it.

Legoktm changed the visibility from "Custom Policy" to "Public (No Login Required)". Apr 9 2018, 9:17 PM