
wm-bot is not responding to any messages
Closed, Resolved (Public)

Description

wm-bot is not responding to any commands and is not sending any messages (e.g. recent changes), either in channels or in private messages.

Event Timeline

tom29739 created this task. Jul 3 2016, 1:37 PM
Restricted Application added subscribers: Matthewrbowker, Zppix, Aklapper. Jul 3 2016, 1:37 PM
Paladox added a subscriber: Paladox. Jul 3 2016, 1:41 PM
K6ka added a subscriber: K6ka. Jul 3 2016, 5:28 PM
Matthewrbowker triaged this task as Unbreak Now! priority.
Restricted Application added subscribers: Luke081515, TerraCodes, Urbanecm. Jul 3 2016, 10:33 PM
Paladox added a project: Cloud-Services. (edited) Jul 3 2016, 11:05 PM
Paladox added a subscriber: yuvipanda.

@yuvipanda

<Krenair> yep it's on 1006

Please migrate the instance to labvirt1011. It seems that every labvirt host except labvirt1011 is out of storage, so ssh is not working.

Instance is currently in an error state and wm-bot is down.

Mentioned in SAL [2016-07-04T10:43:20Z] <yuvipanda> migrate wm-bot instance to labvirt1011 for T139264

I've migrated it and the instance is back up. However, I do not know what else needs to be done to bring wm-bot back up.

@yuvipanda,
https://wikitech.wikimedia.org/w/index.php?title=Wm-bot#How_to_start_the_bot

Or alternatively wait for one of the bot's roots to come along, but that
might take a while.

Thanks for the pointer, @tom29739. I have done those things now.

Can someone verify that this works fine now?

It doesn't appear to be working. It's not on IRC, so I'd assume it's not working. @yuvipanda did you delete the pid files when you tried to get it working? It crashed last time, so those files need to be removed.

sudo rm -iv /mnt/share/wm-bot/*.pid

That should do it, according to the docs.
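A more cautious variant of that cleanup, as a minimal sketch: it removes a pid file only when the process it names is no longer running. The function name `remove_stale_pids` is hypothetical and not part of wm-bot, and the assumption that each `.pid` file holds a single numeric process ID is mine, not the docs'.

```shell
# Hypothetical helper, not part of wm-bot: delete only those pid files
# whose recorded process is no longer alive.
# Assumption: each *.pid file contains one numeric process ID.
remove_stale_pids() {
    for pidfile in "$1"/*.pid; do
        [ -e "$pidfile" ] || continue   # the glob matched nothing
        pid=$(tr -d '[:space:]' < "$pidfile")
        case "$pid" in
            ''|*[!0-9]*)
                alive=no                # empty or non-numeric content
                ;;
            *)
                # kill -0 sends no signal; it only checks that the process
                # exists. It also fails for processes owned by another user,
                # so run this with the same privileges as the bot.
                if kill -0 "$pid" 2>/dev/null; then alive=yes; else alive=no; fi
                ;;
        esac
        if [ "$alive" = yes ]; then
            echo "keeping $pidfile (pid $pid is running)"
        else
            rm -f -- "$pidfile" && echo "removed $pidfile"
        fi
    done
}
```

With the path from the command above, this would be called as `remove_stale_pids /mnt/share/wm-bot` from a shell that is allowed to signal the bot's processes.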

I did indeed delete the pid files.

That's weird. It should be back on then...

Technical13 added a subscriber: Technical13. (edited) Jul 4 2016, 2:06 PM

Did you restart the bouncers as well as the bot?

nohup mono bouncer.exe 6667 &
nohup mono bouncer.exe 6668 &
nohup mono bouncer.exe 6669 &
nohup mono bouncer.exe 6660 &
nohup mono bouncer.exe 6661 &

Once the bouncers are running again, you need to connect the bots to them:

conn wm-bot
conn wm-bot2
conn wm-bot3
conn wm-bot4
conn wm-bot5
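The five bouncer commands above can also be written as a loop. This is only a sketch: the port list is copied from the commands above, while the `start_bouncers` name, the `DRY_RUN` switch and the per-port log files are assumptions added for illustration (plain `nohup` would write to `nohup.out` instead).

```shell
# Ports copied from the bouncer commands above.
BOUNCER_PORTS="6667 6668 6669 6660 6661"

# Hypothetical wrapper: start one bouncer per port in the background.
# With DRY_RUN=1 it only prints the commands instead of running them.
start_bouncers() {
    for port in $BOUNCER_PORTS; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "nohup mono bouncer.exe $port &"
        else
            # Assumption: one log file per port, written to the current directory.
            nohup mono bouncer.exe "$port" > "bouncer-$port.log" 2>&1 &
        fi
    done
}
```

Running `DRY_RUN=1 start_bouncers` first shows what would be launched. The `conn` commands are not shell commands, so they still have to be entered by hand afterwards.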

Yeah, need to restart the bouncers then connect the bot.

I get

Invalid username or password, bye

on following those instructions when I try to telnet(!?) into localhost 2020

I get the same error. @Petrb can you look into this? I can't access anything that would allow me to troubleshoot this further.

Petrb added a comment. Jul 5 2016, 9:19 AM

I am on vacation. The bot doesn't work because the huggle-pg instance is down and the bot's core can't connect to SQL.

Petrb added a comment. Jul 5 2016, 9:21 AM

Once pg is back up, just run the service script and it will be back.
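Waiting for the database before running the service script could be automated with a small retry helper. This is a sketch only: `retry_until` is a hypothetical name, and the probe command is left to the caller because the thread does not say how the bot reaches SQL.

```shell
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds, at most MAX_TRIES
# times, sleeping DELAY_SECONDS between attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
    tries=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$tries" ] && return 1
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Example use, assuming huggle-pg runs PostgreSQL and pg_isready is installed:
#   retry_until 30 10 pg_isready -q -h huggle-pg
```

If the probe succeeds, the service script can be run as described above. If it never does, the helper returns a non-zero status and nothing else is started.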

I see huggle-pg is in SHUTOFF state and not responding to nova start. This will have to wait for @Andrew to come back I'm afraid.

Petrb added a comment. Jul 5 2016, 11:25 AM

In that case I am temporarily disabling it. All services that rely on SQL will be turned off, including public channel logs, until it is back.

K6ka added a comment. Jul 5 2016, 1:49 PM

wm-bot seems to be back up, saw it rejoin IRC at around 7:23AM EST.

Andrew added a comment. Jul 5 2016, 6:11 PM

huggle-pg should be back working now.

Matthewrbowker closed this task as Resolved. Jul 6 2016, 12:52 AM

I have re-enabled huggle-pg per the above. Bot is now operational.

Restricted Application added a subscriber: Jay8g. Jun 7 2017, 6:41 PM