Network error from grid with mono bot
Closed, Resolved (Public)


When I run my mono bot from the grid using jstart, with

jstart -stderr -continuous -N PeriodiBOT -M my@email -m beas mono [program]

it can't connect to the network (timeout error). This problem doesn't happen if I run my bot directly on the tool account with

mono [program]

Note: My tool account has the Mozilla trusted CA root certificates in /.config/mono/certs/Trust

Event Timeline

I've spent some time trying to debug this, to no avail.

The bot runs fine when started directly on the bastion or on any gridengine host since T186846 was resolved.

If submitted from the bastion to the grid, however, it gets stuck on the step that connects to es.Wikipedia and consumes 100% CPU on the exec host.

Directly on tools-exec-*:

chicocvenancio@tools-exec-1423:~$ sudo become periodibot
tools.periodibot@tools-exec-1423:~$ mono PeriodiBOT/PeriodiBOT-IRC.exe
==================== PeriodiBOT 2.6.6608.9829 ====================
[2018/02/09 20:56:46] [LOCAL LOG] Undefined: Loading config
[2018/02/09 20:56:46] [LOCAL LOG] PeriodiBOT: Loading operators
[2018/02/09 20:56:46] [LOCAL LOG] PeriodiBOT: Starting...
[2018/02/09 20:56:46] [LOCAL LOG] PeriodiBOT: Signing in...
[2018/02/09 20:56:46] [LOCAL LOG] PeriodiBOT: Obtaining token...
[2018/02/09 20:56:49] [LOCAL LOG] PeriodiBOT: Token obtained!
[2018/02/09 20:56:49] [LOCAL LOG] PeriodiBOT: Login result: Success
[2018/02/09 20:56:49] [LOCAL LOG] PeriodiBOT: UserID: 4597225
[2018/02/09 20:56:49] [LOCAL LOG] PeriodiBOT: Username: PeriodiBOT

jsub from bastion:

root@tools-bastion-03:~# become periodibot
tools.periodibot@tools-bastion-03:~$ jsub -N PeriodiBOTChico mono ~/PeriodiBOT/PeriodiBOT-IRC.exe
tools.periodibot@tools-bastion-03:~$ tail PeriodiBOTChico.out

Login Failed (Network error)
The request timed out

Press any key to exit or wait 5 seconds
......[2018/02/09 20:54:14] [LOCAL LOG] PeriodiBOT: Obtaining token...

And on tools-exec-1423:

chicocvenancio@tools-exec-1423:~$ ps aux|grep mono
tools.p+ 30836 99.4 0.2 470788 22788 ? Ssl 20:48 4:07 /usr/bin/mono-sgen /data/project/periodibot/PeriodiBOT/PeriodiBOT-IRC.exe

Is there some environment setup in the tool's .profile or .bashrc that is needed by the mono runtime? A job submitted to the grid will not be run from a login shell and may be missing environment settings that would be present for an interactive job. The fix for this is to create a shell script that sets up the needed environment and then execs the core program and submit that to the grid instead.
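A minimal sketch of such a wrapper, assuming the environment the bot needs lives in the tool's .profile or .bashrc. The file name run-periodibot.sh is hypothetical; the path to the bot is the one used elsewhere in this task.

```shell
# Write a hypothetical wrapper script (the file name is an assumption).
cat > ./run-periodibot.sh <<'EOF'
#!/bin/bash
# Load the settings an interactive login shell would have picked up.
[ -f "$HOME/.profile" ] && . "$HOME/.profile"
[ -f "$HOME/.bashrc" ] && . "$HOME/.bashrc"
# Replace the shell with the bot so the grid tracks the mono process itself.
exec mono "$HOME/PeriodiBOT/PeriodiBOT-IRC.exe"
EOF
chmod +x ./run-periodibot.sh

# Then submit the wrapper instead of mono, for example:
#   jsub -N PeriodiBOTChico ./run-periodibot.sh
```

If the job still times out when submitted this way, the missing piece is probably not an environment variable.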

What's the minimum amount of code that is able to reproduce this error?

Mono itself should be okay; I have one of my bots running in mono (though not written by myself) for years.

After reading some of the bot's relevant code and then googling 'WebException PostDataAndGetResult', I found T150099.

After looking at the task @zhuyifei1999 pointed to and testing a bit, I found that -mem 2g fixes the problem.

Note that -mem 1g also gets the bot to work, but it then consumes 200% CPU. If @MarioFinale confirms it is working, I think we can close this.

The bot runs well with

jstart -stderr -continuous -N PeriodiBOT -M -m beas -mem 2g mono ~/PeriodiBOT/PeriodiBOT-IRC.exe

But I've never seen my bot use that much RAM; the most it has ever used was 120 MB on Ubuntu Xenial.

It's probably just mono allocating way more memory than it actually needs.

Chicocvenancio claimed this task.

1.2 GB of virtual memory notwithstanding, the bot only uses 51 MB of physical memory. It seems the virtual figure counts a lot of .NET stuff.
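The gap between virtual and physical memory can be checked on a Linux exec host by reading /proc. This is a minimal sketch, assuming the grid's -mem limit is applied to virtual rather than resident memory, which the numbers in this task suggest. mem_usage_mb is a hypothetical helper name.

```shell
# Print virtual (VmSize) and resident (VmRSS) memory of a process in MB.
# Defaults to the current shell when no PID is given.
mem_usage_mb() {
    pid="${1:-$$}"
    # /proc/<pid>/status reports both values in kB.
    awk '/^VmSize:/ { v = $2 }
         /^VmRSS:/  { r = $2 }
         END { printf "virtual=%dMB resident=%dMB\n", v / 1024, r / 1024 }' \
        "/proc/$pid/status"
}

# Example: inspect the current shell.
mem_usage_mb "$$"
```

On the exec host, pass the PID of the mono-sgen process (as shown by ps aux) instead of the shell's own PID.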