A mono-based program doesn't work on grid, but works on login server
Closed, Resolved (Public)

Description

My C# program gathers pageview statistics using the Wikimedia REST API. It runs successfully on my PC and on the Toolforge login server (mono stats.exe), but when I run it on the grid (jsub -N stats mono stats.exe), it stops at the following code block. No exception is thrown, nothing is written to stats.err, and the process does not crash; it just keeps running and does nothing.

try
{
    reqstr = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/ru.wikipedia/all-access/user/" + Uri.EscapeDataString(page) + "/daily/" + year + "0101/" + year + "1231";
    currres = webclient.DownloadString(reqstr);
}
catch
{
    Console.WriteLine(page);
    Console.WriteLine(currres);
    Console.WriteLine(reqstr);
    numofarts--;
    continue;
}

HTTPS certificates updated.

Event Timeline

Which tool is this? Could you submit the job again so I can see what it's doing with gdb?

Account name is mbh, job name is peaks. Job submitted.

root@tools-exec-1413:~# ps uf -u tools.mbh
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
tools.m+ 23190 80.0  0.8 452612 67680 ?        Ssl  06:13   4:36 /usr/bin/mono-sgen visit_peaks.exe
root@tools-exec-1413:~# gdb -p 23190 -batch -ex 'thread apply all bt'
[New LWP 23206]
[New LWP 23205]
[New LWP 23204]
[New LWP 23203]
[New LWP 23202]
[New LWP 23194]
[New LWP 23193]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
185	../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S: No such file or directory.

Thread 8 (Thread 0x2aba35e00700 (LWP 23193)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000006bfef3 in ?? ()
#2  0x00002aba34fc1184 in start_thread (arg=0x2aba35e00700) at pthread_create.c:312
#3  0x00002aba354eb03d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 7 (Thread 0x2aba37b64700 (LWP 23194)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x000000000065bf3c in ?? ()
#2  0x0000000000615a33 in ?? ()
#3  0x00002aba34fc1184 in start_thread (arg=0x2aba37b64700) at pthread_create.c:312
#4  0x00002aba354eb03d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 6 (Thread 0x2aba3f392700 (LWP 23202)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00002aba34fc3649 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00002aba34fc3470 in __GI___pthread_mutex_lock (mutex=0xa1d638) at ../nptl/pthread_mutex_lock.c:79
#3  0x000000000066ae91 in ?? ()
#4  0x000000000066b387 in ?? ()
#5  0x0000000000615a33 in ?? ()
#6  0x00002aba34fc1184 in start_thread (arg=0x2aba3f392700) at pthread_create.c:312
#7  0x00002aba354eb03d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 5 (Thread 0x2aba3f593700 (LWP 23203)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000006cc5f5 in ?? ()
#2  0x000000000062c875 in ?? ()
#3  0x000000000062ce55 in ?? ()
#4  0x000000000065fc2f in ?? ()
#5  0x0000000040b87787 in ?? ()
#6  0x00002aba3f591760 in ?? ()
#7  0x0000000000000000 in ?? ()

Thread 4 (Thread 0x2aba3f794700 (LWP 23204)):
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x0000000000613a2c in ?? ()
#2  0x0000000000613f0f in ?? ()
#3  0x000000000066aceb in ?? ()
#4  0x000000000066b045 in ?? ()
#5  0x000000000066b9e8 in ?? ()
#6  0x0000000000618275 in ?? ()
#7  0x0000000040b871b9 in ?? ()
#8  0x00002aba3f793cb8 in ?? ()
#9  0x00002aba44002610 in ?? ()
#10 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x2aba3f995700 (LWP 23205)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00002aba34fc3649 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00002aba34fc3470 in __GI___pthread_mutex_lock (mutex=0xa1d638) at ../nptl/pthread_mutex_lock.c:79
#3  0x000000000066ae91 in ?? ()
#4  0x000000000066b045 in ?? ()
#5  0x000000000066b9e8 in ?? ()
#6  0x0000000000618275 in ?? ()
#7  0x0000000040b871b9 in ?? ()
#8  0x00002aba358bc018 in ?? ()
#9  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x2aba3fb96700 (LWP 23206)):
#0  0x00000000006de5e0 in monoeg_g_calloc ()
#1  0x00000000006d54b4 in ?? ()
#2  0x0000000000615942 in ?? ()
#3  0x00002aba34fc1184 in start_thread (arg=0x2aba3fb96700) at pthread_create.c:312
#4  0x00002aba354eb03d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Thread 1 (Thread 0x2aba346ba640 (LWP 23190)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00000000006cc5f5 in ?? ()
#2  0x000000000062c875 in ?? ()
#3  0x000000000062ce55 in ?? ()
#4  0x000000000065fc2f in ?? ()
#5  0x0000000040b87787 in ?? ()
#6  0x00007ffdbffbe230 in ?? ()
#7  0x0000000000000000 in ?? ()

The top frame of Thread 2, #0 0x00000000006de5e0 in monoeg_g_calloc (), looks like T195834. See if increasing the memory limit helps.

FWIW:

root@tools-exec-1413:~# cat /proc/23190/limits
Limit                     Soft Limit           Hard Limit           Units     
[...]
Max address space         524288000            524288000            bytes

That's 500MiB of virtual memory (address space).
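
The conversion can be checked with a short shell sketch. The helper name bytes_to_mib is made up for illustration. The last part reads the soft address-space limit of the current process from /proc/self/limits, assuming a Linux host where that file is readable.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to whole MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Value taken from the "Max address space" line quoted above.
bytes_to_mib 524288000   # prints 500

# Show the soft address-space limit of the current process, if available.
# Field 4 is the soft limit ("Max", "address", "space" are fields 1-3).
if [ -r /proc/self/limits ]; then
    awk '/^Max address space/ { print $4 }' /proc/self/limits
fi
```

Running the same awk line inside a grid job should show the limit that job actually received.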

I thought the grid documentation on wikitech.wikimedia.org stated that the default memory size for grid jobs is 4 GB. OK, I will try setting the memory size explicitly.

MBH claimed this task.

Yes, the job works fine after I explicitly set memory size to 4 GB.
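
The exact command used is not quoted in this task. A resubmission with an explicit memory request might look like the sketch below, assuming the -mem option of jsub as described in the Toolforge grid documentation. The job name and executable are copied from the description; the variable names are illustrative.

```shell
#!/bin/sh
# Assumed values: job name and program from the task description,
# memory size from the last comment.
JOB_NAME=stats
MEM=4g

# Build the submission command with an explicit memory request.
CMD="jsub -N $JOB_NAME -mem $MEM mono stats.exe"
echo "$CMD"

# Only submit where jsub is actually installed (e.g. a Toolforge login host).
if command -v jsub >/dev/null 2>&1; then
    $CMD
fi
```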