User Details
- User Since
- Oct 13 2014, 10:19 AM (545 w, 2 d)
- Availability
- Available
- IRC Nick
- zhuyifei1999
- LDAP User
- Zhuyifei1999
- MediaWiki User
- Zhuyifei1999 [ Global Accounts ]
Nov 16 2024
Caused by Flickr adding a rate limit. Tracked on-wiki: https://commons.wikimedia.org/w/index.php?title=Commons:Village_pump/Technical&oldid=957845988#User:FlickreviewR_2_appears_to_be_broken
I got an email from 廣九直通車 about this. I'll take a look later today or tomorrow.
I am not sure whether this is related.
May 15 2023
I think the bottom line here is that the pywikibot system as a whole depends too much on external resources (such as user-config.py) which makes the test suite fragile. There really needs to be a way for the test framework to supply a complete configuration without looking at an external configuration file.
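As an illustration, a minimal sketch (assuming a recent pywikibot; PYWIKIBOT_NO_USER_CONFIG is a real environment variable, but the values below are made up) of how the test framework could supply a complete configuration in-process:

import os

# Tell pywikibot not to look for user-config.py at all.
os.environ['PYWIKIBOT_NO_USER_CONFIG'] = '1'

import pywikibot
from pywikibot import config

# Supply everything user-config.py would normally provide.
config.family = 'wikipedia'
config.mylang = 'test'
config.usernames['wikipedia']['test'] = 'ExampleBot'  # hypothetical account

site = pywikibot.Site()  # resolves entirely from the in-process config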
Feb 14 2023
5+ years later, should we make this public?
Feb 10 2023
All deployed via

tools.yifeibot@tools-sgebastion-10:~$ toolforge-jobs load jobs.yaml

https://k8s-status.toolforge.org/namespaces/tool-yifeibot/ looks good to me; gonna see if any jobs start failing when they get scheduled.
Feb 9 2023
Did that. Used a

toolforge-jobs run testshell --command "sleep infinity" --image python3.9

followed by

kubectl exec -it testshell-l5qw2 -- /bin/bash
Jan 8 2023
Also I'm trying to spawn a currently-working job on grid via k8s. Getting No module named 'pkg_resources'. I'm suspecting the venv version is too old and needs a rebuild. Is there a supported way to spawn a shell for a given container image?
A direct conversion from the crontab (which has ~40 entries for various bot jobs) gets a quota error. Though a lot of the scripts have been dead for years, so let me check which ones to keep.
Moved & deleted crontab
Dec 29 2022
(venv) tools.video2commons@tools-sgebastion-10:~$ crontab -l > crontab.bak
(venv) tools.video2commons@tools-sgebastion-10:~$ crontab -l | tail
# Wikimedia Tool Labs specific note:
# Please be aware that *only* jsub and jstart are acceptable
# commands to schedule via cron. Any command specified here will
# be modified to be invoked through jsub unless it is one of
# the two.
#
# m h dom mon dow command
Dec 28 2022
tools.fontcdn@tools-sgebastion-10:~$ crontab -l | tail
# For more information see the manual pages of crontab(5) and cron(8)
#
# Wikimedia Tool Labs specific note:
# Please be aware that *only* jsub and jstart are acceptable
# commands to schedule via cron. Any command specified here will
# be modified to be invoked through jsub unless it is one of
# the two.
#
# m h dom mon dow command
29 5 * * * /usr/bin/jsub -N dl-fontdata -once -quiet bash .dl-fontdata.sh
tools.fontcdn@tools-sgebastion-10:~$ toolforge-jobs run dl-fontdata --command 'bash .dl-fontdata.sh' --image tf-bullseye-std --schedule "26 5 * * *"
tools.fontcdn@tools-sgebastion-10:~$ crontab -r
Bot has been dead for a long time. I just commented out the crontab too.
I'll handle it during this new year break.
Dec 3 2022
Thanks!
Dec 2 2022
Done.
bastion-eqiad1-03.bastion.eqiad1.wikimedia.cloud:/home/zhuyifei1999/2fa-reset-request.txt
Aug 8 2022
Disk image backed up to ~tools.commonsarchive/commonsarchive-vda-20220807.bak.img.gz
Aug 7 2022
Backed up XML dump + files to https://archive.org/download/wiki-commonsarchive
- commonsarchive-mwtest: Test instance, will delete.
- commonsarchive-prod: I'm not interested in maintaining this anymore, but to keep files archived I'll take a disk image for now.
Feb 23 2022
For the record, the disable reason is
Feb 20 2022
The VM was fully unresponsive; even the serial console didn't react at all.
Kernel bug. RCU should not stall.
Nov 17 2021
I think it's trying to create the file but the directory doesn't exist.
Nov 5 2021
(trying to reproduce it again)
Nov 4 2021
Sorry, I think I was working on it last year and then forgot about this ticket. I'll check what I was doing back then.
Oct 17 2021
I think we could apply the patch above (T286415#7203570). The other issue looks like it needs a more complex fix, however.
Aug 22 2021
If this is something "common", it might be helpful to provide a path to directly reupload on failure, skipping the slow transcoding process.
iirc, I mentioned this to @bd808 regarding the chunked uploading issue during hackathon 2019. I don't remember exactly what he said.
FWIW, uploading large files is known to fail 'sometimes'; the larger the file, the higher the likelihood of failure. Pretty sure it's some bug in the chunked-uploading MediaWiki code, since v2c is 'sometimes' able to upload 3+GiB files just fine with the exact same code.
Aug 12 2021
I downloaded it. I currently don't have time to test it, however.
Aug 3 2021
There's never an example involving an mp4 file. The previously given example, File:Contra dancers at the 2019 Flurry Festival.webm, is a webm after transcode. I cannot replicate the transcode from just the result, without the source.
Jul 19 2021
Thanks and sorry for the late response. My primary laptop was in repairs.
Jul 2 2021
I also reduced the 360 days to 180 days to save some space. (now using 703G)
The automated process cleans up old files in /data/scratch/video2commons/uploads/ after 360 days and /data/scratch/video2commons/ssu/ after 60 days. Do you need it shrunk right now or do you mean a rolling cleanup process is okay?
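For reference, that rolling cleanup amounts to something like the following (a hedged sketch with the cutoffs described above, not the actual job):

import os
import time

def cleanup(root, max_age_days):
    """Remove regular files under root older than max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)

cleanup('/data/scratch/video2commons/uploads/', 360)
cleanup('/data/scratch/video2commons/ssu/', 60)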
Jun 16 2021
If you can tell me which flags I should add to ffmpeg I'm happy to add that.
The core file, I'm pretty sure, is related to T283957; deleted. I also truncated the largest logs. It's now at 26G (du -hs). Resolved?
Jun 12 2021
How it finds out whether you can delete: https://github.com/wikimedia/pywikibot/blob/595b2497374e40a990303afc75672144ab18ae5f/pywikibot/page/__init__.py#L1746:
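In rough terms (a hedged sketch, not the linked code verbatim), the decision comes down to whether the logged-in account actually holds the 'delete' right:

import pywikibot

site = pywikibot.Site('commons', 'commons')
page = pywikibot.Page(site, 'File:Example.jpg')  # hypothetical page

# With the 'delete' right, delete() performs a real deletion;
# without it, pywikibot falls back to requesting deletion instead.
if site.has_right('delete'):
    page.delete(reason='example', prompt=False)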
Jun 7 2021
Backed up .bash_history to P16310
Jun 6 2021
Should be good to go unless I missed anything.
08:28:40 0 ✓ zhuyifei1999@tools-sgebastion-08: /data/project/stimmberechtigung$ ls -Alh
total 26G
-rw-r--r-- 1 tools.stimmberechtigung tools.stimmberechtigung 159M Nov  6  2019 access.log
-rw------- 1 tools.stimmberechtigung tools.stimmberechtigung  16K Jun  6 20:23 .bash_history
drwxrwsr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Dec 12  2013 bin
-rw------- 1 tools.stimmberechtigung tools.stimmberechtigung  175 Mar 25  2019 crontab.trusty.save
-rw-rw-r-- 1 tools.stimmberechtigung tools.stimmberechtigung  167 Sep 13  2013 .description
-rw-r--r-- 1 tools.stimmberechtigung tools.stimmberechtigung  40G Jun  6 20:29 error.log
drwxrwsr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Dec 10  2013 include
drwxrwsr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Jul  6  2016 .kube
drwxrwsr-x 3 tools.stimmberechtigung tools.stimmberechtigung 4.0K Dec 10  2013 lib
-rw-rw-r-- 1 tools.stimmberechtigung tools.stimmberechtigung  243 Dec 10  2013 .lighttpd.conf
drwxrwsr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Dec 10  2013 local
drwxrwxr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Dec  6  2016 logs
-rw------- 1 tools.stimmberechtigung tools.stimmberechtigung 1.5K May 30  2019 .mysql_history
drwxrwsr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Dec 12  2013 .pip
drwxrwsr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Jun  2  2019 public_html
-r-------- 1 tools.stimmberechtigung tools.stimmberechtigung   51 Mar  3  2014 replica.my.cnf
-rw-r--r-- 1 tools.stimmberechtigung tools.stimmberechtigung  49K Mar 25 18:45 service.log
-rw-r--r-- 1 tools.stimmberechtigung tools.stimmberechtigung  186 Mar 25 18:45 service.manifest
drwxrwsr-x 3 tools.stimmberechtigung tools.stimmberechtigung 4.0K Jun  2  2019 src
drwx--S--- 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K May  3  2015 .ssh
drwxr-sr-x 3 tools.stimmberechtigung tools.stimmberechtigung 4.0K Jun  2  2019 stimmberechtigung
-rw-r--r-- 1 tools.stimmberechtigung tools.stimmberechtigung   50 Dec  3  2018 .tmux.conf
drwxr-sr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Dec 17  2019 .toolskube
drwxr-sr-x 2 tools.stimmberechtigung tools.stimmberechtigung 4.0K Jun  2  2019 .vim
-rw------- 1 tools.stimmberechtigung tools.stimmberechtigung  16K Jun  2  2019 .viminfo
Apr 16 2021
How do I reproduce this? I can see if I can do a bit of debugging.
Apr 11 2021
tools-sgeexec-0919 seems to be getting lots of OOM kills lately. This one in particular:
Mar 27 2021
Hmm. Is the goal trying to find when a worker gets SIGKILL-ed? Celery does internally detect when a worker dies, as per the logs, but I did not figure out how to hook it so that it would report to the db.
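If it helps as a starting point, one hedged possibility (not something v2c does today) is celery's task_failure signal: billiard raises WorkerLostError when a pool worker vanishes mid-task, and that surfaces as a failure one could record:

from billiard.exceptions import WorkerLostError
from celery import signals

@signals.task_failure.connect
def record_worker_loss(sender=None, task_id=None, exception=None, **kwargs):
    # Fires when billiard notices a pool worker died mid-task,
    # e.g. after being SIGKILLed by the OOM killer.
    if isinstance(exception, WorkerLostError):
        mark_task_failed(task_id, 'worker lost (possibly SIGKILLed)')  # hypothetical db helper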
Nov 23 2020
I'll review the queries too to anonymize if needed and paste the results somewhere.
Nov 11 2020
No, the TFSC checks for secret information before the adopter gets the access.
Nov 7 2020
Mono v5.12.0.226 didn't compile:
Nov 6 2020
Hmm interesting. I'm running 6.10.0.104 and didn't consider version to be an issue. Will test later.
Sorry, took a break. I can't reproduce this locally, for some reason.
Nov 3 2020
My local laptop is a 4-core 8-thread machine, but I fail to get that many threads. I tried to trace where the threads start:
Do you have minimal code that starts a thread pool and pauses indefinitely? I can do a debug build of mono on my laptop and see what is going on.
FWIW, (ASLR offsets redacted since the process is kept running)
Nov 2 2020
Ah ok
ThreadPool.GetMinThreads(out workerThreads, out completionPortThreads);
ThreadPool.SetMaxThreads(workerThreads * 2, completionPortThreads * 2);
(btw, k8s's memory limit is on RSS, using cgroups, IIRC)
Correct. The grid stubbornly enforces virtual memory limits because enforcing RSS isn't as trivial: the former is just an rlimit, whereas the latter needs cgroups. I really hate how it works, but it is how it is. :/
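To illustrate the difference (a sketch, not Toolforge code): the grid-style cap is a single rlimit on the address space, trivially applied per-process, while an RSS cap has no rlimit equivalent and needs cgroup accounting:

import resource

# Cap the virtual address space at 2 GiB: mmap/malloc beyond this fails
# with ENOMEM even if resident memory stays far lower.
limit = 2 * 1024 ** 3
resource.setrlimit(resource.RLIMIT_AS, (limit, limit))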
Yeah, if that many threads start, they get different glibc heaps, and that would not be efficient with the address space...
#if defined (HOST_ANDROID) || defined (HOST_IOS)
worker.limit_worker_max = CLAMP (threads_count * 100, MIN (threads_count, 200), MAX (threads_count, 200));
#else
worker.limit_worker_max = threads_count * 100;
#endif
What does ThreadPool.GetMaxThreads return?
Hmm. I wonder if I can get the stack trace of the mono frames
I accidentally killed the process while gdb-ing, but am now gdb-ing the core dump:
(gdb) info threads
  Id   Target Id         Frame
* 1    Thread 0x2b4c7e0c0680 (LWP 21380) 0x00002b4c7e9c217f in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  2    Thread 0x2b4c7f3e8700 (LWP 21382) 0x00002b4c7e9c217f in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  3    Thread 0x2b4c8173b700 (LWP 21385) 0x00002b4c7e9c4556 in do_futex_wait.constprop () from /lib/x86_64-linux-gnu/libpthread.so.0
  4    Thread 0x2b4c8244c700 (LWP 21546) 0x00002b4c7e9c2528 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  5    Thread 0x2b4c8294e700 (LWP 21550) 0x000055d6f177f5a0 in ?? ()
  6    Thread 0x2b4c82b4f700 (LWP 21551) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  7    Thread 0x2b4c8395f700 (LWP 21554) 0x00002b4c7eec88bd in poll () from /lib/x86_64-linux-gnu/libc.so.6
  8    Thread 0x2b4c83b60700 (LWP 21555) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  9    Thread 0x2b4c83d61700 (LWP 21556) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  10   Thread 0x2b4c83f62700 (LWP 21561) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  11   Thread 0x2b4ca457c700 (LWP 21562) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  12   Thread 0x2b4ca477d700 (LWP 21563) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  13   Thread 0x2b4ca7380700 (LWP 21702) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  14   Thread 0x2b4ca7581700 (LWP 21875) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  15   Thread 0x2b4ca7100700 (LWP 21968) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  16   Thread 0x2b4ca7d00700 (LWP 22015) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  17   Thread 0x2b4ca7f84700 (LWP 22085) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  18   Thread 0x2b4cc4700700 (LWP 22111) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  19   Thread 0x2b4cc4901700 (LWP 22112) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  20   Thread 0x2b4cc4b02700 (LWP 22113) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  21   Thread 0x2b4cc4d03700 (LWP 22122) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  22   Thread 0x2b4cde600700 (LWP 24038) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  23   Thread 0x2b4cdeb00700 (LWP 24039) 0x00002b4c7e9c4720 in do_futex_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
  24   Thread 0x2b4c8274d700 (LWP 24085) 0x00002b4c7e9c2528 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
This was when I generated the core:
$ cat tmp | awk '{ split($1, a, "-"); printf "%s %08x %s %s\n", a[1], strtonum("0x"a[2]) - strtonum("0x"a[1]), $2, $6 }' | sort -k 2 | less -N:
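For readers not fluent in awk, the same transformation in Python (assuming tmp is a saved copy of the process's /proc/<pid>/maps): print each mapping's start address, size, permissions and backing path, ordered by size:

rows = []
with open('tmp') as maps:
    for line in maps:
        fields = line.split()
        start, end = (int(x, 16) for x in fields[0].split('-'))
        path = fields[5] if len(fields) > 5 else ''
        rows.append((end - start, fields[0].split('-')[0], fields[1], path))

for size, start, perms, path in sorted(rows):
    print(f'{start} {size:08x} {perms} {path}')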
Oct 28 2020
I have added catch (OutOfMemoryException e) to GZipUnpack with Syscall.kill(currentPID, Signum.SIGSTOP); as you said.
And lowered -mem to 1536m.
If my changes are correct (and the OOM happens inside GZipUnpack as usual), then at some point the process will stop itself after the OOM is triggered.
Honestly, this just reflects
That may be a large number. How many cores do the grid computers have?
+1 try: big chunk of errors:
128MiB glibc arena
tools.wikitasks@tools-sgebastion-08:~/wp_cyrlat$ gdb --args mono WikiTasks.exe
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from mono...(no debugging symbols found)...done.
(gdb) catch syscall mmap
Catchpoint 1 (syscall 'mmap' [9])
(gdb) commands
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>silent
>if $rsi < 50 * 1048576
>c
>end
>end
(gdb) r
Starting program: /usr/bin/mono WikiTasks.exe
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff67ff700 (LWP 20010)]
[New Thread 0x7ffff44a4700 (LWP 20011)]
[Switching to Thread 0x7ffff44a4700 (LWP 20011)]
(gdb) bt
#0  0x00007ffff6fda64a in __mmap (addr=addr@entry=0x0, len=len@entry=134217728, prot=prot@entry=0, flags=flags@entry=16418, fd=fd@entry=-1, offset=offset@entry=0) at ../sysdeps/unix/sysv/linux/wordsize-64/mmap.c:34
#1  0x00007ffff6f6cad9 in new_heap (size=135168, top_pad=<optimized out>) at arena.c:437
#2  0x00007ffff6f708a6 in _int_new_arena (size=<optimized out>) at arena.c:643
#3  arena_get2 (size=size@entry=1656, avoid_arena=avoid_arena@entry=0x0) at arena.c:875
#4  0x00007ffff6f71c32 in arena_get2 (avoid_arena=0x0, size=1656) at malloc.c:3300
#5  __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3246
#6  0x000055555585456b in monoeg_g_calloc ()
#7  0x000055555584ae1c in ?? ()
#8  0x0000555555785415 in ?? ()
#9  0x00007ffff74b34a4 in start_thread (arg=0x7ffff44a4700) at pthread_create.c:456
#10 0x00007ffff6fded0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
As for the hang, you are hitting T195834:
Oct 27 2020
The large maps are basically zeros as I scroll through with xxd. Both of the maps have a 132KiB read-writable map immediately preceding them, but I don't find any readable strings inside, so I can't be sure what they are actually for or what allocated them just by looking at the core dump.
11:21:24 0 ✓ zhuyifei1999@tools-sgeexec-0906: ~$ sudo gdb -p 13609 -batch -ex 'generate-core-file ~tools.wikitasks/T266377.core'
[New LWP 13610]
[New LWP 13611]
[New LWP 13612]
[New LWP 13613]
[New LWP 13614]
[New LWP 13615]
[New LWP 13616]
[New LWP 13617]
[New LWP 13618]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00002b362ae0d17f in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
warning: target file /proc/13609/cmdline contained unexpected null characters
Saved corefile /data/project/wikitasks/T266377.core
11:29:53 0 ✓ zhuyifei1999@tools-sgeexec-0906: ~$ ls -l /data/project/wikitasks/T266377.core
-rw-r--r-- 1 root tools.wikitasks 208923592 Oct 27 23:29 /data/project/wikitasks/T266377.core
11:30:16 0 ✓ zhuyifei1999@tools-sgeexec-0906: ~$ sudo chmod 640 /data/project/wikitasks/T266377.core
11:30:41 0 ✓ zhuyifei1999@tools-sgeexec-0906: ~$ ls -l /data/project/wikitasks/T266377.core
-rw-r----- 1 root tools.wikitasks 208923592 Oct 27 23:29 /data/project/wikitasks/T266377.core