Sun, Mar 29
Sat, Mar 28
Hence this project focuses on building such a tool, named “WikiCommons Image Verification Tool”.
Probably the most practical thing to do would be to apply this to the beta cluster to test it.
Ah I see. Wrong path.
@Bstorm @herron I can't see T175964, but I'm pretty sure https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/379239/ somehow broke it.
root@tools-mail-02:~# puppet agent -tv
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for tools-mail-02.tools.eqiad.wmflabs
Notice: /Stage[main]/Base::Environment/Tidy[/var/tmp/core]: Tidying 0 files
Info: Applying configuration version '(1a925f799b) Bstorm - Add rate limiting to profile::toolforge::mailrelay with warn action'
Notice: The LDAP client stack for this host is: classic/sudoldap
Notice: /Stage[main]/Profile::Ldap::Client::Labs/Notify[LDAP client stack]/message: defined 'message' as 'The LDAP client stack for this host is: classic/sudoldap'
Error: /Stage[main]/Profile::Toolforge::Mailrelay/File[/etc/exim4/ratelimits/sender_hourly_limits]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/profile/toolforge/mailrelay/ratelimits/sender_hourly_limits
Error: /Stage[main]/Profile::Toolforge::Mailrelay/File[/etc/exim4/ratelimits/host_hourly_limits]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/profile/toolforge/mailrelay/ratelimits/host_hourly_limits
Notice: /Stage[main]/Profile::Toolforge::Mailrelay/Letsencrypt::Cert::Integrated[tools_mail]/Exec[acme-setup-acme-tools_mail]/returns: executed successfully
Info: Class[Profile::Toolforge::Mailrelay]: Unscheduling all events on Class[Profile::Toolforge::Mailrelay]
Info: Stage[main]: Unscheduling all events on Stage[main]
Notice: Applied catalog in 12.70 seconds
failed to open /etc/exim4/ratelimits/host_hourly_limits for linear search: No such file or directory
I see lots of failures that look like a config issue in the exim4 mainlog:
DNS looks fine:
Thu, Mar 26
I'm not able to tell whether the graphs are unnatural -- I mean, parsing, which is what it is spending a lot of time on, is an expensive problem. Perhaps @Earwig can tell whether the graphs look expected? If so, then there is probably little I can do other than say there is just not enough CPU power to serve the requests in a timely manner under the CPU-limiting cgroups on k8s.
60 second, 25 samples per second profile result for both processes:
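For context, a wall-clock sampling profiler of this kind just snapshots the stack at a fixed rate and counts what it sees. A minimal pure-Python sketch (illustrative only -- not the actual tool used above, and `busy_loop` is a hypothetical workload):

```python
import collections
import sys
import threading
import time

def sample_stacks(target_tid, duration=0.5, hz=25):
    """Sample the stack of thread `target_tid` at `hz` samples/second
    and count which function is on top of the stack each time."""
    counts = collections.Counter()
    deadline = time.time() + duration
    while time.time() < deadline:
        # sys._current_frames() maps thread id -> that thread's current frame
        frame = sys._current_frames().get(target_tid)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(1.0 / hz)
    return counts

def busy_loop(stop):
    # Hypothetical CPU-bound workload standing in for the real worker
    while not stop.is_set():
        sum(i * i for i in range(1000))
```

At 25 Hz for 60 seconds this yields roughly 1500 snapshots, so hot functions dominate the counts even with a low sampling rate.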
strace: full of futex(2)...
Thu Mar 26 04:40:44 2020 - *** uWSGI listen queue of socket ":8000" (fd: 7) full !!! (101/100) ***
Fri, Mar 20
How does this project relate to Wikimedia?
Thu, Mar 19
Is it really slow even when it's not heavily loaded? In that case, if you could produce a test case, I'll see if I can find what is taking the time.
Tue, Mar 17
a simple request system at https://www.mediawiki.org/wiki/Talk:Quarry
@zhuyifei1999 Perhaps there could be some sort of a trusted user set on quarry that can run things for longer?
Sat, Mar 14
Fri, Mar 13
The query was executing for too long then.
Thu, Mar 12
Wed, Mar 11
Sorry, I was extremely busy the last two weeks. I think if it's a bug it should stay open. I'll work on it next week.
Sun, Mar 8
zhuyifei1999 [at] gmail [dot] com
Sat, Mar 7
I don't see quarry's killer doing anything. The last command at T246970#5946798 still yields nothing.
Can confirm that there are lots of messages like this in the logs today:
Fri, Mar 6
Hmm, I can see that quarry is indeed running on web rather than analytics.
Thu, Mar 5
This isn't just a quarry issue: (logs)
Wed, Mar 4
However, we can consider adding some caching for our cdnjs proxy for better response times. @Bstorm, thoughts?
Feb 28 2020
Whichever is simplest.
I will be running the script with pdb and saving all sseclient traces over the weekend.
Feb 27 2020
Can't reproduce. Probably related to NFS maintenance a few hours ago.
Feb 26 2020
I don't see any local crats, and nom has no user groups on zhwp.
Yes, what I was saying was: the first and the third are two separate consumers, so events on the first should also be received on the third. If there were something fundamentally wrong with the event data, then both would crash. Since this is not the case, there is nothing fundamentally wrong with the event data, and therefore the error must be elsewhere, such as in transmission / decoding, which leads to the linked bug report / PR.
Feb 25 2020
So while the event data are loaded from JSON, hex-escaping non-ASCII characters is optional:
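To illustrate the point with a generic sketch of Python's json behavior (not the tool's actual code; the event payload here is made up): the serializer decides whether non-ASCII characters are \uXXXX-escaped, and both forms decode to identical data, so a consumer cannot rely on one escaping style being used on the wire.

```python
import json

event = {"title": "Zürich"}  # hypothetical event payload with non-ASCII text

escaped = json.dumps(event)                  # default: ensure_ascii=True
raw = json.dumps(event, ensure_ascii=False)  # keeps the raw non-ASCII text

print(escaped)  # {"title": "Z\u00fcrich"}
print(raw)      # {"title": "Zürich"}

# Both decode back to the same data structure.
assert json.loads(escaped) == json.loads(raw) == event
```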
Would you mind posting the code of the 'minimal test case' somewhere?
Feb 24 2020
I see. Thanks for explaining.
...The work of process 5885 is done. Seeya!
worker 2 killed successfully (pid: 5885)
Respawned uWSGI worker 2 (new pid: 5911)
I guess a flame graph is the way to go then. Gotta find the bottleneck.
perhaps because their requests are taking too long to complete
Feb 23 2020
Though, I don't see the uWSGI listen queue of socket ":8000" (fd: 7) full messages mentioned earlier.
I see this in the uwsgi.log:
This is the other process.
This is the other process.
@Earwig I'm unfamiliar with the code base. Getting a flame graph profiler will probably be a massive PITA. Do you see anything odd in the backtraces?
Finally. Reliable backtrace:
Can't get the frame object for quite a few frames by enumerating the registers. I can probably use PyFrameObject's f_back as a linked list but that is not something libpython.py gdb script supports. Time for custom code :/
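For illustration: Python frames really do form a singly linked list through f_back, which is the structure a custom gdb script would have to walk. The same traversal in pure Python (a sketch of the concept, not the gdb-side code):

```python
import sys

def walk_frames():
    """Collect function names by following the f_back linked list
    from the innermost frame outward."""
    names = []
    frame = sys._getframe()
    while frame is not None:  # the outermost frame's f_back is None
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return names

def outer():
    return walk_frames()

print(outer())  # innermost first: ['walk_frames', 'outer', ...]
```

In gdb the equivalent walk would dereference the PyFrameObject's f_back pointer instead of using sys._getframe.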
(gdb) f
#8  PyEval_EvalFrameEx (f=< at remote 0x1b9fc10>, throwflag=1224238336) at ../Python/ceval.c:2679
2679    in ../Python/ceval.c
(gdb) info reg
rax            0xfffffffffffffe00   -512
rbx            0x7ff048f86500       140669993116928
rcx            0x7ff04f093010       140670094880784
rdx            0x0                  0
rsi            0x80                 128
rdi            0x1c36570            29582704
rbp            0x7ff0495fbfac       0x7ff0495fbfac
rsp            0x7fff99b48670       0x7fff99b48670
r8             0x3                  3
r9             0x7ff01aff8bf8       140669221833720
r10            0x0                  0
r11            0x246                582
r12            0x1b9fc10            28965904
r13            0x7ff019865f30       140669197115184
r14            0x3                  3
r15            0x7ff03c007ca0       140669775543456
rip            0x7ff04ba3227d       0x7ff04ba3227d <PyEval_EvalFrameEx+22749>
eflags         0x246                [ PF ZF IF ]
cs             0x33                 51
ss             0x2b                 43
ds             0x0                  0
es             0x0                  0
fs             0x0                  0
gs             0x0                  0
(gdb) p (PyObject *)0x7ff03c007ca0
$11 = Frame 0x7ff03c007ca0, for file /data/project/copyvios/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 189, in check (self=<ExclusionsDB(_dbfile='.earwigbot/exclusions.db', _logger=<Logger(name='earwigbot.wiki.exclusionsdb', parent=<Logger(name='earwigbot.wiki', parent=<Logger(name='earwigbot', parent=<RootLogger(name='root', parent=None, handlers=, level=30, disabled=0, propagate=1, filters=) at remote 0x7ff04a4c2a10>, handlers=[<TimedRotatingFileHandler(utc=False, interval=86400, backupCount=7, suffix='%Y-%m-%d', stream=<file at remote 0x7ff045b11f60>, encoding=None, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7ff045983450>, _RLock__count=0) at remote 0x7ff045a8f290>, level=20, when='MIDNIGHT', _name=None, delay=False, rolloverAt=1582502400, baseFilename='/data/project/copyvios/git/copyvios/.earwigbot/logs/bot.log', mode='a', filters=, extMatch=<_sre.SRE_Pattern at remote 0x7ff0459aac90>, formatter=<BotFormatter(_format=<instancemethod at r...(truncated)
(gdb) bt
#0  sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
#1  0x00007ff04b9a5068 in PyThread_acquire_lock (lock=lock@entry=0x1c36570, waitflag=waitflag@entry=1) at ../Python/thread_pthread.h:324
#2  0x00007ff04ba2c3e6 in PyEval_RestoreThread (tstate=0x1b9fc10) at ../Python/ceval.c:357
#3  0x00007ff04997a7d2 in _pysqlite_fetch_one_row () from /data/project/copyvios/www/python/venv/lib/python2.7/lib-dynload/_sqlite3.x86_64-linux-gnu.so
#4  0x00007ff04997a631 in pysqlite_cursor_iternext () from /data/project/copyvios/www/python/venv/lib/python2.7/lib-dynload/_sqlite3.x86_64-linux-gnu.so
#5  0x00007ff04ba2d93e in PyEval_EvalFrameEx (f=<p at remote 0x7ff03c007e88>, throwflag=1270237784) at ../Python/ceval.c:2510
#6  0x00007ff04ba3227d in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at ../Python/ceval.c:4119
#7  call_function (oparg=<optimized out>, pp_stack=<optimized out>) at ../Python/ceval.c:4054
#8  PyEval_EvalFrameEx (f=< at remote 0x1b9fc10>, throwflag=1224238336) at ../Python/ceval.c:2679
#9  0x00007ff04baa5190 in PyEval_EvalCodeEx (co=0x7ff0496022b0, globals=<unknown at remote 0x80>, locals=0x0, args=0x7ff0341060d8, argcount=3, kws=0x7ff01aff8bf8, kwcount=0, defs=0x0, defcount=0, closure=(<cell at remote 0x7ff01ad2ac58>,)) at ../Python/ceval.c:3265
#10 0x00007ff04ba32171 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at ../Python/ceval.c:4129
#11 call_function (oparg=<optimized out>, pp_stack=<optimized out>) at ../Python/ceval.c:4054
#12 PyEval_EvalFrameEx (f=<unknown at remote 0x7ff0341060e0>, throwflag=430033928) at ../Python/ceval.c:2679
#13 0x00007ff04baa5190 in PyEval_EvalCodeEx (co=0x7ff0492e0930, globals=<unknown at remote 0x80>, locals=0x0, args=0x1e13d60, argcount=3, kws=0x7ff01aff8bf8, kwcount=0, defs=0x7ff0492eaca8, defcount=1, closure=0x0) at ../Python/ceval.c:3265
#14 0x00007ff04ba32171 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=<optimized out>, func=<optimized out>) at ../Python/ceval.c:4129
#15 call_function (oparg=<optimized out>, pp_stack=<optimized out>) at ../Python/ceval.c:4054
Can confirm. A lot of CPU is being used. gdb is taking forever to attach for some reason.
rcworker.py would just be reduced to a few lines
(Waiting for failure)
Feb 22 2020
What is the traceback?
Since there are mentions of some sockets above, I checked if the fds look sane. Installed lsof and strace inside the pod, and:
Is this concerning the URL https://tools.wmflabs.org/copyvios/? If so, it is not giving me 504 right now. Mind pinging when it does?
screen: Ctrl-A D
ssh: Enter ~ .
docker: Ctrl-P Ctrl-Q
Feb 19 2020
Have you thought of using https://wikitech.wikimedia.org/wiki/Nova_Resource:Video ?
Feb 18 2020
Yeah, thanks. I 'fixed' it by deleting the apicache directory.
Looks like the API cache has the namespace dict.
Looks like the localization of namespace names is broken:
Feb 15 2020
I think too long is fine because it will soon be unsupported. Annoyance is sometimes a good thing :)
Feb 2 2020
I can do some profiling. Do you have the data that is passed into the service right before it fails?
Jan 31 2020
for x in range(sumdelegates['head']):
    poslist['head'].append(
        [5.0+blocksize*(x+optionlist['spacing']/2), centertop])
# Cross-bench parties are 5 from the edge, vertically centered:
for x in range(optionlist['centercols']):
    # How many rows in this column of the cross-bench
    thiscol = int(min(centerrows, sumdelegates['center']-x*centerrows))
    for y in range(thiscol):
        poslist['center'].append(
            [svgwidth-5.0-(optionlist['centercols']-x-optionlist['spacing']/2) * blocksize,
             ((svgheight-thiscol*blocksize)/2)+blocksize*(y+optionlist['spacing']/2)])
poslist['center'].sort(key=lambda point: point)
# Left parties are in the top block:
for x in range(wingcols):
    for y in range(optionlist['wingrows']['left']):
        poslist['left'].append(
            [5+(leftoffset+x+optionlist['spacing']/2)*blocksize,
             centertop-(1.5+y)*blocksize])
# Right parties are in the bottom block:
for x in range(wingcols):
    for y in range(optionlist['wingrows']['right']):
        poslist['right'].append(
            [5+(leftoffset+x+optionlist['spacing']/2)*blocksize,
             centertop+(1.5+y)*blocksize])
So you mean read accesses should ignore maxlag? hmm
maxvmem is above 4GiB and h_vmem=4G, so yes, this is killed by the virtual memory (VMS) limit being exceeded. Are you mapping files into memory?
sge_status(5): failed status 37 means qmaster enforced h_rt, h_cpu, or h_vmem limit
Checking last entry, adapting code from https://phabricator.wikimedia.org/source/tool-grid-jobs/browse/master/grid_jobs/__init__.py$27:
10:18:08 0 ✓ zhuyifei1999@tools-sgebastion-09: ~$ grep parliamentdiagram /data/project/.system_sge/gridengine/default/common/accounting
task:tools-sgeexec-0933.tools.eqiad.wmflabs:tools.parliamentdiagram:tools.parliamentdiagram:cron-tools.parliamentdiagram-1:143498:sge:0:1580021582:1580021592:1580021593:0:0:1:0.004000:0.012000:3852.000000:0:0:0:0:983:0:0:8.000000:8:0:0:0:457:10:NONE:defaultdepartment:NONE:1:0:0.016000:0.000000:0.000000:-q task -l h_vmem=524288k:0.000000:NONE:0.000000:0:0
[...]
webgrid-lighttpd:tools-sgewebgrid-lighttpd-0916.tools.eqiad.wmflabs:tools.parliamentdiagram:tools.parliamentdiagram:lighttpd-parliamentdiagram:460252:sge:0:1579694757:1579694764:1580077607:37:0:382843:0.016000:0.004000:6456.000000:0:0:0:0:857:28:0:6280.000000:0:0:0:0:72:9:NONE:defaultdepartment:NONE:1:0:191.790000:72.662859:9.654144:-q webgrid-lighttpd -l h_vmem=4G:0.000000:NONE:4779155456.000000:0:0
[...]
webgrid-lighttpd:tools-sgewebgrid-lighttpd-0912.tools.eqiad.wmflabs:tools.parliamentdiagram:tools.parliamentdiagram:lighttpd-parliamentdiagram:237421:sge:0:1580161168:1580161182:1580401877:37:0:240695:0.020000:0.004000:6348.000000:0:0:0:0:863:2:0:24.000000:8:0:0:0:10:8:NONE:defaultdepartment:NONE:1:0:1641.300000:34.918114:4.651640:-q webgrid-lighttpd -l h_vmem=4G:0.000000:NONE:4420464640.000000:0:0
[...]
webgrid-lighttpd:tools-sgewebgrid-lighttpd-0922.tools.eqiad.wmflabs:tools.parliamentdiagram:tools.parliamentdiagram:lighttpd-parliamentdiagram:391323:sge:0:1580412371:1580412372:1580474692:37:0:62320:0.068000:0.036000:6216.000000:0:0:0:0:820:0:0:0.000000:0:0:0:0:9:15:NONE:defaultdepartment:NONE:1:0:239.950000:233.663917:0.889195:-q webgrid-lighttpd -l h_vmem=4G:0.000000:NONE:4334555136.000000:0:0
[...]
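These records are colon-separated; a small sketch pulls out the failed code and maxvmem from one of the records above. The field positions follow the SGE accounting(5) layout (failed is the 12th field, maxvmem the 43rd, 1-based) -- treat the exact indices as an assumption:

```python
# One webgrid-lighttpd record from the grep output above.
record = (
    "webgrid-lighttpd:tools-sgewebgrid-lighttpd-0916.tools.eqiad.wmflabs:"
    "tools.parliamentdiagram:tools.parliamentdiagram:lighttpd-parliamentdiagram:"
    "460252:sge:0:1579694757:1579694764:1580077607:37:0:382843:0.016000:"
    "0.004000:6456.000000:0:0:0:0:857:28:0:6280.000000:0:0:0:0:72:9:NONE:"
    "defaultdepartment:NONE:1:0:191.790000:72.662859:9.654144:"
    "-q webgrid-lighttpd -l h_vmem=4G:0.000000:NONE:4779155456.000000:0:0"
)

fields = record.split(":")
failed = int(fields[11])     # 37 = qmaster enforced an h_rt/h_cpu/h_vmem limit
maxvmem = float(fields[42])  # peak virtual memory, in bytes

print(failed, maxvmem / 2**30)  # maxvmem is ~4.45 GiB, above h_vmem=4G
```

This is the same failed=37 / maxvmem>4G reading described in the comments above, just made mechanical.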
So this is https://www.mediawiki.org/wiki/Manual:Maxlag_parameter. I guess in theory you could increase it to make it more aggressive, but I think it's better to get the lag fixed.
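For reference, a client honoring maxlag sends it as a request parameter and backs off when the API returns a "maxlag" error. A hedged sketch of the response-handling side only (no network; the helper name is mine, and the error shape follows the Manual:Maxlag_parameter docs):

```python
def maxlag_backoff(response, default_wait=5):
    """Given a decoded MediaWiki API JSON response, return how many
    seconds to wait before retrying, or 0 if no maxlag error occurred.
    Sketch only: a real client should prefer the Retry-After header."""
    error = response.get("error", {})
    if error.get("code") != "maxlag":
        return 0
    # The error body reports the current replication lag in 'lag' (seconds)
    return max(error.get("lag", default_wait), default_wait)

# Example error shape per the maxlag documentation (values made up):
lagged = {"error": {"code": "maxlag",
                    "info": "Waiting for a database server: 6 seconds lagged",
                    "lag": 6}}
ok = {"query": {}}

print(maxlag_backoff(lagged))  # 6
print(maxlag_backoff(ok))      # 0
```

Raising the maxlag value means the client tolerates more replication lag before backing off, which is why a higher setting makes a bot "more aggressive".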
Do you have some pywikibot-only test case?
Yes. You need to file a ticket for that. https://wikitech.wikimedia.org/wiki/Help:Shared_storage