
Some yifeibot tasks seem to hang indefinitely
Closed, Resolved, Public

Description

170586 	gap-PQG8N1-sLfeV5Q 	yifeibot 	Task / Running 	2016-11-21 09:21:49 	4h46m 	806/0
466481 	gap-_gE8ytdUtiwLCw 	yifeibot 	Task / Running 	2016-11-28 21:56:18 	1h8m 	879/0
170570 	gap-HgGOftlBH3Lkvg 	yifeibot 	Task / Running 	2016-11-21 09:20:33 	2h58m 	795/0
466422 	gap-CAE0ecKTgAn34Q 	yifeibot 	Task / Running 	2016-11-28 21:54:04 	44m17s 	994/0

I noticed these jobs while working on T151980 and draining jobs from exec nodes. A brief look at the code makes me think that these jobs are supposed to download a single image from Google Art Project and then exit. It looks like under some circumstances the download can hang indefinitely and leave the job sitting on the grid.
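One common way to keep a stuck download from occupying a grid slot forever is to run the worker command under a hard deadline. The following is a minimal sketch, not the yifeibot code: the helper name `run_with_deadline` and the one-hour limit are assumptions chosen for illustration, and the real command line of the job is not shown in this task.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_s):
    """Run cmd and return its exit code, or None if it ran past timeout_s.

    subprocess.run() kills the child and waits for it when the timeout
    expires, so a hung download cannot keep the job alive indefinitely.
    Processes spawned by the child itself are not killed by this.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical stand-in for the real download command.
    demo_cmd = [sys.executable, "-c", "print('download finished')"]
    result = run_with_deadline(demo_cmd, timeout_s=3600)
    if result is None:
        print("download exceeded the deadline and was killed")
    else:
        print("download exited with code", result)
```

A wrapper like this only bounds the run time of the direct child. If the command launches further helper processes, those would need their own handling, for example a separate process group.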

Event Timeline

bd808 created this task. Nov 30 2016, 10:45 PM
Restricted Application added a subscriber: Aklapper. Nov 30 2016, 10:45 PM

I killed the first two jobs so I could finish decommissioning the host they were running on.

The script broke a long time ago and is being replaced by T147013. I wasn't aware that there were still tasks running. I'll shut down the web UI soon.

zhuyifei1999 added a comment. Edited Dec 1 2016, 12:49 AM

(Oh, and slimerjs, which this script depends on, seems to have a strong tendency to segfault on Tool Labs trusty nodes for unknown reasons. Both precise and jessie work, but the internal logic of the script is outdated.)

zhuyifei1999 closed this task as Resolved. Dec 1 2016, 2:55 PM
zhuyifei1999 claimed this task.

I killed the rest and disabled web job submission. (The code is really ugly :( )