The access.log files for tools have not shown any new lines since yesterday. (Related to the NFS outage?)
Description
Event Timeline
Possibly. The error.log file descriptors are still functional, but the access.log descriptor points at a file that is marked as deleted.
tools.gerrit-reviewer-bot@tools-webgrid-lighttpd-1402:~$ ls -l /proc/8835/fd
(...)
l-wx------ 1 tools.gerrit-reviewer-bot tools.gerrit-reviewer-bot 64 Aug 14 15:39 2 -> /data/project/gerrit-reviewer-bot/error.log
l-wx------ 1 tools.gerrit-reviewer-bot tools.gerrit-reviewer-bot 64 Aug 14 15:39 3 -> /data/project/gerrit-reviewer-bot/error.log
(...)
l-wx------ 1 tools.gerrit-reviewer-bot tools.gerrit-reviewer-bot 64 Aug 14 15:39 5 -> /data/project/gerrit-reviewer-bot/access.log (deleted)
Restarting the webservice solves this issue, so we should probably reschedule all webservice tasks.
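To check other web services for the same symptom without reading the ls output by eye, something like the following might help. It is only a sketch: the function name list_deleted_fds is made up for illustration, and it assumes a Linux /proc layout where the link target of a removed file ends in " (deleted)", as in the output above.

```shell
# list_deleted_fds FD_DIR
# Print every file descriptor link in FD_DIR (for example /proc/8835/fd)
# whose target ends in " (deleted)". Hypothetical helper, not an existing tool.
list_deleted_fds() {
    fd_dir=$1
    for link in "$fd_dir"/*; do
        # Skip anything that is not a symlink (also covers an unmatched glob).
        [ -L "$link" ] || continue
        target=$(readlink "$link") || continue
        case $target in
            *" (deleted)")
                # Print "<fd number> -> <target>".
                printf '%s -> %s\n' "${link##*/}" "$target"
                ;;
        esac
    done
}

# Example: inspect the current shell's own descriptors.
list_deleted_fds "/proc/$$/fd"
```

Run as the tool user with the lighttpd PID in place of $$, a line such as "5 -> .../access.log (deleted)" would indicate that the web service needs a restart.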
And looking at your work in the past months, you probably already have a command that lists all web service jobs started before $time? :-)
Sort of.
qstat -f -xml | grep 'tools-webgrid' | sed -e 's/.*@//' | sed -e 's/<.*//' > webgrid_hosts
qhost -j -h `cat webgrid_hosts` | sed -e 's/^\s*//' | cut -d ' ' -f 1 | egrep ^[0-9] > webgrid_jobs
sort webgrid_jobs > webgrid_jobs_sorted # spreads affected hosts a bit
for i in `cat webgrid_jobs_sorted`; do qmod -rj $i; sleep 5; done
This will take approximately 40 minutes.
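For anyone repeating this later, the same steps could be wrapped in small functions with a dry-run mode, roughly as sketched below. The function names and the DRY_RUN and DELAY variables are invented here, and the filter is slightly stricter than the egrep above: it keeps only first fields that consist entirely of digits and drops duplicates. The duration estimate is just job count times delay; 480 jobs at 5 seconds each is an illustrative figure that happens to give 40 minutes, not a count taken from the grid.

```shell
# extract_job_ids: read "qhost -j"-style text on stdin and print the
# unique, all-numeric first fields (assumed to be job ids), sorted as text.
extract_job_ids() {
    awk '$1 ~ /^[0-9]+$/ { print $1 }' | sort -u
}

# estimate_minutes JOB_COUNT DELAY_SECONDS
# Rough run time of the loop in minutes, rounded up.
estimate_minutes() {
    echo $(( ($1 * $2 + 59) / 60 ))
}

# reschedule_jobs: read job ids on stdin. With DRY_RUN=1 (the default
# in this sketch) only print the qmod commands instead of running them.
reschedule_jobs() {
    delay=${DELAY:-5}
    while read -r job_id; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "qmod -rj $job_id"
        else
            qmod -rj "$job_id"
            sleep "$delay"
        fi
    done
}

# Possible usage (hosts file as produced by the qstat pipeline above):
#   qhost -j -h `cat webgrid_hosts` | extract_job_ids > webgrid_jobs_sorted
#   estimate_minutes "$(wc -l < webgrid_jobs_sorted)" 5
#   DRY_RUN=0 reschedule_jobs < webgrid_jobs_sorted
```

Defaulting to dry-run means a first pass only shows which jobs would be rescheduled, which seems safer when the list covers every web service on the grid.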