Jobs that recently exited can't be looked up via `qstat -j #JOB`, which contradicts the behavior documented at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Grid#Returning_the_status_of_a_particular_job.
Notes from IRC:
15:13:03 <bd808> legoktm: did the job exit recently? We rotate the data file for that state lookup
15:13:44 <legoktm> bd808: it exited probably within ~10 min of me trying to look at the exit status
15:13:56 <bd808> hmm
15:14:39 <bd808> sigint almost always means OOM
15:14:54 <legoktm> bd808: 7142037 exited about a minute ago and doesn't exist according to `qstat -j`
15:15:44 <legoktm> yeah, I guessed as much. At this point the `qstat -j` questions are more to figure out whether the documentation is out of date or something isn't working as expected
15:16:16 <bd808> `qstat -j '*'` has stuff, but not nearly as much as I would expect
15:17:21 <bd808> it should list all the things that you can see at https://sge-status.toolforge.org/
15:17:33 <bd808> and it pretty obviously does not
15:18:43 <bd808> hmm.. or does it
15:18:59 <bd808> /usr/bin/qstat -j '*' | grep job_number|wc -l == 757 jobs
15:25:53 <bd808> legoktm: I am not sure why, but qstat seems to only show running jobs even when looked up by id and not any historical jobs
15:26:24 <bd808> we haven't done anything purposefully to change the grid for a long time
15:26:32 <legoktm> should I file a bug?
15:27:06 <bd808> I wonder if tracking historic jobs got messed up by nfs restarts or something?
15:27:10 <bd808> legoktm: sure
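
For anyone trying to reproduce this, here is a rough sketch of the check bd808 ran above, wrapped in two small helpers. The helper names `count_jobs` and `job_listed` are made up for illustration, and the match on `job_number:` followed by whitespace and the ID is an assumption about the `qstat -j` output format, so it may need adjusting.

```shell
# Hypothetical helpers (not part of the grid tooling); both read
# `qstat -j '*'`-style output on stdin.

# Count lines mentioning job_number, i.e. roughly one per listed job.
count_jobs() {
    grep -c 'job_number'
}

# Succeed if the given job ID appears as a job_number entry.
# Assumes lines look like "job_number:   <id>".
job_listed() {
    grep -Eq "^job_number:[[:space:]]*$1\$"
}

# Example usage on a grid host:
#   /usr/bin/qstat -j '*' | count_jobs
#   /usr/bin/qstat -j '*' | job_listed 7142037 && echo listed || echo "not listed"
```

If `job_listed` reports a recently exited job as not listed while running jobs are found, that matches the behavior described in this report.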