When using --follow, an artificial log entry is emitted every ~15 seconds:
tools.cluebot3@tools-bastion-15:~$ date; toolforge jobs logs -f cluebot3
Thu Nov 13 15:28:44 UTC 2025
2025-11-13T15:29:00.278526Z [nopod] [nocontainer] No logs received yet for job 'cluebot3', maybe the tool is using filelog or the job name is not correct? Will continue waiting just in case
2025-11-13T15:29:15.283920Z [nopod] [nocontainer] No logs received yet for job 'cluebot3', maybe the tool is using filelog or the job name is not correct? Will continue waiting just in case
2025-11-13T15:29:30.285722Z [nopod] [nocontainer] No logs received yet for job 'cluebot3', maybe the tool is using filelog or the job name is not correct? Will continue waiting just in case
This makes the output inconsistent with the 'fetch' mode (without --follow), which reports an error instead:
tools.cluebot3@tools-bastion-15:~$ toolforge jobs logs cluebot3
ERROR: Job 'cluebot3' does not have any logs available
I am currently using the get_raw_lines method to fetch all logs, adding them to a list, until a pre-defined end marker is seen (working around previous issues with logs being dropped).
This requires a lot of calls to the logging endpoint and would be better served by the streaming endpoint. However, the streaming endpoint pollutes the job output, which is persisted for the run (e.g. https://cluebotng-trainer.toolforge.org/Original%20Testing%20Training%20Set%20-%20Old%20Triplet/2025-08-30%2023:13:04/logs/bayes-train.log)
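For context, the polling workaround described above looks roughly like the sketch below. It is not the actual trainer code: `fetch_lines` is a hypothetical stand-in for whatever wraps get_raw_lines, and I am assuming each call returns every line available so far. The attempt limit and delay are illustrative values.

```python
import time
from typing import Callable, Iterable, List


def collect_until_marker(
    fetch_lines: Callable[[], Iterable[str]],
    end_marker: str,
    max_attempts: int = 60,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll `fetch_lines` until a line containing `end_marker` is seen.

    `fetch_lines` is a hypothetical callable (assumption: it returns all
    log lines available so far on every call). Returns the lines up to and
    including the first line that contains the marker.
    """
    for attempt in range(max_attempts):
        lines = list(fetch_lines())
        for index, line in enumerate(lines):
            if end_marker in line:
                # Discard anything logged after the end marker.
                return lines[: index + 1]
        # Marker not seen yet; wait before the next call to the endpoint.
        if attempt + 1 < max_attempts:
            sleep(delay_seconds)
    raise TimeoutError(f"end marker {end_marker!r} not seen after {max_attempts} attempts")
```

Every iteration is a full request to the logging endpoint, which is where the call volume comes from.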
These entries can be filtered out because their pod and container fields are known strings; however, those are really 'internal' identifiers. An additional field identifying the contents as a 'response message' (to use the same name as is used elsewhere) rather than a log entry would likely be better.
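To illustrate the filtering that is possible today, here is a minimal sketch. It assumes the line layout `<timestamp> [<pod>] [<container>] <message>` as seen in the CLI output above, and that `nopod` / `nocontainer` are the only placeholder values used; the function names are made up for this example.

```python
import re
from typing import Iterable, Iterator

# Assumed layout, inferred from the CLI output quoted in this task:
#   <timestamp> [<pod>] [<container>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) \[(?P<pod>[^\]]*)\] \[(?P<container>[^\]]*)\] (?P<message>.*)$"
)

# Placeholder identifiers observed on the artificial entries (assumption:
# these are the only values used for non-log entries).
PLACEHOLDER_POD = "nopod"
PLACEHOLDER_CONTAINER = "nocontainer"


def is_response_message(line: str) -> bool:
    """Return True if the line looks like an artificial entry, not job output."""
    match = LINE_RE.match(line)
    if match is None:
        # Unparseable lines are kept so that real output is never dropped.
        return False
    return (
        match["pod"] == PLACEHOLDER_POD
        and match["container"] == PLACEHOLDER_CONTAINER
    )


def drop_response_messages(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that are not artificial entries."""
    for line in lines:
        if not is_response_message(line):
            yield line
```

This works, but it ties the client to the placeholder strings, which is the reason for asking for a dedicated field.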