My job and job.status files are in /scratch/masabas/bandwidth/cylc-run/……, and the job.err and job.out files are in /scratch/masabas/cylc-run/… . My Cylc version is 8.5.1.
ls /scratch/masabas/bandwidth/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/11/*
/scratch/masabas/bandwidth/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/11/job
/scratch/masabas/bandwidth/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/11/job.status
ls /scratch/masabas/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/11/*
/scratch/masabas/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/11/job.err
/scratch/masabas/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/11/job.out
And the work dir is /scratch/masabas/bandwidth/cylc-run/workforecastmain6/run1/work/20250813T1200Z/unGrib/
The $HOME variable is set by default on Linux platforms. However, a few unusual HPC setups do not set it, which causes issues with some tools, including Cylc.
I think this might be a case of a platform with no $HOME directory?
The job.err and job.out files are being written to /scratch/masabas/cylc-run/…., while the job and job.status files are being written to /scratch/masabas/bandwidth/cylc-run/…..
masabas(login4): /home/masabas>ll /scratch/masabas/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/17/
total 12
-rw-r--r-- 1 masabas g-masabas 4322 Aug 14 12:35 job.err
-rw-r--r-- 1 masabas g-masabas 1303 Aug 14 12:35 job.out
masabas(login4): /home/masabas>ll /scratch/masabas/bandwidth/cylc-run/workforecastmain6/run1/log/job/20250813T1200Z/unGrib/17/
total 12
-rwxr-xr-x 1 masabas g-masabas 6009 Aug 14 12:35 job
-rw-r--r-- 1 masabas g-masabas 167 Aug 14 12:35 job.status
The recent change in my global.cylc is
[install]
    [[symlink dirs]]
        [[[shaheennew]]]
            run = /scratch/masabas/bandwidth
            log = /scratch/masabas/bandwidth
            work = /scratch/masabas/bandwidth
            share = /scratch/masabas/bandwidth
Before, it was
[install]
    [[symlink dirs]]
        [[[shaheennew]]]
            run = /scratch/masabas/
            log = /scratch/masabas/
            work = /scratch/masabas/
            share = /scratch/masabas/
Why is it using two paths, /scratch/masabas/ and /scratch/masabas/bandwidth/, even though my symlink dirs are defined?
The setup outlined here is correct. If your workflow is running fine, there’s no need to change the platform configuration.
The cylc cat-log command (which is used by the GUI and Tui to display log files) will need to be changed in order to work for platforms with no $HOME directory (it is currently looking for the symlink to the files in $HOME).
I have opened an issue to fix this:
Until we have a fix, you can work around the problem by configuring retrieve job logs for the platform. This will tell Cylc to copy the job logs from the HPC onto the local host once the job has finished.
You won’t be able to view the logs via Cylc whilst the job is running, but you will be able to view them once the job finishes and the logs have been copied.
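As a rough sketch of what that might look like in global.cylc, assuming the platform is named shaheennew (the name is borrowed from the install target in the symlink dirs configuration quoted earlier, so yours may differ):

```cylc
[platforms]
    # Assumed platform name; replace with the platform your jobs run on.
    [[shaheennew]]
        # Copy job logs back to the scheduler host after each job finishes.
        retrieve job logs = True
```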
Hmm, that suggests it retrieved some of the files (e.g. job.err) but not the job.out?
Try taking a look at the job’s job-activity.log file; it might contain some clues about what went wrong.
On some PBS HPCs the job.out and job.err files are written to a temporary location whilst the job is running, then PBS moves these files to the configured location once the job has succeeded. This might be the cause of the error? If so, Cylc can support this: configure retries for the job log retrieval using retrieve job logs retry delays, and Cylc will retry after each configured delay.
We configure retrieve job logs retry delays = PT10S, PT30S, PT3M for our PBS HPC.
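For illustration, the retry delays would sit alongside retrieve job logs in the platform section of global.cylc. This is only a sketch: the platform name shaheennew is an assumption, and the delay values are the ones quoted above for a PBS HPC, so they may need tuning for a Slurm system.

```cylc
[platforms]
    # Assumed platform name; replace with the platform your jobs run on.
    [[shaheennew]]
        retrieve job logs = True
        # Retry retrieval after 10 seconds, then 30 seconds, then 3 minutes.
        retrieve job logs retry delays = PT10S, PT30S, PT3M
```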
This is a Slurm job submission script for a compute node with no HOME. Here, the --output and --error files are written without HOME (I suspect HOME is still missing). If you submit a small job from the login node, then --output is directed to HOME/…..
Previously, [symlink dirs] pointed to /scratch/masabas, which is equivalent to the location used when HOME is missing. So no issue was raised and it worked fine, because the HOME-less location and the symlink dirs pointed to the same area.
This time I am trying [symlink dirs] with /scratch/masabas/bandwidth. This does not match the HOME-less location, so the logs cannot be synced from the remote to the host.
Having the job.out and job.err files written to a temporary location, defined as CYLC_RUN_DIR, could be a good solution.
/var/spool/slurmd/job7522798/slurm_script: line 61: /scratch/masabas/bandwidth/cylc-run/cylctest/run2/.service/etc/job.sh: No such file or directory
/var/spool/slurmd/job7522798/slurm_script: line 62: cylc__job__main: command not found