I am trying to come up with the best way to describe the environment of ECMWF’s HPC center.
I have this:
[[ecmwf_login]]
hosts = hpc-login
job runner = background
communication method = poll
retrieve job logs = True
install target = hpc-login
[[ecmwf_compute]]
hosts = hpc-login
job runner = slurm
retrieve job logs = True
communication method = poll
install target = hpc-login
But it’s not quite right. “hpc-login” is what we ssh to, but the actual hosts are named a[a-z]\d-\d{3,}, and the compute nodes follow a similar pattern. This causes issues with jobs running in the background on ecmwf_login. I’ve solved the same issue on our local MeteoFrance HPC:
[[belenos-bg1]]
hosts = belenoslogin1
cylc path = /home/ext/mr/smer/turekg/miniconda3/envs/cylc/bin
job runner = background
retrieve job logs = True
install target = belenos
execution polling intervals = PT1M
[[belenos-bg2]]
hosts = belenoslogin2
cylc path = /home/ext/mr/smer/turekg/miniconda3/envs/cylc/bin
job runner = background
retrieve job logs = True
install target = belenos
execution polling intervals = PT1M
[[belenos-bg3]]
hosts = belenoslogin3
cylc path = /home/ext/mr/smer/turekg/miniconda3/envs/cylc/bin
job runner = background
retrieve job logs = True
install target = belenos
execution polling intervals = PT1M
[platform groups]
[[belenos_login]]
# pick one of these platforms to run operations on
platforms = belenos-bg1, belenos-bg2, belenos-bg3
So if I define a platform [[ a[a-z]\d-\d{3,} ]], what should hosts = be?
And what would platforms = be for a platform group named [[ecmwf_login]]?
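To make the naming pattern concrete, here is a minimal Python sketch of what a[a-z]\d-\d{3,} does and does not match. The host names in the list are made up for illustration (they are not real ECMWF node names), and the helper is_node_name is hypothetical. It only checks the regular expression itself, not how Cylc interprets platform names.

```python
import re

# Pattern quoted above for the actual host names behind "hpc-login".
NODE_NAME = re.compile(r"a[a-z]\d-\d{3,}")


def is_node_name(hostname: str) -> bool:
    """Return True if the whole hostname matches the node-name pattern."""
    return NODE_NAME.fullmatch(hostname) is not None


# Hypothetical host names, invented purely for illustration.
for name in ["aa6-100", "ab1-2045", "hpc-login", "aa6-10"]:
    print(name, is_node_name(name))
```

The first two names match, "hpc-login" does not, and "aa6-10" fails because the pattern requires at least three digits after the hyphen.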
Thanx!