What are the requirements for using run hosts? My understanding is that this launches the Cylc scheduler process itself on a separate server from the one where you run cylc install and cylc play.
I get errors saying the flow.cylc file cannot be found on the remote server when I try using this option. Is it expected that the [run hosts] should share a disk with the host where you’re running cylc play?
Hello,
this launches the cylc scheduler process itself on a separate server to where you run cylc install and cylc play.
Yes, if run hosts are configured, the cylc play command will pick one of the hosts (according to any configured ranking) and re-invoke itself on that host via SSH.
The run hosts must:
- Share a common $HOME directory.
- Share a common Cylc global config (global.cylc).
- Be set up to allow passwordless SSH between them.
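Two of the requirements above (passwordless SSH and a shared $HOME) can be checked from the shell before configuring run hosts. The following is a minimal sketch, not a Cylc command: the function name check_run_host, the SSH_CMD override and the marker file name are all hypothetical, introduced only for illustration. It assumes an OpenSSH client and does not check that global.cylc is shared.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for a candidate run host (not part of Cylc).
# SSH_CMD is an assumption added so the ssh client can be swapped out.

check_run_host() {
    local host="$1"
    local ssh_cmd="${SSH_CMD:-ssh}"
    local marker

    # Create a uniquely named marker file in the local $HOME.
    marker="$(mktemp "${HOME}/.run-host-check.XXXXXX")" || return 2

    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ! "$ssh_cmd" -o BatchMode=yes "$host" true; then
        rm -f "$marker"
        echo "FAIL: passwordless SSH to ${host} is not working"
        return 1
    fi

    # If $HOME is shared, the marker file is visible on the remote host too.
    if ! "$ssh_cmd" -o BatchMode=yes "$host" "test -e '${marker}'"; then
        rm -f "$marker"
        echo "FAIL: ${host} does not see the local \$HOME"
        return 1
    fi

    rm -f "$marker"
    echo "OK: ${host}"
}
```

After sourcing the file, it could be called as check_run_host followed by the name of each host you intend to list under run hosts.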
The documentation for this was written recently, so it has not been published yet, but you can get a sneak peek via the “nightly” build of the documentation:
https://cylc.github.io/cylc-doc/nightly/html/user-guide/writing-workflows/scheduler.html#submitting-workflows-to-a-pool-of-hosts
Not sure it’s clear from Oliver’s response, or the documentation (we’ll fix that), but yes - you need to run cylc commands on a host that sees your workflow run directories.
Typical setup, as at NIWA: users log into particular HPC nodes for interactive work, including starting and interacting with Cylc workflows. The run hosts global config ensures that all the Cylc schedulers start on a small pool of dedicated “Cylc nodes”, with basic load balancing at start-up. (All the nodes are on the shared filesystem.)
Hello @oliver.sanders, I am quite new to the Cylc ecosystem and am exploring it in order to set it up on our HPC platform. We would like to set up a dedicated node to act as the Cylc scheduler host, and we are interested in using run hosts. All our nodes share a common HOME and SCRATCH. But cylc is not on PATH by default; we provide an environment module to make it available. When I test run hosts with a remote node (with passwordless SSH), cylc play <workflow> fails saying cylc command not found, which makes sense.
Is there a way to manipulate PATH before invoking the scheduler without having to mess with the user’s .profile or .bashrc? Preferably using the global.cylc config file. Something like the cylc path setting that is provided for platforms?
Cheers!!
Hi,
The way we handle this is to use a “wrapper script” called cylc, which activates the required environment and then runs the Cylc command, and to put that “wrapper script” in the PATH.
A basic wrapper script might look like this:
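#!/usr/bin/env bash
# Activate the environment that provides the real cylc executable.
module load cylc
# Hand over to the real cylc, passing all arguments through unchanged.
exec cylc "$@"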
The wrapper script we use at our site is designed to work with Conda and virtualenv environments and is built into the Cylc package. You can extract it with this command:
$ cylc get-resources cylc /somewhere/in/the/system/path
This wrapper script doesn’t need to be updated for newer versions of Cylc, so it only needs to be installed once. It supports parallel Cylc installations, using an environment variable to switch versions, e.g.:
$ cylc version
8.1.3
$ export CYLC_VERSION=8.0.4
$ cylc version
8.0.4
$ export CYLC_VERSION=7
$ cylc version
7.8.12
This makes it easier to upgrade environments because you don’t have to shut down any workflows running with that environment first.
CYLC_VERSION (and a derived variable called CYLC_ENV_NAME) is automatically forwarded to all Cylc commands (including remote commands), so that different versions are completely parallel even for a distributed installation.
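To illustrate the idea of switching versions through an environment variable, here is a simplified sketch. It is not the wrapper shipped with Cylc. It assumes, purely for illustration, that each version is installed in its own environment directory named cylc-<version> under a hypothetical ENV_ROOT, with a hypothetical cylc-default environment used when CYLC_VERSION is unset. The names ENV_ROOT, resolve_cylc_bin and run_cylc are invented for this example.

```shell
#!/usr/bin/env bash
# Simplified illustration only; NOT the wrapper shipped with Cylc.
# Assumption: one environment per version at $ENV_ROOT/cylc-<version>,
# each containing the real executable at bin/cylc.
ENV_ROOT="${ENV_ROOT:-/opt/cylc-envs}"   # hypothetical install location

resolve_cylc_bin() {
    # Use CYLC_VERSION if it is set, otherwise fall back to the
    # hypothetical "default" environment.
    local env_name="cylc-${CYLC_VERSION:-default}"
    echo "${ENV_ROOT}/${env_name}/bin/cylc"
}

run_cylc() {
    # Replace the current process with the selected cylc executable,
    # passing all arguments through unchanged.
    exec "$(resolve_cylc_bin)" "$@"
}
```

A wrapper built this way would end by calling run_cylc "$@". Because the version is resolved on every invocation, exporting a different CYLC_VERSION is enough to switch environments without touching the wrapper itself.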
There are a couple of notes on the wrapper script here:
https://cylc.github.io/cylc-doc/stable/html/installation.html#managing-environments
Thanks a lot @oliver.sanders for such a quick response. I checked the wrapper script and what you said makes sense. Our current environment module will add the wrapper script to PATH, and we can create multiple modules for multiple versions by simply changing CYLC_VERSION in each module, which is pretty neat. I don’t know if we will put the wrapper script on PATH “by default”, as only a subset of users will use Cylc on our HPC platform.
But I figured out the issue and you made a PR recently to address this. I just need to ensure there is a localhost section in platforms and point cylc path to this wrapper script. Worked like a charm!!
This is my relevant test config:
[scheduler]
    [[run hosts]]
        available = cylc-scheduler-node
[install]
    max depth = 4
    [[symlink dirs]]
        [[[localhost]]]
            run = ${WORK}/cylc
            log = ${WORK}/cylc
            share = ${WORK}/cylc
            work = ${WORK}/cylc
[platforms]
    [[hpc_platform]]
        hosts = localhost
        job runner = slurm
        shell = /bin/bash
        cylc path = /path/to/wrapper_script
        install target = localhost
    [[localhost]]
        cylc path = /path/to/wrapper_script
So when I execute cylc play <> on a login node, it picks up the config from the localhost platform and uses the full path specified in cylc path when setting up the scheduler on cylc-scheduler-node via SSH. This way we do not have to add any Cylc-related binaries to the default PATH.
But I figured out the issue and you made a PR recently to address this.
You beat my memory to it!