Configuring Cylc on REANNZ/NeSI platform

Hi all,

On the new REANNZ platform, the cylc command is no longer available, so users must install Cylc themselves with Conda. I’m trying to configure it, and my global.cylc file is:

#!Jinja2
[platforms]
    [[mahuika-slurm]]
        hosts = login01
        job runner = slurm
        global init-script = """
            module load Miniforge3
            conda activate /nesi/project/nesi99999/pletzera/environment/cylc-env
        """
        [[[meta]]]
            description = "Submit SLURM jobs locally from login node with Conda environment."
            note = "Cylc and all dependencies are loaded via Conda."

When I play the workflow I get the error shown at the end of this post (note that the cylc command was not found on login01). As you can see above, I used a “global init-script” to load the Conda environment. I’d be grateful for any pointers.

Thanks in advance. --Alex


(cylc-env)login03~/cylc_patterns/cylc-src/test$ cylc play -N test


▪ ■  Cylc Workflow Engine 8.6.0
██   Copyright (C) 2008-2025 NIWA
▝▘    & British Crown (Met Office) & Contributors
INFO - Extracting job.sh to /home/pletzera/cylc-run/test/run1/.service/etc/job.sh
INFO - Workflow: test/run1
INFO - Scheduler: url=tcp://login03.hpc.nesi.org.nz:43046 pid=2298870
INFO - Workflow publisher: url=tcp://login03.hpc.nesi.org.nz:43076
INFO - Run: (re)start number=1, log rollover=1
INFO - Cylc version: 8.6.0
INFO - Run mode: live
INFO - Initial point: 1
INFO - Final point: 1
INFO - Cold start from 1
INFO - New flow: 1 (original flow from 1) 2025-10-24T12:48:24
INFO - [1/a:waiting(runahead)] => waiting
INFO - [1/a:waiting] => waiting(queued)
INFO - [1/b:waiting(runahead)] => waiting
INFO - [1/b:waiting] => waiting(queued)
INFO - [1/a:waiting(queued)] => waiting
INFO - [1/b:waiting(queued)] => waiting
INFO - [1/a:waiting] => preparing
INFO - [1/b:waiting] => preparing
INFO - platform: mahuika-slurm - remote init (on login01)
ERROR - platform: mahuika-slurm - initialisation did not complete
COMMAND:
ssh -oBatchMode=yes -oConnectTimeout=10 login01 env \
CYLC_VERSION=8.6.0 bash --login -c 'exec "$0" "$@"' cylc \
remote-init mahuika-slurm $HOME/cylc-run/test/run1
RETURN CODE:
127
STDERR:
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
cylc: line 1: exec: cylc: not found
ERROR - [jobs-submit cmd] (init login01)
[jobs-submit ret_code] 1
[jobs-submit err] REMOTE INIT FAILED
ERROR - [jobs-submit cmd] (remote init)
[jobs-submit ret_code] 1
ERROR - [1/a/01:preparing] submission failed
INFO - [1/a/01:preparing] => submit-failed
WARNING - [1/a/01:submit-failed] did not complete the required outputs:
⨯ ┆  succeeded
ERROR - [jobs-submit cmd] (init login01)
[jobs-submit ret_code] 1
[jobs-submit err] REMOTE INIT FAILED
ERROR - [jobs-submit cmd] (remote init)
[jobs-submit ret_code] 1
ERROR - [1/b/01:preparing] submission failed
INFO - [1/b/01:preparing] => submit-failed
WARNING - [1/b/01:submit-failed] did not complete the required outputs:
⨯ ┆  succeeded
ERROR - Incomplete tasks:
* 1/a did not complete the required outputs:
⨯ ┆  succeeded
* 1/b did not complete the required outputs:
⨯ ┆  succeeded
CRITICAL - Workflow stalled
WARNING - PT1H stall timer starts NOW

That’s not going to work: Cylc must already be available on the remote platform before it can read the config there and see the global init-script.

As part of Cylc we provide a wrapper script designed to handle activating environments on remote machines. The instructions for using that are here. Let us know how you get on, so we can fine tune the instructions!
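
To illustrate the general idea only (this is not the wrapper shipped with Cylc), here is a minimal sketch of a function that runs cylc out of a Conda environment prefix without needing the environment to be activated first. The function name run_cylc_from_env and the variable CYLC_ENV_PREFIX are made up for this example.

```shell
# Simplified illustration of the wrapper idea; NOT the script shipped with Cylc.
# CYLC_ENV_PREFIX is a hypothetical variable naming a Conda environment prefix.
run_cylc_from_env() {
    cylc_prefix="${CYLC_ENV_PREFIX:-}"
    if [ -z "$cylc_prefix" ] || [ ! -x "$cylc_prefix/bin/cylc" ]; then
        echo "no cylc executable under '${cylc_prefix}/bin'" >&2
        return 127
    fi
    # Put the environment's bin directory first so helper commands resolve there too.
    PATH="$cylc_prefix/bin:$PATH" "$cylc_prefix/bin/cylc" "$@"
}
```

The point is that whatever ends up being called "cylc" on the remote host has to locate the environment by itself, because nothing has activated it when the scheduler connects over ssh.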

Hi,

Thanks.

I still have issues running slurm tasks on a personal installation of cylc 8.6.0 on REANNZ’s platform. The error is

login03~/nesi99999$ cat /home/pletzera/cylc-run/slurm/run1/log/job/1/a/01/job-activity.log
[jobs-submit cmd] (init login01)
[jobs-submit ret_code] 1
[jobs-submit err] REMOTE INIT FAILED
[jobs-submit cmd] (remote init)
[jobs-submit ret_code] 1

I can ssh to login01 just fine…

My global.cylc file is

login03~/nesi99999$ cat ~/.cylc/flow/global.cylc
#!Jinja2
[platforms]
    [[mahuika-slurm]]
        hosts = login01
        job runner = slurm

The steps to install cylc are below

login03~/nesi99999$ cat install_cylc.sh 
export CYLC_HOME=/nesi/nobackup/nesi99999/pletzera/envs/cylc
ml Miniforge3
conda create --prefix $CYLC_HOME --yes
conda activate $CYLC_HOME
conda install -c conda-forge cylc-flow --yes
#
# call cylc through a wrapper
#
mkdir $CYLC_HOME/wrapper
# this will save $CYLC_HOME/wrapper/cylc
cylc get-resources cylc $CYLC_HOME/wrapper
chmod +x $CYLC_HOME/wrapper/cylc
cp $CYLC_HOME/wrapper/cylc $CYLC_HOME/wrapper/cylc.bak
sed -i "s|CYLC_HOME_ROOT=\"\${CYLC_HOME_ROOT:-/opt}\"|CYLC_HOME_ROOT=\"\${CYLC_HOME_ROOT:-${CYLC_HOME}}\"|" $CYLC_HOME/wrapper/cylc
# make sure the wrapper is in the PATH
export PATH=$CYLC_HOME/wrapper:$PATH

The flow.cylc file is

login03~/nesi99999$ cat cylc_patterns/cylc-src/slurm/flow.cylc 
[scheduling]
    [[graph]]
        R1 = """
            a & b
        """
[runtime]
    [[a]]
        platform = mahuika-slurm
        execution time limit = PT1M
        script = """
        echo "executing task A..."
        sleep 5
        echo "done with task A..."
        """
        [[[directives]]]
            --ntasks = 1
    [[b]]
        script = """
        echo "executing task B..."
        sleep 2
        echo "done with task B"
        """

Thanks in advance for any help…

Your setup steps look good.

Assuming the problem is that the cylc command cannot be found (the scheduler log may give more detail), the likely cause is that the wrapper script is not on the default $PATH.

I.e., this line puts the wrapper on $PATH for your current session, but not for future sessions:

export PATH=$CYLC_HOME/wrapper:$PATH

Try copying this line into your .bash_profile file.

Hi Oliver,

Thanks for your help.

I added login02 and login03 to the hosts in global.cylc

login03~/nesi99999/cylc_patterns/cylc-src/slurm$ cat ~/.cylc/flow/global.cylc
#!Jinja2

[platforms]
    [[mahuika-slurm]]
        hosts = login01,login02,login03
        job runner = slurm

and added

export CYLC_HOME=/nesi/nobackup/nesi99999/pletzera/envs/cylc
export PATH=$CYLC_HOME/wrapper:$PATH

to my ~/.bash_profile.
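
As an aside, if the install script is re-run it is easy to end up with the same export lines several times in the profile. A small sketch of an idempotent append (the helper name append_once is made up for this example):

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    target_file="$1"
    new_line="$2"
    touch "$target_file"
    # -x: match whole lines, -F: treat the line as a fixed string, not a regex.
    grep -qxF -- "$new_line" "$target_file" || printf '%s\n' "$new_line" >> "$target_file"
}

# Example, using the two lines from this post:
#   append_once ~/.bash_profile 'export CYLC_HOME=/nesi/nobackup/nesi99999/pletzera/envs/cylc'
#   append_once ~/.bash_profile 'export PATH=$CYLC_HOME/wrapper:$PATH'
```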

This got me further, but I’m still getting an error. Note that I’m also unable to clean the workflow; maybe the two problems are related:

login03~/nesi99999/cylc_patterns/cylc-src/slurm$ cylc clean slurm
Would clean the following workflows:
  slurm/run1
Remove these workflows (y/n): y
INFO - Cleaning slurm/run1 on install target: mahuika-slurm
INFO - [mahuika-slurm]
    INFO - Removing directory: /home/pletzera/cylc-run/slurm/run1
    INFO - Removing directory: /home/pletzera/cylc-run/slurm/_cylc-install
    INFO - Removing directory: /home/pletzera/cylc-run/slurm
ERROR - Failed to clean slurm/run1
    Error: /home/pletzera/cylc-run/slurm/run1
CylcError: Clean failed

Though it seems the directory was removed

login03~/nesi99999/cylc_patterns/cylc-src/slurm$ ls -ltr /home/pletzera/cylc-run/slurm/run1
ls: cannot access '/home/pletzera/cylc-run/slurm/run1': No such file or directory
login03~/nesi99999/cylc_patterns/cylc-src/slurm$ cylc vip .
$ cylc validate /nesi/nobackup/nesi99999/pletzera/cylc_patterns/cylc-src/slurm
Valid for cylc-8.6.0
$ cylc install /nesi/nobackup/nesi99999/pletzera/cylc_patterns/cylc-src/slurm
INSTALLED slurm/run1 from /nesi/nobackup/nesi99999/pletzera/cylc_patterns/cylc-src/slurm
$ cylc play slurm/run1

 ▪ ■  Cylc Workflow Engine 8.6.0
 ██   Copyright (C) 2008-2025 NIWA
▝▘    & British Crown (Met Office) & Contributors

INFO - Extracting job.sh to /home/pletzera/cylc-run/slurm/run1/.service/etc/job.sh
slurm/run1: login03.hpc.nesi.org.nz PID=1687313
login03~/nesi99999/cylc_patterns/cylc-src/slurm$ cylc cat-log slurm 
2025-11-03T10:32:56+13:00 INFO - Workflow: slurm/run1
2025-11-03T10:32:56+13:00 INFO - Scheduler: url=tcp://login03.hpc.nesi.org.nz:43094 pid=1687313
2025-11-03T10:32:56+13:00 INFO - Workflow publisher: url=tcp://login03.hpc.nesi.org.nz:43027
2025-11-03T10:32:56+13:00 INFO - Run: (re)start number=1, log rollover=1
2025-11-03T10:32:56+13:00 INFO - Cylc version: 8.6.0
2025-11-03T10:32:56+13:00 INFO - Run mode: live
2025-11-03T10:32:56+13:00 INFO - Initial point: 1
2025-11-03T10:32:56+13:00 INFO - Final point: 1
2025-11-03T10:32:56+13:00 INFO - Cold start from 1
2025-11-03T10:32:56+13:00 INFO - New flow: 1 (original flow from 1) 2025-11-03T10:32:56
2025-11-03T10:32:56+13:00 INFO - [1/a:waiting(runahead)] => waiting
2025-11-03T10:32:56+13:00 INFO - [1/a:waiting] => waiting(queued)
2025-11-03T10:32:56+13:00 INFO - [1/b:waiting(runahead)] => waiting
2025-11-03T10:32:56+13:00 INFO - [1/b:waiting] => waiting(queued)
2025-11-03T10:32:56+13:00 INFO - [1/a:waiting(queued)] => waiting
2025-11-03T10:32:56+13:00 INFO - [1/b:waiting(queued)] => waiting
2025-11-03T10:32:56+13:00 INFO - [1/a:waiting] => preparing
2025-11-03T10:32:56+13:00 INFO - [1/b:waiting] => preparing
2025-11-03T10:32:56+13:00 INFO - platform: mahuika-slurm - remote init (on login01)
2025-11-03T10:32:58+13:00 INFO - [1/b/01:preparing] submitted to localhost:background[1687565]
2025-11-03T10:32:58+13:00 INFO - [1/b/01:preparing] => submitted
2025-11-03T10:33:03+13:00 ERROR - platform: mahuika-slurm - initialisation did not complete
    COMMAND:
        ssh -oBatchMode=yes -oConnectTimeout=10 login01 env \
            CYLC_VERSION=8.6.0 bash --login -c 'exec "$0" "$@"' cylc \
            remote-init mahuika-slurm $HOME/cylc-run/slurm/run1
    RETURN CODE:
        0
    STDOUT:
        REMOTE INIT FAILED
        Unexpected key directory exists: /home/pletzera/cylc-run/slurm/run1/.service/client_public_keys Check global.cylc install target is configured correctly for this platform.
    STDERR:
        tput: No value for $TERM and no -T specified
        tput: No value for $TERM and no -T specified
        tput: No value for $TERM and no -T specified
        tput: No value for $TERM and no -T specified
        tput: No value for $TERM and no -T specified
2025-11-03T10:33:03+13:00 ERROR - [jobs-submit cmd] (init login01)
    [jobs-submit ret_code] 1
    [jobs-submit err] REMOTE INIT FAILED
2025-11-03T10:33:03+13:00 ERROR - [jobs-submit cmd] (remote init)
    [jobs-submit ret_code] 1
2025-11-03T10:33:03+13:00 ERROR - [1/a/01:preparing] submission failed
2025-11-03T10:33:03+13:00 INFO - [1/a/01:preparing] => submit-failed
2025-11-03T10:33:03+13:00 WARNING - [1/a/01:submit-failed] did not complete the required outputs:
    ⨯ ┆  succeeded
2025-11-03T10:33:04+13:00 INFO - [1/b/01:submitted] => running
2025-11-03T10:33:06+13:00 INFO - [1/b/01:running] => succeeded
2025-11-03T10:33:06+13:00 ERROR - Incomplete tasks:
    * 1/a did not complete the required outputs:
      ⨯ ┆  succeeded
2025-11-03T10:33:06+13:00 CRITICAL - Workflow stalled
2025-11-03T10:33:06+13:00 WARNING - PT1H stall timer starts NOW

Your mahuika-slurm platform uses the same filesystem as the Cylc scheduler host so you need to set install target = localhost, i.e. change your global.cylc to be:

[platforms]
    [[mahuika-slurm]]
        hosts = login01,login02,login03
        job runner = slurm
        install target = localhost

See Platform Configuration — Cylc 8.6.0 documentation
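
If you are unsure whether a job host shares a filesystem with the scheduler host, a rough check is to create a marker file locally and see whether it is visible from the other host. This is only a sketch: the function name shares_filesystem and the probe file name are made up, and a visible home directory does not prove that every path is shared.

```shell
# Returns 0 if a marker file created locally is visible through the given
# command prefix (for example "ssh login01"), which suggests a shared filesystem.
shares_filesystem() {
    runner="$1"                  # command prefix used to reach the other host
    probe_dir="${2:-$HOME}"      # directory to probe, defaults to $HOME
    marker="$probe_dir/.fs-probe-$$"
    : > "$marker" || return 2
    rc=0
    # $runner is intentionally unquoted so "ssh login01" splits into words.
    $runner test -e "$marker" || rc=$?
    rm -f "$marker"
    return "$rc"
}

# Example:
#   shares_filesystem "ssh login01" && echo "login01 sees the same home directory"
```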

Thanks Oliver. That worked!