Hi there, I have a problem with the use of CYLC_WORKFLOW_SHARE_DIR. In the runtime section of my flow.cylc I have the following (note that the workflow is executed on a remote machine, and that my global.cylc file specifies particular paths to softlink the work and share dirs to).
The Python script calculates a number of environment variables and is then supposed to write them out to a file in CYLC_WORKFLOW_SHARE_DIR:
# Write out a file with the dynamic env variables
import os

output = os.path.join(os.environ['CYLC_WORKFLOW_SHARE_DIR'], 'dyn_env_vars.sh')
with open(output, 'w') as f:
    f.write('#!/bin/bash\n')
    # dict_exp holds the computed variables
    for k, v in dict_exp.items():
        f.write(f"export {k}='{v}'\n")
However, this file is never generated. What am I missing?
Thanx
Gaby
We’re going to need a bit more information to understand why this example isn’t working as you expect it to.
Here’s a quick example showing how the share dir works:
[scheduling]
    [[graph]]
        R1 = one => two
[runtime]
    [[one]]
        # write to a file in the share dir in one task
        script = """
            echo 'Hello World!' > "${CYLC_WORKFLOW_SHARE_DIR}/message"
        """
    [[two]]
        # read from a file in the share dir from another task
        script = """
            cat "${CYLC_WORKFLOW_SHARE_DIR}/message"
        """
$ cylc vip -n myworkflow
$ # wait for the workflow to finish
$ cylc cat-log myworkflow//1/two
Workflow : myworkflow/run1
Job : 1/two/01 (try 1)
User@Host: me@localhost
Hello World!
2024-09-25T14:57:02+01:00 INFO - started
2024-09-25T14:57:05+01:00 INFO - succeeded
$ tree ~/cylc-run/myworkflow/runN/share
...myworkflow/runN/share
|-- cycle -> .../myworkflow/run1/share/cycle
`-- message
The share directory is specific to the platform you are using (unless your platform has been configured to use a shared filesystem).
Cylc doesn’t have any fancy logic to monitor the share directory, detect any newly created or modified files, then synchronise them to other platforms. You have to move the data yourself, e.g. using rsync or scp.
The work directory is local too, so that won’t work either (without an explicit rsync/scp).
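For example, you could add a task along these lines to push the file across yourself (a rough sketch; the platform name, remote host, and destination path are all placeholders you would need to fill in):

[[sync_share]]
    # run this on the platform where the file was written
    platform = my_remote_platform
    script = """
        # copy the file into a share dir on the other host (host and path are placeholders)
        rsync -a "${CYLC_WORKFLOW_SHARE_DIR}/message" other-host:/path/to/share/
    """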
One option that is sometimes used for examples like this is to make use of Cylc’s automatic installation functionality. When the first task runs on a remote platform, Cylc will copy across various files that the remote jobs might need, e.g. the bin/ directory.
You can see what files Cylc will install by default here (note you can’t configure the work/ or share/ directories).
Here’s an example that puts the file into the lib/ directory (which gets installed by default):
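(A minimal sketch; the platform name is a placeholder:)

[scheduling]
    [[graph]]
        R1 = one => two
[runtime]
    [[one]]
        # runs on the scheduler host and writes the file into the run dir's lib/
        script = """
            mkdir -p "${CYLC_WORKFLOW_RUN_DIR}/lib"
            echo 'Hello World!' > "${CYLC_WORKFLOW_RUN_DIR}/lib/message"
        """
    [[two]]
        # the first task on this platform, so lib/ (including the new file)
        # gets copied over by Cylc's remote installation before the job runs
        platform = my_remote_platform
        script = """
            cat "${CYLC_WORKFLOW_RUN_DIR}/lib/message"
        """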
Taking a look at your script above, it looks like you are trying to get one task to write out a file that sets a bunch of environment variables for other tasks to use?
If so, you might want to look into broadcasts as a possible solution. Broadcasts are implemented as messages to the Cylc scheduler, rather than as files on the filesystem. As a result no files or synchronisation are required.
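For example, running a command like this from inside a task adds MY_VAR (a made-up name) to the environment of the tasks that run after it:

cylc broadcast "${CYLC_WORKFLOW_ID}" -s '[environment]MY_VAR=Hello World!'

By default the broadcast targets the root namespace at all cycle points, so every subsequent task will see the variable.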
Yes, your example with lib works for me, but the same mods in my code do not work… It does not look like the script is being run at all.
Here’s my flow.cylc in its entirety:
#!Jinja2
[meta]
    title = "GLO12V4 prototype cylc workflow"
    description = """
        Prototype to reproduce a GLO12V4 ease pipeline
        as a cylc workflow. No archiving or storage tasks to begin with
    """

{% from "d2j" import d2j %}
{% set CYCLE_START_DATE = '20201001' %}
{% set CYCLE_END_DATE = '20201002' %}
{% set JUL_START_DATE = d2j(CYCLE_START_DATE) %}
{% set JUL_END_DATE = d2j(CYCLE_END_DATE) %}
{% from "datetime" import datetime as dt %}
{% set LCYCLE = (dt.strptime(CYCLE_END_DATE, '%Y%m%d').date() - dt.strptime(CYCLE_START_DATE, '%Y%m%d').date()).days - 1 %}

[scheduler]
    install = env/, script/, config/

[task parameters]
    recup = obs, atmf, bdy, statics, postfiles, assimparam
    archi = mkoutputdir, mkdirstorage

[scheduling]
    initial cycle point = {{ CYCLE_START_DATE }}
    final cycle point = {{ CYCLE_END_DATE }}
    [[graph]]
        P1D = """recup<recup> & archi<archi> => recup_mfiles => run_model"""

%include './env/environment.cylc'

[runtime]
    [[DATES]]
        script = """
            ${CYLC_WORKFLOW_RUN_DIR}/env/init_dyn_vars.py
        """
        [[[environment]]]
            MOI_julstart = {{ JUL_START_DATE }}
            MOI_julstop = {{ JUL_END_DATE }}
            MOI_lcycle = {{ LCYCLE }}
            MOI_dstart = {{ CYCLE_START_DATE }}
            MOI_dstop = {{ CYCLE_END_DATE }}
    [[TRANSFERT]]
        [[[directives]]]
            --partition = transfert
            --nodes = 1
            --ntasks = 1
            --time = 0:10:00
    [[NORMAL]]
        [[[directives]]]
            --partition = normal256
            --nodes = 1
            --ntasks = 128
            --time = 0:30:00
            --mem = 247000
    [[PREP]]
        inherit = DATES, TRANSFERT
        platform = belenos
        env-script = """
            conda activate glo12_ease
        """
    [[recup<recup>]]
        inherit = PREP
        script = ${CYLC_WORKFLOW_RUN_DIR}/script/prep/recup_${CYLC_TASK_PARAM_recup}.sh
    [[archi<archi>]]
        inherit = PREP, DATES
        script = ${CYLC_WORKFLOW_RUN_DIR}/script/prep/archi_${CYLC_TASK_PARAM_archi}.sh
        [[[directives]]]
            --time = 0:30:00
        [[[environment]]]
            MOI_ensemble_start = ${MOI_model_ensemble_start}
            MOI_ensemble_end = ${MOI_model_ensemble_end}
            MOI_tagcycle = R${MOI_dstop}M${MOI_ensemble_start}_${MOI_ensemble_end}
    [[MODEL_RUN]]
        platform = belenos
        env-script = """
            module purge
            export MODULEPATH=/home/ext/mr/smer/soniv/SAVE/modulefiles:$MODULEPATH
            module load gcc/9.2.0 intel/2018.5.274 intelmpi/2018.5.274 phdf5/1.8.18 netcdf_par/4.7.1_V2 xios-trunk_rev2134
            # include machine-dependent file with specific env vars for the mpich implementation
            # and a function to execute the specific mpich command (mpirun, aprun, srun, etc.).
            # The variable HOST is defined in the suite definition file and is the name of the
            # host running the parallel job
            . ${CYLC_WORKFLOW_RUN_DIR}/env/mpich_belenos.sh
        """
    [[MODEL_RUN_CONDA]]
        platform = belenos
        env-script = """
            module purge
            export MODULEPATH=/home/ext/mr/smer/soniv/SAVE/modulefiles:$MODULEPATH
            module load gcc/9.2.0 intel/2018.5.274 intelmpi/2018.5.274 phdf5/1.8.18 netcdf_par/4.7.1_V2 xios-trunk_rev2134
            # include machine-dependent file with specific env vars for the mpich implementation
            # and a function to execute the specific mpich command (mpirun, aprun, srun, etc.).
            # The variable HOST is defined in the suite definition file and is the name of the
            # host running the parallel job
            . ${CYLC_WORKFLOW_RUN_DIR}/env/mpich_belenos.sh
            conda activate glo12_ease
        """
    [[recup_mfiles]]
        inherit = PREP
        script = "${CYLC_WORKFLOW_RUN_DIR}/script/model_run/recup_mfiles.sh"
        [[[environment]]]
            MOI_model_freqout = 1
            MOI_dir_tmprun = ${MOI_dir_calc_tmp}/TMPRUN/${MOI_ENSMEMBER_DIR}
            MOI_DIR_CALCU_PARAM = ${MOI_dir_calc_param}
            MOI_dirout_modelsshbudget = ${MOI_dir_calc_tmp}/MODEL_SSH/${MOI_TYPERUN}/${MOI_ENSMEMBER_DIR}
    [[run_model]]
        inherit = DATES, MODEL_RUN_CONDA, NORMAL
        script = '${CYLC_WORKFLOW_RUN_DIR}/script/model_run/model_run.sh'
        [[[directives]]]
            --nodes = 20
            --time = 0:20:00
        [[[environment]]]
            # Directory paths
            MOI_ioserver_program = xios_server.exe
            MOI_model_program = /home/ext/mr/smer/ruggierog/TOOLS/ease_lib_51b653ddfc/branch_4.2_nemoi_1_stochastic_perturbations/cfgs/iORCA025_ICE/BLD/bin/nemo.exe
            MOI_model_procs_node = 125
            MOI_model_ntasks = 2500
            MOI_ioserver_procs_node = 3
            MOI_ioserver_ntasks = 42
If you are relying on Cylc’s remote installation to install your file, then you need to ensure that your file exists before Cylc attempts to install the workflow onto the remote platform. This happens when the first task is submitted to that platform.
Have you had a look at the “broadcast” solution above? If this fits your requirement it’s a much cleaner solution than writing the variables to a file.
I think what I need to do is add an initial remote task that runs the init_dyn_vars.py that creates the file. It is only needed by a couple of tasks. I suppose I could also run it as part of the tasks themselves, but I want to avoid duplication if I can.
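Something along these lines, maybe (an untested sketch; the task name is made up):

[[graph]]
    R1 = init_dyn_vars
    P1D = """
        init_dyn_vars[^] => recup<recup> & archi<archi>
        recup<recup> & archi<archi> => recup_mfiles => run_model
    """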
I don’t think the complexity of the computation should pose a barrier to using cylc broadcast.
You can even build the cylc broadcast command from within your Python script if it helps, e.g.:

# untested - assumes compute_env() is your function returning a dict of
# variable names to values
import os
from subprocess import check_call

workflow_id = os.environ['CYLC_WORKFLOW_ID']
cmd = ['cylc', 'broadcast', workflow_id]
for key, value in compute_env().items():
    cmd.extend(['-s', f'[environment]{key}={value}'])
check_call(cmd)
Alternatively, the cylc broadcast command can take “broadcast files”. These use Cylc’s config format, which is very similar in shape to the Bash environment files your script currently generates.
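For example, a broadcast file might look like this (a sketch; the variable names and values are made up), passed to the command via its -F/--set-file option (check cylc broadcast --help for the exact spelling in your version):

[environment]
    MY_VAR_ONE = first value
    MY_VAR_TWO = second value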
Thanx for all these ideas. Once we have this workflow working, I plan to do some serious refactoring, leveraging as many of Cylc’s capabilities as possible, including broadcasts. Right now we need to get a working workflow to test against the shell-script-driven operational one, and the devil is in the environment variables…