Cylc8 symlink-dirs for remote host?

Hi,

I’m trying to choose a path for my share and work directories on a remote host at install time. From my experimenting and reading, it seems this can only be configured in the global.cylc file, not with the --symlink-dirs option of cylc install. I tried this CLI option in several permutations, then looked at the source code, and it does look like it’s localhost only.

Is there some way I haven’t seen to be able to do this?

For example, I want to install

  • WorkflowA → DEV-HOST → share=/g/sc/disk_one/workflowA_project/$USER
  • WorkflowB → DEV-HOST → share=/g/sc/disk_one/workflowB_project/$USER
  • WorkflowC → DEV-HOST → share=/g/sc/disk_two/workflowC_project/$USER

I don’t think we want a separate platform for each project and user for a given host, so I hope I’m missing something, as this functionality existed in rose suite-run and we used it widely.

Thanks for any advice you can provide.

Sorry, it’s not currently possible to configure the symlink dirs on a per workflow basis (other than localhost). I’ve opened https://github.com/cylc/cylc-flow/issues/5418.

The --symlink-dirs option to cylc install only affects localhost because those are the only symlinks set up during installation. All the other symlinks are created during workflow execution.

So, at present, separate platforms are the only solution. You’d need to configure a separate platform for each project. Users would then need to configure their own symlink dirs for the relevant install targets. Alternatively you could use environment variables which would need to be configured (either centrally or by the users) on the remote hosts, e.g.

[install]
    [[symlink dirs]]
        [[[dev-platform-projectA]]]
            share = $PROJ_A_DATA

At what point does that variable need to be defined if we go that route? In the first task’s environment? In root? Or does it need to be in an init script or something like that?

Thanks for your help.

The variable needs to be present in the user’s login environment on the remote platform. So, it could be configured centrally (in /etc/profile.d) or users could be asked to add it to their .bash_profile.

You can’t configure it in the workflow or in the cylc global config.
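
As a minimal sketch of the environment-variable route, assuming bash login shells on the remote host: the file name cylc-projects.sh is hypothetical, and the path is just the WorkflowA example from the original question.

```shell
# Hypothetical file: /etc/profile.d/cylc-projects.sh
# (or add the same line to each user's ~/.bash_profile).
# The variable name matches the [[symlink dirs]] example above;
# the path is the WorkflowA example from the question, not a real site path.
export PROJ_A_DATA="/g/sc/disk_one/workflowA_project/${USER}"
```

Because the variable is read from the login environment, it has to be in place on the remote host before the workflow first sets up its directories there.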

Just to confirm something else here: the global.cylc file doesn’t know about Jinja2 variables defined in the suite, does it? From limited testing, it seems to ignore them. Even running cylc config -s a="b" doesn’t appear to do anything for me, despite the -s and -S options being listed in the output of cylc config --help.

  • The global Cylc configuration file does not know anything about any particular workflow.
  • Jinja2 does work in the global config (note you need the #!Jinja2 shebang at the top of the file).
  • The -s option for cylc config does not make sense with the global config as this file is loaded by Cylc commands automatically.

We have a similar requirement at my site, to symlink to project-specific data areas. So the mapping is project- rather than workflow- or host-specific, but users often have only a few workflows per project.

In the Cylc 8 global config, Jinja2 is used to configure project-based run-dir symlinking if $PROJECT is defined in the user’s environment:

# RUN DIRECTORY SYMLINKING
# USERS SHOULD DEFINE $PROJECT SO THAT cylc install CAN MAKE THE SYMLINK TO
# /nesi/noback PROJECT SPACES. Otherwise /home/$USER disk quota is at risk!
{% if environ["PROJECT"] | default("x") != "x" %}
[install]
    [[symlink dirs]]
        [[[localhost]]]
            run = /nesi/nobackup/$PROJECT/$USER
{% endif %}

Users with many projects can have a file that maps workflow ID to project name, and we provide a small utility to parse that and export the appropriate $PROJECT name on the fly.

Is the workflow ID available when things are being installed? If so, exactly what variables are available at install time?

Slightly over-enthusiastic use of the term “workflow ID” on my part, sorry. I really just mean the workflow name, in the source directory, which usually, but not necessarily, maps to the installed workflow ID. So the source location is “available” at install time, because that’s where the workflow gets installed from.

To be explicit, I would have a file $HOME/.cylc/projects that contains, e.g.:

# workflow-name = project-code
nwp-city1 = proj00004
nwp-city2 = proj00994
...

Then when working in ~/cylc-src/nwp-city1 I would run the utility shell function set-hpc-project (or, if not in the source directory, I could give nwp-city1 as a command-line argument) and it would simply export PROJECT=proj00004 in my environment. Then cylc install uses that in the run-dir symlink, as prescribed in the global config snippet above.
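
For anyone wanting to try the same approach, here is a rough sketch of what such a helper could look like. It is not the real site utility: the function is spelled set_hpc_project (with underscores, for portability), and the CYLC_PROJECTS_FILE override variable is a hypothetical addition for testing. It assumes the "name = code" file format shown above.

```shell
# Hypothetical sketch of a set-hpc-project style helper; source it from
# ~/.bashrc so that the export affects the current shell.
set_hpc_project() {
    # Workflow name: first argument, else the current directory's basename.
    local name="${1:-$(basename "$PWD")}"
    # CYLC_PROJECTS_FILE is an assumed override, not a Cylc setting.
    local map="${CYLC_PROJECTS_FILE:-$HOME/.cylc/projects}"
    local code
    # Skip comment lines, trim whitespace around "name = code",
    # and print the code of the first matching workflow name.
    code="$(awk -F'=' -v n="$name" '
        /^[[:space:]]*#/ { next }
        {
            key = $1; val = $2
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", key)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", val)
            if (key == n) { print val; exit }
        }' "$map")"
    if [ -z "$code" ]; then
        echo "No project found for workflow '$name' in $map" >&2
        return 1
    fi
    export PROJECT="$code"
}
```

With this sourced, running set_hpc_project in ~/cylc-src/nwp-city1 (or set_hpc_project nwp-city1 from anywhere) followed by cylc install would pick up $PROJECT via the global config snippet above.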

For the record, there is also a related Jinja2 function that extracts the right project code for our Slurm job accounting ID, when the flow.cylc is parsed at start-up:

#!Jinja2
...
{% from 'hpc_project' import get_hpc_project %}
...
     [[[directives]]]
        --account = {{ get_hpc_project() }}
...

(This makes me think we should support workflow-specific install symlinks in the user’s global config, rather than in the workflow itself … )

Something like what rose had in the previous version would be good, where it can be defined at install time and applied whenever the symlinks actually get created. Linking all workflows to the same platform (or requiring every workflow to have its own platform) seems sub-optimal.

[Deleted previous post; wires crossed!]

That is what I did. I made a shell script that generates a global config suitable for all projects across all disks, each with a unique platform, for our development realm. That was passed to Jacinta; you can look at it if you like, to see whether it can be improved by someone who knows Cylc better.

Thanks @TomC - you replied to my deleted post (I got myself confused and thought I’d rethink…). I saw your script in passing already, but I thought you might still be arguing for a more rose-like approach since you hadn’t followed up here.

I just replied by email, so I’ve got no idea what the reply did.

I do think it would still be useful to provide an enhancement here, though. Perhaps not rose-like, but maybe a way for Cylc to pass a few extra variables through, or something like that. I have not thought it through. I did suggest to Jacinta that what I made is likely sufficient for our Dev purposes, and that it could be extended to production, or production service accounts could just have a small modification to their .bashrc files to set whatever variables they need to make it clean. Honestly, this way, whilst removing some user control, is better from an ICT practice viewpoint. Especially if there is a way to block users from using a custom one?