I am trying to rewrite some old Cylc v7 suite.rc files into v8.3.4 format, and I am having trouble understanding how to define platforms and job runners. In Cylc 7, I had the following in an inc/platform.rc file, which was included in the suite.rc and inherited by certain tasks to define whether a task should run in the background or on the node:
[runtime]
    [[BG_TASK]]
        [[[job]]]
            batch system = background
I went through the 8.x tutorial and see that [[[job]]] is gone now and has been replaced with the (I think) [platforms] utility in $HOME/.cylc/flow/8.3.4/global.cylc. So I wrote the following in my global.cylc file:
which can then be inherited by tasks that I want to run in the background. The jobs won’t submit. The only output I get from running the workflow is a job-activity.log file that says:
My understanding was that you define custom named platforms in your global.cylc file which can then be referenced in your flow.cylc file using [runtime][<namespace>]platform. Where did I go wrong here?
That’s right. At a site with many Cylc users you’d expect platforms to be defined centrally, but if not you can do it in your user global.cylc file.
You’ll get more information from the scheduler log. Run your workflow again with --no-detach or use cylc cat-log <workflow-id> (or look in ~/cylc-run/<workflow-id>/log/scheduler/log):
So: Cylc could not connect to your platform because “Unable to find valid host for BG_TASK”. You did not list any hosts in your platform definition, so Cylc tried the default (host-name = platform-name), which didn’t work.
You need to define a list of hosts and an install target, as well as job runner. The minimal platform definition for local background jobs (which, by the way, is the default if you don’t specify a platform at all):
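As a rough sketch only (the platform name my_background is hypothetical, and the values assume jobs run on the scheduler host itself), a platform definition along these lines in global.cylc sets the three items mentioned above:

```
# global.cylc (sketch; "my_background" is an illustrative name, not from the thread)
[platforms]
    [[my_background]]
        # Host(s) where Cylc submits and manages the jobs
        hosts = localhost
        # Where workflow files get installed; localhost means no remote install
        install target = localhost
        # Run jobs as background processes rather than via a batch system
        job runner = background
```

A task family in flow.cylc could then select it by name, again assuming the hypothetical platform name above:

```
# flow.cylc (sketch)
[runtime]
    [[BG_TASK]]
        platform = my_background
```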
Thanks for the response. That makes sense, and thank you for linking the documentation on platform configuration.
I am trying to think how to adapt this approach to one of our suites, where we include one of several platform files (e.g. HPC1_include.rc, HPC2_include.rc, HPC3_include.rc) that the user has to choose based on which machine they are running on. Each of these files sets a different value for, say, the job runner (e.g. pbs vs slurm).
Now that the definition of these platforms is delegated to the global.cylc file, it seems like the user would have to know to correctly change the specifications of their platforms in their global.cylc file prior to using our version controlled Cylc workflow.
A workaround might be to use $CYLC_SITE_CONF_PATH and hardwire that in flow.cylc[runtime][root] to be $CYLC_WORKFLOW_RUN_DIR/etc. In $CYLC_WORKFLOW_RUN_DIR/etc, I would have something like HPC1.cylc, HPC2.cylc, etc., and prior to installing the workflow, the user would have to symlink global.cylc to one of these files. Am I overthinking this?
You should not need to use CYLC_SITE_CONF_PATH for this.
You can define all the platforms in the same global.cylc, and just select the right ones as needed in workflow task definitions.
Platform definitions can overlap - e.g. you could define a platform to run background jobs on one particular host of several that also appear in a platform with PBS as a job runner (note that platform hosts are where Cylc interacts with the job runner, not the compute nodes managed by the job runner itself).
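To illustrate the overlap, here is a sketch of two platforms that share a host. The platform names, host names, and install target are all hypothetical; the shared install target assumes both platforms see the same filesystem:

```
# global.cylc (sketch; all names below are illustrative assumptions)
[platforms]
    [[hpc1_background]]
        # Background jobs on one particular login host
        hosts = hpc1-login1
        install target = hpc1
        job runner = background
    [[hpc1_pbs]]
        # PBS jobs submitted from any of the login hosts,
        # including the one used by hpc1_background
        hosts = hpc1-login1, hpc1-login2
        install target = hpc1
        job runner = pbs
```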
I think you could keep the same set of flow.cylc include files, but just change their content to set the right platform name for the task family.
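For example, an include file for one machine might reduce to something like the following. This is a sketch: the file name, the PBS_TASK family, and the platform names are hypothetical and would have to match whatever is defined in global.cylc:

```
# inc/HPC1_include.cylc (sketch; family and platform names are assumptions)
[runtime]
    [[BG_TASK]]
        platform = hpc1_background
    [[PBS_TASK]]
        platform = hpc1_pbs
```

The include file for a second machine would keep the same family names and only change the platform values, so task definitions that inherit from these families stay untouched.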
Is there any way to do this without a global.cylc? I’d prefer everything needed to run the workflow to be self-contained in one git clone checkout, without the need for the user to go into their $HOME/.cylc/flow and change or create a file.
No, we don’t support platform definitions inside a workflow configuration, because platforms are inherently not workflow-specific.
Ideally, normal users shouldn’t even need to understand how to define platforms, they should just choose from the centrally-defined ones.
If that has not been done (i.e., no central definitions), you can define your own platforms, but the principle is the same - all of your workflows should select from the same platforms, no need to redefine them in every workflow.
Is it not feasible to have a central global.cylc for all users on your workflow scheduler host(s)?
Note there’s likely to be other global config needed too, not just platforms.
Okay, that makes sense. It seems like we will have to figure out a way to create a centrally defined global.cylc for each of our HPC systems that all users can reference via the *_include.rc files included in the workflow checkout.
Our lab hasn’t made the full transition to Cylc 8, so there isn’t much of an infrastructure yet. There is no central global.cylc yet, but this seems like the direction our development should move in as we progress in this transition.