I have a trivially simple Cylc suite for climate model monitoring (see attached gcylc screenshot).
Inside the task, a simple script is run to generate a time series of climate data (e.g. global mean surface temperature) from preexisting annual mean latitude-longitude data files.
The problem is that the script processes one model at a time, so it takes ages.
All I want to do is to split the task up into separate ones for each climate model run, each of which has a namelist in the app/model_monitor2/rose-app.conf as follows…
I don’t think I can tell from your description exactly what the workflow requirements are here (e.g. can each model be processed independently?). But generally speaking, if you can split the script into multiple separate scripts, one for each model (or multiple instances of the same script, or whatever), you can put each one in its own task. Then, if you specify the dependencies between the tasks correctly, Cylc will run them with as much concurrency as possible. (Or is your question more about splitting up an existing Rose app, rather than the Cylc workflow as such?)
I’m not the most experienced at writing suites, so I’m hoping others will chime in to confirm whether this is correct or whether there are better alternatives. But I guess you could use parameterized tasks [1] for each namelist.
You could also control how many task jobs are executed at the same time with queues and/or some global configuration, if necessary.
Basically, the way things work in my suite at the moment is that I want to monitor N climate model suites as they run. The catch is that the monitoring software runs as a single Python process, so the models are processed one after the other rather than as separate tasks.
I think the best way to go about this will be to do this…
I’ll work on this and will reply back here if I have any issues.
Another thing to add here is that I’ve been getting some help from NeSI on this and they’ve managed to find a neat way of splitting up the rose-app.conf file so that each simulation and each year are split into separate tasks.
I’ll report back here as and when appropriate!
The solution they’ve been working on doesn’t change the way the actual suite runs but under the hood will call a simple script which spawns multiple, serial Python jobs rather than one.
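For anyone following along, here is a rough sketch of what a wrapper like that could look like. To be clear, this is not the NeSI script, just a minimal illustration of the idea under some assumptions: the model names in `MODELS`, the `build_command` helper and the `max_jobs` limit are all made up for the example, and the inline `print` stands in for the real monitoring script. It uses only the standard library (`subprocess` and `concurrent.futures`).

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical model run identifiers; in a real suite these would come
# from the namelists in rose-app.conf.
MODELS = ["run_a", "run_b", "run_c"]


def build_command(model):
    """Return the command line for one serial job.

    The inline Python is a placeholder for the real monitoring script.
    """
    return [sys.executable, "-c", f"print('processed {model}')"]


def run_one(model):
    """Run one model's job in its own process and collect the outcome."""
    result = subprocess.run(
        build_command(model), capture_output=True, text=True
    )
    return model, result.returncode, result.stdout.strip()


def run_all(models, max_jobs=2):
    """Run one serial job per model, with at most max_jobs at a time."""
    # Threads are enough here: each one just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_jobs) as pool:
        # pool.map returns results in the same order as the input list.
        return list(pool.map(run_one, models))


if __name__ == "__main__":
    for model, code, output in run_all(MODELS):
        print(model, code, output)
```

Each job is still a plain serial Python process; the only change is that several of them run side by side, and `max_jobs` caps how many are active at once so the node is not oversubscribed.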