I have a Cylc 8.3.6 workflow with a bit over 600 tasks. I am trying to trigger a bunch of them at once using a glob pattern. They have already succeeded, but I want to rerun them. They are still visible in the N=1 window, shown below.
When I do cylc trigger --flow=new StormSurge_18U_20250211000000_20250211001730_pt20250211T0040_rt '//20250211T0040Z/ens_aggregate_data*' (or with --flow=none, or with no --flow option at all), nothing runs. The message I get in the scheduler log is:
2025-02-11T04:37:27Z INFO - Command "force_trigger_tasks" received. ID=f2cc924e-7fbb-4fe1-9d77-ee1d5eed0d69
force_trigger_tasks(flow=['new'], flow_wait=False, tasks=['20250211T0040Z/ens_aggregate_data*'])
2025-02-11T04:37:28Z WARNING - No active tasks matching: 20250211T0040Z/ens_aggregate_data*
Similarly if I do '//20250211T0040Z/ens_aggregate_data_000*'
But if I do '//20250211T0040Z/ens_aggregate_data_0000', it does trigger a task
2025-02-11T05:06:15Z INFO - Command "force_trigger_tasks" received. ID=6e7a94ef-1597-4051-af00-44056936f63e
force_trigger_tasks(flow=['new'], flow_wait=False, tasks=['20250211T0040Z/ens_aggregate_data_0000'])
2025-02-11T05:06:16Z INFO - New flow: 4 (no description) 2025-02-11T05:06:16+00:00
2025-02-11T05:06:16Z INFO - [20250211T0040Z/ens_aggregate_data_0000(flows=4):waiting(runahead)] => waiting
I read the help as supporting glob patterns for the ID but, in this workflow at least, it doesn’t appear to work. The first pattern should match 200 tasks and the second should match 10.
Ah, OK, so “used to match active tasks” specifically means “used to match active (n=0) tasks” for cylc trigger?
I thought it was like cylc show where the glob pattern works on any task visible in the n-window selected. It seems active means different things in these two commands.
(For other readers, that’s quoted from the CLI help text for commands that target tasks).
Yes, here “active tasks” means tasks in the active window of the workflow, a.k.a. “the n=0 window”.
(Historically we sometimes used “active tasks” to refer to tasks with active jobs. Of course that’s even more restrictive, but I’ll put up a quick change to remove the ambiguity.)
The active window (n=0) reflects the subset of tasks held in the scheduler’s memory to feed the scheduling algorithm. Higher n-values just allow the UI to display more of the surrounding graph.
Specific task IDs “match” anywhere in the entire graph, not just in the n-window (the command brings the target task into n=0).
Ideally globs should match anywhere in the graph too, but that requires some development (why is that non-trivial? Cylc graphs are often infinite in extent, so globs are potentially dangerous).
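In the meantime, since specific task IDs match anywhere in the graph, one possible workaround is to expand the pattern into explicit IDs yourself. The following is only a sketch under stated assumptions: the 4-digit zero-padded suffix range 0000 to 0199 is inferred from the task name and the expected 200 matches in the question, the helper name expand_ids is made up for illustration, and it assumes several relative IDs can follow the workflow argument in the same way as the single ID in the command above. It prints the command by default rather than running it.

```shell
#!/bin/sh
# Assumed values taken from the question; adjust to your workflow.
# The suffix range 0000..0199 is an assumption, not something the
# scheduler told us.
WORKFLOW='StormSurge_18U_20250211000000_20250211001730_pt20250211T0040_rt'
CYCLE='20250211T0040Z'
TASK_PREFIX='ens_aggregate_data_'
FIRST=0
LAST=199

# Hypothetical helper: print one explicit relative task ID per line,
# e.g. //20250211T0040Z/ens_aggregate_data_0007
expand_ids() {
    i=$FIRST
    while [ "$i" -le "$LAST" ]; do
        printf '//%s/%s%04d\n' "$CYCLE" "$TASK_PREFIX" "$i"
        i=$((i + 1))
    done
}

# Dry run by default: print the command instead of running it.
# Set RUN=1 in the environment to actually call cylc.
# Word splitting of $(expand_ids) is intended; the IDs contain no spaces.
if [ "${RUN:-0}" = 1 ]; then
    cylc trigger --flow=new "$WORKFLOW" $(expand_ids)
else
    echo cylc trigger --flow=new "$WORKFLOW" $(expand_ids)
fi
```

For the second pattern in the question (ens_aggregate_data_000*), setting LAST=9 would generate the 10 IDs it was meant to match. Check the printed command before running with RUN=1, because every listed task will be triggered whether or not it is currently in n=0.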
For the record, task globbing in Cylc 7 only ever matched in n=0 too. Differences include:
- in Cylc 7 it was called the “task pool”, not the n=0 active window
- the Cylc 7 task pool typically contained many more tasks than the more efficient Cylc 8, which was sometimes an advantage
- but specific Cylc 7 task IDs did not match anywhere in the graph - going beyond the task pool required manually “inserting” tasks before using the intended command
cylc show is the one exception to what I’ve just described. It only queries task information (in contrast to commands that manipulate tasks and their jobs), so it was easy, possibly as an interim measure, to make it glob-match in the wider n-window datastore.
I’ll post another issue to get the glob help text modified for the show command.