Unable to replicate spawn to max active cycle points in Cylc 8

So I read a prior topic that explained “runahead limit” replaced both “max active cycle points” and “spawn to max active cycle points”. While I do see it properly loading the specified number of cycle points, it’s not loading the full graph up front which is what “spawn to max active cycle points” would do in Cylc 7.

This has been an issue when we’re testing runs and have to manipulate the graph interactively by triggering things out of order, re-running prior tasks, and things like that. What I’m finding is if I want to skip to a task in the middle of a graph, I end up accidentally triggering tasks that were hidden:

For example, I have the following graph defined:
A → B → C → D

I play the suite but set the pause flag so nothing starts right away. Then I hold all tasks.

I want to manually trigger task D, but all I see in my active graph are tasks A and B. If I “set outputs” on task B, it’ll immediately start running task C, since Cylc didn’t hold task C when I told it to hold all tasks because task C wasn’t in the graph at the time. In Cylc 7, loading all the tasks in the graph up front is what let us hold everything at once.

Is there some equivalent in Cylc 8?

At the moment there is no equivalent but we have an issue about this:
https://github.com/cylc/cylc-flow/issues/5763

Ah I see. It’s a much harder problem than I thought. I didn’t realize the task pool implementation had changed between Cylc 7 and 8.

Cylc didn’t hold task C when I told it to hold all tasks because task C wasn’t in the graph at the time.

Yes, presumably by “hold all tasks” you mean (e.g.) cylc hold workflow//2023/*. The wildcard only matches tasks in the “active window” on the graph - i.e., the tasks that are currently being actively managed by the scheduler.

Future tasks, beyond the active window, can only be held by targeting each one specifically (this might change in the near future…)
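
As a rough sketch of what targeting each future task specifically could look like: the helper below prints one explicit cylc hold command per task, using the same workflow//cycle/task ID form as above. The workflow name, cycle point, task list, and the hold_commands function are placeholders for illustration, not part of Cylc.

```shell
# Placeholder values: substitute your own workflow name and cycle point.
workflow="my_workflow"
point="2023"

# Print one "cylc hold" command per named task, so each future task
# is targeted explicitly instead of relying on a glob like //2023/*.
hold_commands() {
    for task in "$@"; do
        printf 'cylc hold %s//%s/%s\n' "$workflow" "$point" "$task"
    done
}

# Dry run: this only prints the commands, it does not contact a scheduler.
hold_commands C D
```

Once the printed commands look right, they can be run by piping the output to sh.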

I want to manually trigger task D, but all I see in my active graph are tasks A and B.

In Cylc 8, how much of the graph you see in the web UI beyond the active window is just a visualization choice. By default it’s one edge out from active tasks. To see more, use Set Graph Window Extent in the UI workflow drop-down menu.

However, note you don’t need to “see” task D in the UI in order to be able to trigger it:

  • The CLI cylc trigger command can trigger a task anywhere in the entire graph at any time
  • Or click on any other visible task in the UI, select the trigger command-edit option (pencil icon), and change the task name and cycle point as you like before submitting the command.

So I read the spawn-on-demand proposal and it sounds like if I attach an xtrigger to all the tasks in the graph, Cylc would treat them as parentless and spawn them. Then I should be able to truly hold all tasks under a cycle point. I’m not at my work machine so I can’t test that theory, and it sounds awful, but for troubleshooting model runs it might make things a lot easier for me…

Edit: and I’ll check out that setting in the gui for normal runs when I’m not mucking around with the graphs.

I had missed @dpmatthews’s reply when I posted my response earlier. Note the issue he linked to is about matching future tasks (e.g. to hold them) with a glob pattern like //2023/* (I’m not sure if you were asking about that, or for an equivalent of spawn-to-max-active-cycle-points).

No, it’s not hard at all, because doing what you want in Cylc 8 does not require spawning tasks way ahead (see below).

if I attach an xtrigger to all the tasks in the graph, Cylc would treat them as parentless and spawn them. Then I should be able to truly hold all tasks under a cycle point.

I would definitely not advise you to do that! (And in fact it wouldn’t work anyway - only tasks with no graph parents, xtriggered or not, get auto-spawned out to the runahead limit. Others are spawned on demand as the parent outputs they depend on are generated).

I want to manually trigger task D

If that’s your main requirement here, there is no need to hold future tasks in order to do it. Just trigger the task - see the two bullet points in my earlier reply above. My main points there were: (1) how far into the future you see in the UI is merely a visualization choice; and (2) in any case you don’t need to be able to “see” a task in order to trigger it (but you can if you want to).

Thanks, what I failed to mention was there’s another task, E, that I don’t want spawned:

A -> B -> C -> D -> E

Sorry about that! I do see how just triggering D assuming that’s the end of the graph would do what I’m after.

After reading the proposal and your reply to my GitHub comment, I think I better understand what I’m trying to accomplish, and that’s just triggering task D (or re-running it after a successful run to test a change) without spawning a whole new flow (and thus running E). It makes sense for D to be its own flow, but I think my use case is specific to one-shot tests without clobbering prior run logs (so I can’t just qsub the existing job script).

OK, including your additional task (partly for anyone following this without seeing our comments on GitHub):

A -> B -> C -> D -> E

Let’s say A is running, and you want to hold future task B (to prevent it from running when A succeeds), and then trigger future task D as a one-off that won’t cause E to spawn and won’t impact the main flow when/if you release B:

$ cylc hold <wflow>//<point>/B
$ cylc trigger --flow=none <wflow>//<point>/D  

Now let’s say everything ran in flow 1, and now you want to rerun past task D as a one-off, without causing E to rerun:

$ cylc trigger --flow=none <wflow>//<point>/D   # (a)
# OR
$ cylc trigger --flow=1 <wflow>//<point>/D  # (b)
# OR
$ cylc trigger <wflow>//<point>/D  # (c) - default: flow=1 (current active flows)

In this case you have the choice of triggering D as a no-flow (a), or in flow 1 (b or c). The latter works too for past tasks, because D already spawned E earlier in flow 1, so it won’t do it again.

Note you can use the --flow option from the GUI too, if you click on the command-edit icon (pencil).
