Cylc7 vs 8 - adding new tasks to a graph changes

Hi,

I was wondering how Cylc8 differs from Cylc7 when adding new tasks to an active workflow? In Cylc7, if you modified the graph to add new tasks, you had to either

  1. Insert the new tasks manually from CLI, and force trigger them if their pre-reqs had been satisfied but the pre-req tasks had been removed from the graph
  2. Stop and freshly start a workflow from an appropriate cycle point

How about in Cylc8?

e.g.

# assume par1, par2 are just '0,1,2,3,4' for these purposes

@START_RUN => start => a<par1, par2> => b<par1, par2> => c<par1, par2>  => cleanup
a<par1-1, par2> => a<par1, par2>
cleanup[-PT6H] => start

#BECOMES (a)

@START_RUN => start => a<par1, par2> => new<par1,par2> => b<par1, par2> => cleanup
a<par1-1, par2> => a<par1, par2>
cleanup[-PT6H] => start

#OR BECOMES (b)

@START_RUN => new_start => start => a<par1, par2> => b<par1, par2> => cleanup
a<par1-1, par2> => a<par1, par2>
cleanup[-PT6H] => start

etc

If you were to make these graph changes in Cylc8, hypothetically let's say in two situations (but please mention more if there are other variations that should be mentioned), what would need to be done, if anything, other than a straight reload?

a. cleanup.10 has finished, start.11 is currently waiting on START_RUN in the graph
b. a<par1={0,1,2,3}, par2=0>.20, b<par1={0,1,2,3},par2=0>, c<par1={0,1,2},par2=0> have finished, and c<par1=3,par2=0>.20 and a<par1=4,par2=0>.20 are running.

Hi @TomC

A quick generic reply before I try to understand your specific example (I’m in a hurry)…

It’s more intuitive and easier in Cylc 8.

If you add a new task to the cycling graph, then reinstall, then reload or restart:

  • if the new task has parents downstream of current activity
    • job done; it will be spawned as normal by the upstream outputs
  • otherwise (no parents, or some that completed before the update)
    • use trigger or set-outputs to make the first instance run, because current activity isn’t going to result in automatic satisfaction of all prerequisites of the new task

Note by “parent” I mean the dependency relationship, not run-time configuration inheritance, e.g.:

R1 = "parent => child1 & child2"
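
As a rough illustration of the rule above, here is a minimal Python sketch. It is a toy model, not Cylc code: the function name `first_instance_action` and the "automatic"/"manual" labels are made up for this example, tasks are plain strings, and it only considers parents that finished before the update (parents that are running at reload time are not modelled here).

```python
def first_instance_action(parents, finished):
    """Classify how the first instance of a newly added task gets to run.

    parents:  set of task names the new task depends on in the updated graph
    finished: set of task names that had already completed before the update
    (Both are hypothetical inputs for this sketch, not Cylc data structures.)
    """
    if not parents:
        # Parentless: no upstream output will ever spawn it.
        return "manual"
    if parents & finished:
        # At least one parent emitted its outputs before the update, so not
        # all prerequisites of the new task will be satisfied automatically.
        return "manual"
    # Every parent has still to run, so it will spawn the new task as normal.
    return "automatic"


# A single chain "a => new => b", where "new" is the added task.
print(first_instance_action({"a"}, finished=set()))
print(first_instance_action({"a"}, finished={"a"}))
print(first_instance_action(set(), finished={"a"}))
```

Here "manual" stands for the trigger or set-outputs intervention mentioned above, and "automatic" for the case where nothing beyond the reload is needed.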

Hi again @TomC

Ah, I find your example a bit hard to follow.

  • in “b.” you refer to tasks c<par1, par2> which don’t appear in any of the graphs
  • is the 2D parameterization really needed to get to the crux of the question?

Maybe you could simplify it all a bit, in terms of chains of single tasks (@start => a => b =>c), or pairs (@start => a1 & a2 ...) to see what happens when some but not all parents are running at reload time?

Ah, I missed the c<.., but I think you can logically figure out that it came after b<...>.

The 2d parameterisation is merely to show that things may be trivial when there is an a=>b=>c relationship, but if you start having a<a,b,c,d> =>b type dependencies, having to set-outputs or similar gets more complex, and potentially requires a detailed knowledge of the graph to be able to do.

I’ll just elaborate on my original response a little.

When you reload a workflow, all the task definitions get updated to reflect their updated graph dependencies as well as their updated [runtime] config settings. In Cylc 8, those dependencies determine what downstream tasks get spawned, on the fly, as outputs are generated by running tasks.

So, after a reload, all future tasks that will naturally be spawned downstream of current activity will automatically spawn children according to their updated dependencies. No manual intervention is required, beyond the reload itself.

The only exceptions to this are:

  • Parentless tasks. These are not spawned by any upstream task outputs, so if you add a new parentless task you will need to manually trigger the first instance. The scheduler will then handle subsequent instances (i.e., in following cycle points)
  • Tasks that are active at reload time. We don’t expect running jobs to generate outputs that conform to a change made after they were submitted. So if you add or remove children of active tasks, you will need to manually intervene to make downstream activity conform to the new graph at that point
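
To make the second exception a bit more concrete, the sketch below compares the children of each active task before and after a graph change. It is only an illustration under simplifying assumptions: each graph is a plain Python dict mapping a task name to the set of its children (one cycle, no parameters), and the function name `active_tasks_needing_attention` is hypothetical, not part of any Cylc API.

```python
def active_tasks_needing_attention(old_graph, new_graph, active):
    """Return active tasks whose children differ between the two graphs.

    old_graph / new_graph: dict mapping task name -> set of child task names
    active: iterable of task names that were submitted or running at reload
    """
    affected = {}
    for task in active:
        before = old_graph.get(task, set())
        after = new_graph.get(task, set())
        if before != after:
            affected[task] = {
                "added": sorted(after - before),
                "removed": sorted(before - after),
            }
    return affected


# Original chain from the question, without parameters:
#   start => a => b => c => cleanup
old = {"start": {"a"}, "a": {"b"}, "b": {"c"}, "c": {"cleanup"}}
# Change (a) from the question: "new" inserted after a, and c no longer present.
new = {"start": {"a"}, "a": {"new"}, "new": {"b"}, "b": {"cleanup"}}

# Suppose a and c are running at reload time.
print(active_tasks_needing_attention(old, new, active=["a", "c"]))
```

Any task reported here is one where downstream activity would need manual intervention to conform to the new graph; active tasks whose children are unchanged are left out.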

Well, there’s really no avoiding the need to understand the graph if you want to change dependencies mid-stream in a running workflow.

That said, Cylc 8 (unlike 7) automatically handles addition and removal of tasks and dependencies from the graph at runtime, so long as you avoid changing active tasks, and manually trigger the first instance of an added parentless task.

For the specifics of manually setting outputs and prerequisites, it would be best to wait for the 8.3.0 release. We will endeavor to document many specific examples then too.

I was also puzzling over the “a” and “b” scenarios at the bottom, given the BECOMES “a” and “b” graphs. Presumably those a’s and b’s are different, and generate a 2x2 matrix of scenarios?

If you think my general answers are insufficient, I can attempt to be more specific - but that’ll be faster if each question relates to a maximally simple (without being too simple) scenario.