Expected behaviour when re-inserting a family of tasks

I would like to check two cases of cylc insert behaviour under CYLC_VERSION 7.7.2 and ROSE_VERSION 2018.06.0: whether each is expected behaviour, where it is documented, and whether it is likely to change in later Cylc versions.

When re-inserting a family of tasks using cylc insert:

  1. If the cycle is before the initial cycle point, then no tasks will have prerequisites, and all tasks will trigger immediately.

  2. If the cycle is in some sense “current”, and some tasks in the family are in the relevant graph while others are not, then the tasks not in the graph will have no prerequisites and will trigger immediately.

Hi @pleopard,

  1. The behaviour you observe is probably a side effect of Cylc ignoring dependence on cycle points prior to the initial cycle point, which is done as a convenience (otherwise you have to explicitly handle the bootstrapping problem created by inter-cycle dependencies in an “all cycle points” graph section). However, the initial cycle point is supposed to be (as the name suggests) the first cycle point in the workflow, so I would argue that inserting tasks with earlier cycle points is “not supported” and the only problem here is that we should probably not even allow it to happen. (Maybe you want to argue that you have a valid use case though?)

  2. I have tested family insertion where some family members have moved on from the target cycle point, and it seems to be working fine for me at both 7.7.2 and 7.8.3 (latest release). Note that Cylc will only insert a new task proxy that does not already exist (i.e. “those not in the graph” in your wording). If I view prerequisites (via the GUI or cylc show) they are correct. Whether or not the inserted tasks trigger immediately depends on whether or not their prerequisites can be satisfied immediately by already-succeeded upstream tasks still present in the pool. If you don’t want that to happen, you need to manually manipulate the states of those upstream tasks (e.g. set them back to waiting, or to failed) so that they do not satisfy the prerequisites of the inserted tasks - before inserting them.
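To illustrate the point about already-succeeded upstream tasks, here is a toy Python sketch. It is not Cylc's implementation: the `can_trigger` function, the task IDs, and the dictionary used as a task pool are all hypothetical, and prerequisites are simplified to "the upstream task succeeded".

```python
# Toy model only: names and data structures are hypothetical, not Cylc internals.

def can_trigger(prerequisites, pool):
    """Return True if every prerequisite task is in the pool as 'succeeded'."""
    return all(pool.get(task_id) == "succeeded" for task_id in prerequisites)


# Upstream task is still in the pool and has succeeded.
pool = {"prep.T00": "succeeded"}
inserted_prereqs = {"prep.T00"}

ready_before_reset = can_trigger(inserted_prereqs, pool)

# Reset the upstream task before inserting, as suggested above.
pool["prep.T00"] = "waiting"
ready_after_reset = can_trigger(inserted_prereqs, pool)
```

In this sketch the inserted task is ready to run while the upstream task is in the succeeded state, and is no longer ready once that task has been reset to waiting.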

That said - if your inserted tasks really do have no prerequisites, as opposed to correct but immediately-satisfied prerequisites, then can you tell us exactly how to reproduce the problem - preferably with reference to a small example workflow?

Hi again @pleopard,

My colleague Xiao, who has not got onto Discourse yet, has provided a possible explanation for your 2nd case.

By default Cylc will not let you insert a task into an invalid cycle point (invalid means the graph does not define any dependencies for that task in that cycle point). But you can force it to with a command option (or a corresponding check-box in the GUI):

$ cylc insert --help
...
  --no-check   Add task even if the provided cycle point is not valid
               for the given task.
...

If you force a task to be inserted into an invalid cycle point, it will be inserted with no prerequisites (because the graph defines none for it, at that cycle point) and it will therefore (correctly) run immediately (unless the suite is held).

For family insertion, note that the family name is just short-hand for “all the family member tasks”, and family membership is defined purely by runtime inheritance, so it is quite possible to define a graph in which some members of a family are not used in some cycles. If you try to insert the whole family into a cycle point, by default Cylc will only insert the members that are valid at the cycle point, but again you can force it to insert them all (and the invalid ones will naturally have no prerequisites, as described above).
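The family case can be sketched with a toy Python model. This is only an illustration of the behaviour described above, not Cylc's code: `GRAPH`, `FAMILIES`, `insert_family`, and the task and cycle point names are all made up, and the `no_check` argument merely stands in for the `--no-check` option.

```python
# Toy model only: names and data structures are hypothetical, not Cylc internals.

# Graph: cycle point -> {task name: set of prerequisite task names}
GRAPH = {
    "T00": {"fc_short": {"prep"}, "fc_long": {"prep"}},
    "T06": {"fc_short": {"prep"}},  # fc_long is not used in this cycle
}

# Family membership comes from runtime inheritance, independent of the graph.
FAMILIES = {"FC": ["fc_short", "fc_long"]}


def insert_family(family, point, no_check=False):
    """Return {task: prerequisites} for the members that would be inserted."""
    inserted = {}
    for task in FAMILIES[family]:
        if task in GRAPH.get(point, {}):
            # Valid at this cycle point: prerequisites come from the graph.
            inserted[task] = set(GRAPH[point][task])
        elif no_check:
            # Forced insertion: the graph defines nothing here,
            # so the task gets no prerequisites.
            inserted[task] = set()
    return inserted


default_insert = insert_family("FC", "T06")
forced_insert = insert_family("FC", "T06", no_check=True)
```

With the default check only `fc_short` is inserted at `T06`. With the check disabled, `fc_long` is inserted too, with an empty prerequisite set, so nothing holds it back.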

Does this describe what you’ve done/seen?

Hilary

Hi @hilary.j.oliver,
Yes. That describes the situation, and confirms that it is expected behaviour. It looks like we need to be more careful both in configuring suites and in inserting families of tasks.

The configuration problem is mainly due to trying to use a Jinja2 loop to define both a “short” (72 forecast hour) and a “long” (240 forecast hour) cycle, by just using the long cycle bounds in the loop, then defining the graphs for the short and long cycles via parameters.
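A rough Python sketch of how this kind of mismatch can arise, under stated assumptions: the 24-hour step, the `fc_NNN` task names, and the function name are hypothetical and not taken from our suite. Only the 72 and 240 hour bounds come from the description above.

```python
# Hypothetical sketch: step size and task names are assumptions for illustration.

def members_outside_graph(long_max, short_max, step):
    """List tasks defined by a loop over the long bounds but absent from the short graph."""
    # The loop uses the long cycle bounds, so every one of these tasks gets defined.
    defined = [f"fc_{hour:03d}" for hour in range(0, long_max + 1, step)]
    # The short cycle graph only refers to tasks up to the short bound.
    in_short_graph = {f"fc_{hour:03d}" for hour in range(0, short_max + 1, step)}
    return [task for task in defined if task not in in_short_graph]


extra_tasks = members_outside_graph(long_max=240, short_max=72, step=24)
```

Under these assumptions, every task beyond forecast hour 72 is defined and belongs to the family, but does not appear in the short cycle graph, so a forced family insert into a short cycle would bring those tasks in with no prerequisites.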