Extending and restarting a workflow

hi there,

i have a workflow which has run for 1 month (1 resubmission period) and i want to extend it to 2 months.

i’ve run cylc vr after changing the run length variable but the cylc log output says i need to manually trigger the task to get it to run?

2024-03-03T22:56:20Z WARNING - PT2M restart timer starts NOW
2024-03-03T22:56:20Z WARNING - This workflow already ran to completion.
    To make it continue, trigger new tasks before the restart timeout.
2024-03-03T22:58:21Z WARNING - restart timer timed out after PT2M
2024-03-03T22:58:21Z INFO - Workflow shutting down - AUTOMATIC
2024-03-03T22:58:22Z INFO - [('workflow-event-handler', 'shutdown') cmd]

is this right/expected? in cylc 7 it would’ve worked out that it needed to continue on its own.

cheers,

jonny

also even when manually triggering the task, it shows up in cylc tui but not in the web UI (apart from the description text at the top)…

Hi @jonnyhtw

Yes that’s expected. Basically Cylc 8 tries harder not to do dodgy things automatically.

Extending the final cycle point (FCP) is not as trivial an operation as it may seem. The FCP isn’t necessarily just an arbitrary cut-off in an otherwise infinite sequence. It defines the end of the dependency graph, and the graph can have structure defined relative to the FCP, such as special “shutdown tasks” in the final cycle point(s).

So, if you try to restart, Cylc 8 knows if the original graph already ran to completion, but the scheduler will stay alive to allow you to manually trigger new flows if you extended the graph (either in old cycle points, or by moving the FCP out further). By doing so, you take responsibility for what happens next :slight_smile:

I’ve just tried this in an example, with 8.2.4, and (a) it worked; and (b) the manually triggered task in the new cycles did automatically show up in my GUI. Maybe your browser is misbehaving?

hey @hilary.j.oliver,

thanks a lot for this!

By doing so, you take responsibility for what happens next :slight_smile:

ok cool that all makes sense, good to know!

Maybe your browser is misbehaving?

hmm yeah something definitely is misbehaving. i tried refreshing but to no avail, i eventually restarted the UI server and this fixed it but i do seem to be encountering a few instances of the web UI not ‘catching up’ with cylc tui’s output. i’ll keep an eye on this and report back if it persists.

thanks a lot,

jonny

hi again,

re this same workflow, i triggered the next ‘science’ task in the new final cycle point and this worked fine but it then didn’t trigger subsequent tasks in the cycle point (basically 1, a postprocessing task and 2, a housekeeping task).

this is the relevant bit of the graph (cylc view -j . in workflow source dir)…

    P1M ! $ = """
        lfric_atm[-P1M] => lfric_atm
        postproc[-P1M] => postproc => housekeeping
        housekeeping[-P1M] => housekeeping
    """

the workflow then ran to completion in spite of the fact that neither of the aforementioned tasks actually ran.

is this the expected behaviour?

cheers,

jonny

this is the situation after…

cylc trigger u-db797/run14//19881001T0000Z/postproc

where R1=19881001T0000Z and the resubmission period is 1 month. note that i have now extended it again to a 3 month run length!



Hi Jonny,

The problem is your intercycle dependencies. Take this example:

#!Jinja2
[scheduling]
    cycling mode = integer
    final cycle point = {{FCP | default(4) }}
    [[graph]]
        P1 = """
            foo[-P1] => foo => post
            post[-P1] => post
        """
[runtime]
    [[foo, post]]
        script = true

If I run this to completion with FCP=4, then restart with FCP=7 and manually trigger 5/foo, all the foos will run and then the workflow will stall, because 5/post is waiting on 4/post, which ran before the workflow originally completed.

2024-03-06T12:05:55+13:00 INFO - [7/foo running job:01 flows:1] => succeeded
2024-03-06T12:05:55+13:00 INFO - [7/post waiting(runahead) job:00 flows:1] => waiting
2024-03-06T12:05:55+13:00 WARNING - Partially satisfied prerequisites:
      * 5/post is waiting on ['4/post:succeeded']  # <---------------
      * 6/post is waiting on ['5/post:succeeded']
      * 7/post is waiting on ['6/post:succeeded']
2024-03-06T12:05:55+13:00 CRITICAL - Workflow stalled
2024-03-06T12:05:55+13:00 WARNING - PT1H stall timer starts NOW

To unstall it, you can manually trigger 5/post like you did 5/foo. (Manual trigger means run it despite unsatisfied dependencies).
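
To make the stall easier to picture, here is a toy model of the example graph in Python. It is only a sketch, not how Cylc is implemented: the names `prerequisites` and `run_new_flow` are made up for illustration, and it assumes (as the log above suggests) that outputs from the run that already completed do not satisfy prerequisites of the newly triggered tasks.

```python
from typing import Dict, Iterable, List, Set, Tuple

Task = Tuple[int, str]  # (cycle point, task name), e.g. (5, "foo")


def prerequisites(task: Task, initial_cycle: int = 1) -> List[Task]:
    """Prerequisites of a task under the example graph:
        foo[-P1] => foo => post
        post[-P1] => post
    """
    cycle, name = task
    prereqs: List[Task] = []
    if cycle > initial_cycle:
        prereqs.append((cycle - 1, name))  # foo[-P1] => foo, post[-P1] => post
    if name == "post":
        prereqs.append((cycle, "foo"))  # foo => post
    return prereqs


def run_new_flow(
    triggered: Iterable[Task], new_cycles: Iterable[int]
) -> Tuple[Set[Task], Dict[Task, List[Task]]]:
    """Toy model of restarting with an extended final cycle point.

    ASSUMPTION (hypothetical model): only tasks that succeed after the
    restart count towards prerequisites; the earlier run is ignored.
    Returns the tasks that ran, plus the partially satisfied tasks
    mapped to the prerequisites they are still waiting on.
    """
    # A manual trigger runs a task regardless of unsatisfied prerequisites.
    succeeded: Set[Task] = set(triggered)
    candidates = {(c, n) for c in new_cycles for n in ("foo", "post")}

    # Keep running tasks whose prerequisites are all met, until nothing changes.
    progressed = True
    while progressed:
        progressed = False
        for task in sorted(candidates - succeeded):
            if all(p in succeeded for p in prerequisites(task)):
                succeeded.add(task)
                progressed = True

    # Whatever is left with some, but not all, prerequisites met is stalled.
    stalled: Dict[Task, List[Task]] = {}
    for task in sorted(candidates - succeeded):
        prereqs = prerequisites(task)
        unmet = [p for p in prereqs if p not in succeeded]
        if len(unmet) < len(prereqs):
            stalled[task] = unmet
    return succeeded, stalled


if __name__ == "__main__":
    # Trigger only 5/foo: the foos run, the posts are left waiting.
    print(run_new_flow({(5, "foo")}, range(5, 8)))
    # Trigger 5/post as well: everything in cycles 5 to 7 can run.
    print(run_new_flow({(5, "foo"), (5, "post")}, range(5, 8)))
```

With only 5/foo triggered, the model leaves 5/post waiting on 4/post, 6/post on 5/post and 7/post on 6/post, which is the same chain as in the stall warning. Triggering 5/post as well leaves nothing waiting.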

[Note: in principle Cylc could automatically figure out that 4/foo and 4/post already ran, but as I commented earlier, extending the final cycle point after completion of the workflow is potentially problematic, so (for now at least) you have to manually trigger the flow in the newly-extended graph upon restart.]

ok roger that, sounds good, thanks.

i thought what had happened is that the workflow shut down after the ‘new’ foo tasks had run even though the new post ones hadn’t :thinking:

i need to play with this a bit more to get it into my :brain: !

cheers

That’s because partially satisfied tasks (i.e., tasks with at least one, but not all, prerequisites satisfied) are not held in the “n=0” GUI window in Cylc 8.2.x, so if no other tasks are left in the workflow, those ones won’t be visible even though they caused a stall. That turned out to be a design error, already corrected for 8.3.0.

fyi i’ve run this workflow to completion now and i now understand the difference between ‘stalled with no visible tasks’ and ‘completed’! :slight_smile: thanks @hilary.j.oliver

Great, and from 8.3.0 they will be visible.

cool! look forward to having a play with that