Robust coding of expired tasks

I use cylc 7.8.1. Per cycle point all my tasks lead to a housekeep task, which uses rose_prune to tidy up logs/work/share etc. So without the housekeep task being triggered the suite will eventually fail to advance.

In this example I have tasks that copy data which doesn’t exist when the suite gets too far behind wall-clock time. I think I got this right by writing:

[[[T12]]]
    graph = """
        copy | copy:expired => housekeep
    """

This is the simplest case. Could I write this more briefly as copy:finish => housekeep? Does :finish include both :succeeded and :expired?

In a more complex case I create a plot from this copied data and from other local model output, which a polling task waits for (it succeeds once the local file exists, typically failing a couple of times before eventually succeeding, but it may expire). The plot should not be made if either the polled data or the copied data has expired. But if poll and copy expire (and plot never ran), the housekeeping should still run. I can reasonably trust the local data behind the polling task to turn up eventually, but I still want to handle expiry, as there is no use in plotting data from weeks ago. The copy task, on the other hand, is less robust and its data may never turn up, so it should be allowed to fail without holding up the rest of the suite.

This is what I came up with to achieve this behaviour, and I am wondering whether it is complete:

[[[T12]]]
    graph = """
        poll & copy => plot
        poll:expired => !plot
        copy:expired => !plot
        copy:fail  => !plot
        poll:expired | copy:expired | copy:fail | plot => housekeep
    """

Thanks for any comments and hints!
Fred

Hi Fred,

No, copy:finish is short for copy:succeed | copy:fail, i.e. “finished executing”. Whereas copy:expired means don’t bother executing task copy because it is too far behind the clock.
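
Spelled out as graph lines, that reading would look roughly like this (a sketch I have not run, so treat the last line as an assumption to check in your own suite):

    # :finish covers success and failure only
    copy:finish => housekeep
    # is the same as
    copy:succeed | copy:fail => housekeep
    # so expiry still has to be listed explicitly
    copy:finish | copy:expired => housekeep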

If I understand the description of your workflow properly, your graph should look something like this:

[[[T12]]]
    graph = """
        poll & copy => plot
        poll:expired | copy:expired | copy:fail => !plot & no_plot
        plot | no_plot => housekeep
    """

To explain:

Separate triggers for the same task are equivalent to AND. So this:

a => plot
b => plot

is equivalent to this:

a & b => plot

And the same goes for suicide triggers:

poll:expire => !plot
copy:expire => !plot
copy:fail => !plot

is equivalent to:

poll:expire & copy:expire & copy:fail => !plot

i.e. BOTH poll AND copy have to expire, AND copy has to fail, for plot to be removed from the workflow - which ain’t gonna happen. From your description, you really want this:

poll:expired | copy:expired | copy:fail => !plot

The other change is optional: I’ve used a dummy task no_plot to signify that plot was removed and no plotting was done. Then you can explicitly trigger housekeeping off of plot or no_plot:

        poll:expired | copy:expired | copy:fail => !plot & no_plot
        plot | no_plot => housekeep

I happen to think that’s easier to understand, but you could stick to your original housekeep trigger line:

        poll:expired | copy:expired | copy:fail => !plot
        poll:expired | copy:expired | copy:fail | plot => housekeep

Note (just in case I’ve got it wrong!) you should be able to test this with a dummy suite (i.e. just your graph with sleep 10 tasks or whatever) with initial cycle point and clock and expire triggers contrived so that you don’t have to wait long to see what happens.
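
For example, a throwaway test suite along these lines might do. This is an untested sketch for Cylc 7: the initial cycle point and the PT0H / P1D offsets are placeholders to adjust against your own wall clock, and the special tasks settings are my assumption about how your real suite is configured.

    [scheduling]
        # Placeholder: pick a date a few days before "now" so that the
        # oldest cycles are already past the expiry offset at start-up.
        initial cycle point = 20200101T12Z
        [[special tasks]]
            # Assumed settings; use whatever your real suite has.
            clock-trigger = poll(PT0H), copy(PT0H)
            clock-expire = poll(P1D), copy(P1D)
        [[dependencies]]
            [[[T12]]]
                graph = """
                    poll & copy => plot
                    poll:expired | copy:expired | copy:fail => !plot
                    poll:expired | copy:expired | copy:fail | plot => housekeep
                """
    [runtime]
        [[root]]
            # Every task is a short dummy job.
            script = sleep 10
        [[copy]]
            # Uncomment to exercise the copy:fail path as well.
            # script = false

The cycles that are more than a day old should show poll and copy expiring, plot being removed, and housekeep still running; the most recent cycle should run all four tasks normally.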

Hope that helps.

Hilary

Thank you so much! Your explanations are very clear.

I programmed the suite according to your last suggestion:

poll:expired | copy:expired | copy:fail => !plot
poll:expired | copy:expired | copy:fail | plot => housekeep

I find it easier to get my head round it, as I was a bit scared of introducing a dummy task.

Again, thanks very much. Your help is a life-line!
