CYLC_TASK_TRY_NUMBER only updates when auto-retries are executed?

I just noticed that CYLC_TASK_TRY_NUMBER stops updating once the execution retry limit has been reached. Reading the docs, it sounds like this number only increments on an automatic retry. This seems odd to me: I expected it to increment every time the task runs. If the task passed and I ran it again, I would expect the try number to increase by one. If it failed and I ran it manually, I would likewise expect the try number to increase by one.

I guess my question is - is the current implementation the behaviour other people expect, or should it be like how I describe it above?

With a suite.rc of

[scheduling]
    [[dependencies]]
        graph = "hello"
[runtime]
    [[hello]]
        script = false
        [[[job]]]
            execution retry delays = 2*PT2S 

If I force the task hello to run after it has tried three times, CYLC_TASK_TRY_NUMBER never increments above 3.

$ grep CYLC_TASK_TRY_NUMBER 0?/job
01/job:    export CYLC_TASK_TRY_NUMBER=1
02/job:    export CYLC_TASK_TRY_NUMBER=2
03/job:    export CYLC_TASK_TRY_NUMBER=3
04/job:    export CYLC_TASK_TRY_NUMBER=3
05/job:    export CYLC_TASK_TRY_NUMBER=3

Please note, I haven’t checked any version of Cylc later than 7.8.4 as that is what is installed on the VM I’m using.

Hi @TomC

You’re right, but it’s deliberate because we distinguish between automatic retries and forced triggering. The try number only increments with automatic retries, but the submit number increments every time the task submits a job.

The submit number is also available in task environments, but the variable is derived from the job script path and exported inside a boilerplate job script function, so it is not exposed in the main job script the way the try number is.
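To illustrate the idea of deriving a submit number from the job script path, here is a minimal sketch. It is not Cylc's actual job script code: the function name submit_number_from_path and the example path are made up for illustration, and it assumes the job script sits in a zero-padded NN directory, as in the 01/job to 05/job listing above.

```shell
# Hypothetical helper (not part of Cylc): print the submit number
# implied by a job script path of the form .../NN/job.
submit_number_from_path() {
    job_dir=$(dirname "$1")        # e.g. .../hello/04
    nn=$(basename "$job_dir")      # e.g. 04
    # Strip leading zeros so "04" becomes "4".
    n=$(printf '%s\n' "$nn" | sed 's/^0*//')
    # An all-zero directory name would leave nothing; fall back to 0.
    echo "${n:-0}"
}

# Example path only; the layout is an assumption for this sketch.
submit_number_from_path "/home/user/cylc-run/tomc/log/job/1/hello/04/job"
```

With the example path above, the function prints 4.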

To tweak your example:

[scheduling]
    [[dependencies]]
        graph = "hello"
[runtime]
    [[hello]]
        script = """
            echo "CYLC_TASK_TRY_NUMBER=$CYLC_TASK_TRY_NUMBER"
            echo "CYLC_TASK_SUBMIT_NUMBER=$CYLC_TASK_SUBMIT_NUMBER"
            false
        """
        [[[job]]]
            execution retry delays = 2*PT2S 

Run the suite, then run cylc trigger tomc hello.1 twice more once it stalls (tomc is the suite name here). Then:

$ grep CYLC_TASK_.*_NUMBER 0?/job.out
01/job.out:CYLC_TASK_TRY_NUMBER=1
01/job.out:CYLC_TASK_SUBMIT_NUMBER=1
02/job.out:CYLC_TASK_TRY_NUMBER=2
02/job.out:CYLC_TASK_SUBMIT_NUMBER=2
03/job.out:CYLC_TASK_TRY_NUMBER=3
03/job.out:CYLC_TASK_SUBMIT_NUMBER=3
04/job.out:CYLC_TASK_TRY_NUMBER=3
04/job.out:CYLC_TASK_SUBMIT_NUMBER=4
05/job.out:CYLC_TASK_TRY_NUMBER=3
05/job.out:CYLC_TASK_SUBMIT_NUMBER=5
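The pattern in this output can be reproduced with a toy model. This is only a sketch of the observed behaviour, not how Cylc computes the values internally; the function simulate_counters and its arguments are hypothetical. It assumes every submission fails, that automatic retries are used up first, and that all later submissions are forced triggers.

```shell
# Toy model (not Cylc code): print submit and try numbers for a task
# that always fails, given the number of configured retries and the
# total number of submissions (automatic retries plus forced triggers).
simulate_counters() {
    retries=$1
    submissions=$2
    max_try=$((retries + 1))   # first attempt plus each automatic retry
    submit=1
    while [ "$submit" -le "$submissions" ]; do
        if [ "$submit" -le "$max_try" ]; then
            try=$submit        # still within the automatic retries
        else
            try=$max_try       # forced trigger: try number stays put
        fi
        echo "submit=$submit try=$try"
        submit=$((submit + 1))
    done
}

# 2*PT2S gives two retries; five submissions in total, as above.
simulate_counters 2 5
```

For two retries and five submissions, the try number climbs to 3 and stays there while the submit number continues to 5, matching the job.out values.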

Ah, of course. I should have checked the docs for another variable; I see now that it is listed there.
