Hi,
Say you have a model workflow and a separate workflow to process the model's outputs. The forecast hours are chunked into blocks (or could be hourly) so that they can be processed in parallel.
e.g.
[[HHH000]]
[[HHH012]]
[[HHH024]]
[[HHH036]]
[[HHH048]]
[[HHH060]]
...
[[HHH240]]
Forecast hours have been processed up to and including HHH060, and the model has output, say, 63 hours of data. The model then crashes and rewinds itself, changing things like halo size, timestep, etc. to get past the instability. It has rewound to forecast hour 36 and will resume from there.
Because the model has been rewound and has changed its calculations, we need to rewind the postprocessing suite too. Otherwise there will most likely be a discontinuity from forecast hour 61 onwards (significant jumps in parameters, negative accumulations, etc.). We do this in Cylc 7 with code like the below (there is a bit more, but you get the idea).
< hold the suite >
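# Collect the HHH blocks at or after the hour the model rewound to
relevant_blocks=()
for f in $HHH_BLOCK_HOURS; do
    # 10# forces base 10 so zero-padded hours (e.g. 036) are not read as octal
    if ((10#$f >= RESET_FROM_HOUR)); then
        relevant_blocks+=("$f")
    fi
done
# Earliest affected block
min_fhr=$(echo "${relevant_blocks[*]}" | xargs -n1 | sort -un | head -1)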
# Kill the tasks/families
for f in "${relevant_blocks[@]}"; do
    # Greater than or equal to our reset hour, so kill all tasks in the family.
    # It may try to kill HHH blocks that do not exist, but it is only a small
    # overhead involved and does not lead to failures
    cylc kill \
        "$CYLC_SUITE_NAME" \
        "$CYLC_TASK_CYCLE_POINT/HHH_${f}_$member"
done
# Reset the state of the families
for f in $FHRS; do
    cylc reset \
        --state=waiting \
        "$CYLC_SUITE_NAME" \
        "$CYLC_TASK_CYCLE_POINT/HHH_${f}_$member"
done
< release the suite >
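To make the block selection concrete, here is a self-contained sketch of just that step, using the scenario above. The block list and the RESET_FROM_HOUR value are illustrative (not taken from the real suite), and no cylc commands are run.

```bash
#!/usr/bin/env bash
# Illustrative values only: a short list of zero-padded block hours and the
# forecast hour the model rewound to in the example above.
HHH_BLOCK_HOURS="000 012 024 036 048 060"
RESET_FROM_HOUR=36

relevant_blocks=()
for f in $HHH_BLOCK_HOURS; do
    # 10# forces base 10 so zero-padded values are not parsed as octal
    if ((10#$f >= RESET_FROM_HOUR)); then
        relevant_blocks+=("$f")
    fi
done

# Earliest block that needs to be rerun
min_fhr=$(printf '%s\n' "${relevant_blocks[@]}" | sort -un | head -1)

echo "blocks to rewind: ${relevant_blocks[*]}"
echo "earliest block: $min_fhr"
```

With these sample values, blocks 036, 048 and 060 are selected, which are the families that would be killed and reset to waiting.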
From quick checks, it looks like tasks disappear from the task pool quite rapidly, so the above approach (just resetting families to waiting) won't work. As I'm still learning the tools and approaches in Cylc 8, is there a good way to do something similar in Cylc 8?
Thanks.