Good afternoon,
I am seeking advice for debugging a broadcast.
In particular, I have a task that, after computing a variable, tries to send it to another task.
Let’s say that this is my task (named hindcast_recup_SST):
#!/bin/bash
set -euo pipefail
# Count the number of files, that will equal the number of tasks needed to create superobs
n_task=$(ls -lrt ${MY_PATH} | wc -l)
# Send the number of tasks necessary to create superobs task
cylc broadcast ${CYLC_WORKFLOW_ID} -p ${CYLC_TASK_CYCLE_POINT} \
-n hindcast_create_superobs_SST \
-s "[directives]--ntasks=${n_task}"
hindcast_create_superobs_SST is the task where I need to set the ntasks directive.
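A side note on the file count used for n_task: if MY_PATH is a directory, `ls -l` prints a leading "total" line, so `ls -lrt ${MY_PATH} | wc -l` would report one more than the number of files. The sketch below shows one way to count only regular files. The function name `count_files` and the demo directory are hypothetical, introduced only for illustration, and the count would be wrong for file names that contain newlines.

```shell
#!/bin/bash
set -euo pipefail

# count_files DIR: print the number of regular files directly inside DIR.
# Unlike `ls -l DIR | wc -l`, this does not count the leading "total" line.
# (Hypothetical helper, not part of the original task.)
count_files() {
    local dir="$1"
    # tr strips the padding that some wc implementations add around the number.
    find "$dir" -maxdepth 1 -type f | wc -l | tr -d '[:space:]'
}

# Demonstration on a throwaway directory standing in for MY_PATH.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.nc" "$demo_dir/b.nc" "$demo_dir/c.nc"
n_task=$(count_files "$demo_dir")
echo "n_task=${n_task}"
rm -rf "$demo_dir"
```

In the task itself the call would be `n_task=$(count_files "${MY_PATH}")`, assuming MY_PATH points at a directory rather than a glob pattern.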
Now the problem: the first task fails with the following error:
WorkflowStopped: glo12_cylc/run1 is not running
2025-05-13T12:48:24Z CRITICAL - failed/ERR
The error is caused by the broadcast: if I remove it, the task succeeds.
I tried to run the broadcast by hand on the command line, and it works
$ cylc broadcast glo12_cylc/run1 -p 20241111 -n hindcast_create_superobs_SST -s "[directives]--ntasks=4"
Broadcast set:
- [20241111/hindcast_create_superobs_SST] [directives]--ntasks=4
I have run many tests, and I am quite at a loss.
First of all, how does cylc check if a workflow is running? Normally I check my workflow with cylc tui, where it is marked as running, but from the error message I understand there must be some variable that is telling cylc the opposite.
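One possible check, sketched under an assumption that should be verified against the Cylc documentation: Cylc 8 clients locate a running workflow through a contact file at `~/cylc-run/<workflow-id>/.service/contact`, and report "is not running" when that file is not visible from where the command runs. If that holds, a job running on a different host or under a different home directory than the command line could see a different result. The function names below are hypothetical helpers.

```shell
#!/bin/bash
set -euo pipefail

# contact_file_for WORKFLOW_ID [CYLC_RUN_ROOT]: print the path where this
# sketch ASSUMES the contact file lives (default root: ~/cylc-run).
contact_file_for() {
    local workflow_id="$1"
    local root="${2:-${HOME}/cylc-run}"
    printf '%s/%s/.service/contact\n' "$root" "$workflow_id"
}

# workflow_looks_running WORKFLOW_ID [CYLC_RUN_ROOT]: succeed if the assumed
# contact file exists on the host where this script runs.
workflow_looks_running() {
    [[ -f "$(contact_file_for "$@")" ]]
}

# Inside the task, CYLC_WORKFLOW_ID is set by Cylc; the fallback value is the
# workflow ID from the error message, used only so the sketch runs standalone.
workflow_id="${CYLC_WORKFLOW_ID:-glo12_cylc/run1}"
if workflow_looks_running "$workflow_id"; then
    echo "contact file found on ${HOSTNAME:-unknown host}: $(contact_file_for "$workflow_id")"
else
    echo "no contact file on ${HOSTNAME:-unknown host}: $(contact_file_for "$workflow_id")" >&2
fi
```

Running this both at the top of hindcast_recup_SST and on the command line would show whether the two environments see the same file.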
Second: how do I debug this? I tried to run the workflow in debug mode, but I only see an error like:
DEBUG - [20241111/hindcast_recup_SST/01:submitted] (polled)failed
INFO - [20241111/hindcast_recup_SST/01:submitted] setting implied output: started
DEBUG - [20241111/hindcast_recup_SST/01:submitted] (internal)started
INFO - [20241111/hindcast_recup_SST/01:submitted] => running
DEBUG - [20241111/hindcast_recup_SST/01:running] health: execution timeout=None, polling intervals=PT1M,3*PT2M,PT1M,…
INFO - [20241111/hindcast_recup_SST/01:running] => failed
WARNING - [20241111/hindcast_recup_SST/01:failed] did not complete the required outputs:
⨯ ┆ succeeded
I don’t know how to retrieve more information on the failure. I added --debug to the broadcast command without much success. Also, I have previous broadcasts in the workflow, and they work. Could it be that some previous broadcast fails, and this one is just trapping the error?
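To see which command actually fails, and with what environment, one option is to log each command and its exit status to the job's standard error. This is a minimal sketch: `run_logged` is a hypothetical helper, and the final line uses `echo` as a harmless stand-in for the real `cylc broadcast` call.

```shell
#!/bin/bash
set -euo pipefail

# run_logged CMD [ARGS...]: echo the command, run it, report its exit status
# on stderr, and return that same status. (Hypothetical helper.)
run_logged() {
    local status=0
    echo "+ $*" >&2
    "$@" || status=$?
    echo "exit status ${status}: $1" >&2
    return "$status"
}

# Print the variables the broadcast depends on, as seen by the job.
echo "host=${HOSTNAME:-unknown}" >&2
echo "CYLC_WORKFLOW_ID=${CYLC_WORKFLOW_ID:-<unset>}" >&2
echo "CYLC_TASK_CYCLE_POINT=${CYLC_TASK_CYCLE_POINT:-<unset>}" >&2

# Stand-in for: run_logged cylc broadcast "${CYLC_WORKFLOW_ID}" ...
run_logged echo "broadcast would run here"
```

Wrapping every broadcast in the workflow this way would show in each task's job.err whether an earlier broadcast fails before this one does.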
Any help will be highly appreciated!
Best,
Stella