Is there a simple way to see the state of a workflow after it has completed? I set a run going overnight and came back in the morning to find it’s no longer running. I am running on an HPC login node, so I don’t want to start up a web server for the GUI.
I’d like to know whether the run was successful and, if not, which tasks failed.
First, look at the scheduler log.
Is this Cylc 7 or 8?
Cylc 8. I see in the log there’s an error message:
2023-06-30T00:03:25Z ERROR - Incomplete tasks:
* 1/run_mesh_C24_gadi_bwintel_fast-debug-64bit did not complete required outputs: ['succeeded']
So the run has failed. Is there no way to get this information other than checking the logs?
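If the scheduler log is the route you take, the "Incomplete tasks" entries quoted above can be pulled out with a short script. The following is a minimal Python sketch, not part of Cylc: the name find_incomplete_tasks is made up for illustration, and the pattern is based only on the two log lines shown in this thread, so other Cylc versions may format the message differently.

```python
import re

# Matches lines such as:
#   * 1/task_name did not complete required outputs: ['succeeded']
# (format assumed from the log excerpt quoted in this thread)
INCOMPLETE_RE = re.compile(
    r"^\s*\*\s+(?P<task>\S+) did not complete required outputs: "
    r"(?P<outputs>\[.*\])\s*$"
)


def find_incomplete_tasks(log_text):
    """Return unique (task_id, outputs) pairs listed under 'Incomplete tasks:'."""
    results = []
    in_block = False
    for line in log_text.splitlines():
        # A header line opens a block of "* task ..." entries.
        if line.rstrip().endswith("ERROR - Incomplete tasks:"):
            in_block = True
            continue
        if in_block:
            match = INCOMPLETE_RE.match(line)
            if match:
                pair = (match.group("task"), match.group("outputs"))
                # The same stall may be logged more than once; keep one copy.
                if pair not in results:
                    results.append(pair)
            else:
                # Any other line ends the block.
                in_block = False
    return results


# Sample text copied from the log excerpt in this thread.
SAMPLE_LOG = """\
2023-06-30T00:03:25Z ERROR - Incomplete tasks:
  * 1/run_mesh_C24_gadi_bwintel_fast-debug-64bit did not complete required outputs: ['succeeded']
"""

for task, outputs in find_incomplete_tasks(SAMPLE_LOG):
    print(f"{task}: missing {outputs}")
```

To use it on a real run, pass the contents of the scheduler log to find_incomplete_tasks instead of SAMPLE_LOG. In Cylc 8 that log normally sits under the workflow's run directory at log/scheduler/log, but check your own installation.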
Hi,
When a task doesn’t complete its required outputs, the scheduler stalls and, depending on site configuration, will typically send you an email to let you know. You can customise what Cylc does in this situation using the stall handler. The workflow will remain running, so you can inspect it using the terminal user interface (cylc tui) or the GUI as normal.
Your site may configure a timeout which will cause a stalled workflow to shut down if no user input is received during a specified period. If your workflow was shut down due to a timeout, you can restart it using cylc play, then monitor as usual.
You can inspect your site’s configuration using the cylc config command. For example, here’s what I get:
$ cylc config -i '[scheduler][events]'
abort on inactivity timeout = True
abort on stall timeout = True
inactivity timeout = P30D
mail events = inactivity timeout, stall, stall timeout, abort
stall timeout = P3D
Which means:
- My workflows will shut down after 30 days of inactivity.
- Or 3 days after a stall.
- I will get notified by email if the workflow stalls (stall), crashes (abort) or shuts down due to a timeout (inactivity timeout, stall timeout).
If you want to be notified about stall events in a different way, you can override the site-default behaviour with your own Cylc global configuration file and configure an arbitrary command as your stall handler.
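As an illustration of what such a command could look like, here is a minimal Python sketch of a stall handler that appends a one-line notice to a file. Everything specific in it is an assumption: the script name notify_stall.py, the notice file location, and the wiring shown in the comment, which relies on the stall handlers setting and the %(event)s, %(workflow)s and %(message)s template variables as I understand them from the Cylc 8 documentation. Check the configuration reference for your version before relying on it.

```python
#!/usr/bin/env python3
"""Hypothetical stall handler: append a one-line notice to a file.

Assumed wiring in your own global.cylc (script name is made up):

    [scheduler]
        [[events]]
            stall handlers = notify_stall.py %(event)s %(workflow)s %(message)s
"""
import sys
from datetime import datetime, timezone
from pathlib import Path

# Assumed location for the notices; any writable path would do.
NOTICE_FILE = Path.home() / "cylc-stall-notices.log"


def format_notice(event, workflow, message, now=None):
    """Build a single timestamped line describing the event."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} [{event}] {workflow}: {message}"


def main(event, workflow, message):
    """Append the notice to NOTICE_FILE."""
    with NOTICE_FILE.open("a") as handle:
        handle.write(format_notice(event, workflow, message) + "\n")


# Do nothing unless called with exactly the three expected arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    main(*sys.argv[1:])
```

The script would need to be executable and on the scheduler's PATH. Swapping the file write for a chat webhook or similar is the obvious next step; writing to a file is only the simplest thing to demonstrate.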