Workflow stoppage on task 'waiting' with all prereqs met

I am using Cylc 8.2.2 and encountered an issue I had not seen before. A task with two prerequisites was sitting idle. When I queried it with cylc show, it reported state: waiting and also showed that both prerequisites were met.

I poked around the scheduler log and could see nothing to indicate why the task was not running (but maybe I don’t know what to look for?).

After I did cylc reinstall/cylc reload, the task started immediately. Note that I did not attempt to manually trigger the task.

Anyone seen this behavior before?

Can you provide a simplified (“toy”) version of the workflow so we can try replicating the problem?

There are three items of task state recorded as flags (as opposed to “states”) which might affect waiting tasks: is_held, is_queued and is_runahead. Is it possible that any of these apply?

The workflow logs should contain the required information to debug the issue if you still have access to them (see ~/cylc-run/<workflow-id>/log/scheduler/*.log).

Also, @ejhyer, you’re two maintenance releases behind (the latest is 8.2.4) so it’s conceivable that you hit a small bug that’s since been fixed.

OK, I had to wait until it happened again.
Here is the clue from the scheduler log:

2024-02-16T17:43:35Z WARNING - Partially satisfied prerequisites:
      * 20240208T1200Z/scrub_ready is waiting on ['20240208T1200Z/publish_complete:succeeded']
      * 20240208T1200Z/publish_ready is waiting on ['20240208T1200Z/postprocessing_task:succeeded']

These unsatisfied prerequisites are causing the workflow to stall because of the runahead limit.
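For anyone hunting for the same clue, here is a minimal sketch of pulling these warnings out of a scheduler log with plain Python. It assumes the warning keeps the layout shown in the excerpt above. The function name and the regular expression are my own and are not part of Cylc.

```python
import re

# Matches bullet lines of the form seen in the excerpt, e.g.
#   * <cycle>/<task> is waiting on ['<cycle>/<task>:<output>']
WAITING_RE = re.compile(r"^\s*\*\s+(\S+) is waiting on \[(.*)\]\s*$")


def find_partially_satisfied(log_text):
    """Return {task_id: [unsatisfied prerequisites]} found in log text."""
    waiting = {}
    in_block = False
    for line in log_text.splitlines():
        if "Partially satisfied prerequisites" in line:
            in_block = True  # bullet lines follow this header
            continue
        if in_block:
            match = WAITING_RE.match(line)
            if match:
                task, prereqs = match.groups()
                waiting[task] = re.findall(r"'([^']+)'", prereqs)
            else:
                in_block = False  # first non-bullet line ends the block
    return waiting


sample = """\
2024-02-16T17:43:35Z WARNING - Partially satisfied prerequisites:
      * 20240208T1200Z/scrub_ready is waiting on ['20240208T1200Z/publish_complete:succeeded']
      * 20240208T1200Z/publish_ready is waiting on ['20240208T1200Z/postprocessing_task:succeeded']
"""

for task, prereqs in find_partially_satisfied(sample).items():
    print(task, "<-", ", ".join(prereqs))
```

To run it on a real log, read the file contents and pass the text to find_partially_satisfied.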
Those tasks have disappeared from the GUI, and cylc show indicates they are not active, but I was able to cylc trigger them and the workflow then advanced.
What is the (expected) logic driving when tasks disappear from the “active” state, and (thus?) disappear from the GUI?

Hi @ejhyer

> What is the (expected) logic driving when tasks disappear from the “active” state

We need to distinguish between active task states (submitted, running) and the active window of the workflow, i.e. the tasks that the scheduler is currently managing. The active window includes the active tasks plus waiting tasks that are ready to run according to the graph but held back by (e.g.) queues, the runahead limit, xtriggers, or a task hold.

Tasks can leave the active window if they are complete - i.e. if they finished (succeeded or failed) AND completed all of their required (as opposed to optional) outputs.

Incomplete tasks stay in the active window until dealt with, and will stall the workflow, because their incomplete status indicates something happened that the graph does not handle automatically.

The only other way a task can leave the active window is if deliberately removed by you (cylc remove). Then, you are saying to the scheduler that you don’t need that task to be completed anymore, and damn the downstream consequences.
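The rules above can be sketched as a toy model. This is only an illustration of the logic as described in this post, not Cylc's actual implementation. The Task class, its fields and both function names are hypothetical, and it only covers tasks that are already in the active window.

```python
from dataclasses import dataclass, field

# "Finished" as described above: succeeded or failed.
FINISHED_STATES = {"succeeded", "failed"}


@dataclass
class Task:
    """Hypothetical stand-in for a task in the active window."""
    name: str
    state: str
    required_outputs: set = field(default_factory=set)
    completed_outputs: set = field(default_factory=set)
    removed: bool = False  # True if the user removed it (cylc remove)


def is_complete(task):
    """Finished AND every required output was completed."""
    return (
        task.state in FINISHED_STATES
        and task.required_outputs <= task.completed_outputs
    )


def stays_in_active_window(task):
    """A task leaves the window only if complete or deliberately removed."""
    if task.removed:
        return False
    return not is_complete(task)


ok = Task("a", "succeeded", {"succeeded"}, {"succeeded"})
incomplete = Task("b", "failed", {"succeeded"}, {"failed"})
waiting = Task("c", "waiting", {"succeeded"})
removed = Task("d", "failed", {"succeeded"}, {"failed"}, removed=True)

for task in (ok, incomplete, waiting, removed):
    print(task.name, stays_in_active_window(task))
```

In this sketch the failed task "b" stays in the window because its required output was never completed, which is the kind of situation that would stall a workflow.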

> and (thus?) disappear from the GUI?

That’s a slightly different thing. The GUI shows the tasks in the active window PLUS future and past tasks up to n (default 1) graph edges away from them.

So whether or not completed tasks are still visible in the GUI depends on the window extent, which you can change via the GUI workflow menu.
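As a rough illustration of the window extent, the sketch below does a breadth-first walk over a made-up graph and collects every task within n edges of the active window. The function name and the a => b => c => d => e chain are assumptions for the example, and this is not how the GUI computes it internally.

```python
from collections import deque


def n_window(edges, active, n=1):
    """Return tasks within n graph edges (upstream or downstream) of `active`."""
    # Treat edges as undirected so both past and future tasks are reached.
    neighbours = {}
    for upstream, downstream in edges:
        neighbours.setdefault(upstream, set()).add(downstream)
        neighbours.setdefault(downstream, set()).add(upstream)

    distance = {task: 0 for task in active}
    queue = deque(active)
    while queue:
        task = queue.popleft()
        if distance[task] == n:
            continue  # do not expand past the window extent
        for other in neighbours.get(task, ()):
            if other not in distance:
                distance[other] = distance[task] + 1
                queue.append(other)
    return set(distance)


# Hypothetical linear graph: a => b => c => d => e, with only c active.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]

print(sorted(n_window(edges, {"c"}, n=1)))
print(sorted(n_window(edges, {"c"}, n=2)))
```

With n=1 only the immediate neighbours of "c" are included, so a completed task two edges upstream would not be visible until the extent is raised.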

[UPDATE] see also this thread

2024-02-16T17:43:35Z WARNING - Partially satisfied prerequisites:
      * 20240208T1200Z/scrub_ready is waiting on ['20240208T1200Z/publish_complete:succeeded']
      * 20240208T1200Z/publish_ready is waiting on ['20240208T1200Z/postprocessing_task:succeeded']

Are you saying that the upstream outputs listed there, which are still being waited on, were actually completed?