What's wrong with my flow.cylc?

My Cylc version is 8.2.3 and the platform is UCAR's Derecho supercomputer. I intended to repeat a simple run (using the PBS batch system) 10 times, but it always runs only once. What's wrong with my files below? Here is the content of my ~/.cylc/flow/global.cylc:

[platforms]
  [[derecho_pbs]]
    job runner = pbs
    hosts = localhost
    install target = localhost

Below is my flow.cylc file:

#!jinja2
{% set EXPS = [
  "1993_2000",
  "2013_2018"
] %}
[scheduler]
  allow implicit tasks = true
[scheduling]
  cycling mode = integer
  initial cycle point = 1
  final cycle point   = 10
  [[graph]]
    R1 = """
      {% for exp in EXPS %}
      {{exp}}
      {% endfor %}
    """
    P1 = """
      {% for exp in EXPS %}
      {{ exp }}[-P1]:finish => {{ exp }}
      {% endfor %}
    """
[runtime]
  [[root]]
    platform = derecho_pbs
    [[[job]]]
    [[[directives]]]
      -A = UMCP0014
      -l walltime = 00:05:00
      -l select = 1:ncpus=1:mpiprocs=1
      -j = oe
      -q = main
      -l job_priority = premium
      -M = lchen2@umd.edu
  {% for exp in EXPS %}
  [[{{exp}}]]
    inherit = root
    script = """cd /glade/u/home/lgchen/lgchen_work/project/OISSH_JEDI/JEDI-SSH-Analysis_repo/script/medianFilterOn2dSshAnalysis_5day5lat5lon/cylc_test/{{exp}}
      ./run_body.tcsh
    """
  {% endfor %}

I also tried my flow.cylc without the [[[job]]] section; it still runs only once.
Thanks,

The job is very simple. Without Cylc, every time a job finishes or exits because of the walltime limit, I only need to run "qsub run_body.tcsh" to resubmit it; it continues the run because a text file records the current model run date.

I don’t have access to Cylc 8.2.3 as it is very old now.

I have tested your workflow with Cylc 8.6.1 and it works correctly (it runs the tasks 10 times). However, I had to make this small change:

 R1 = """
 {% for exp in EXPS %}
-{{exp}}
+{{exp}}?
 {% endfor %}
 """

Thanks Oliver for replying! If I make the same small change as you did, I get the error "GraphParseError: Output 2013_2018:succeeded can't be both required and optional".

Below is the output of cylc-run/cylc_test/runN/log/01-start-01.log, which seems useful.

2025-12-01T10:32:07-07:00 INFO - Workflow: cylc_test/run18
2025-12-01T10:32:08-07:00 INFO - Scheduler: url=tcp://derecho1.hsn.de.hpc.ucar.edu:43041 pid=6451
2025-12-01T10:32:08-07:00 INFO - Workflow publisher: url=tcp://derecho1.hsn.de.hpc.ucar.edu:43043
2025-12-01T10:32:08-07:00 INFO - Run: (re)start number=1, log rollover=1
2025-12-01T10:32:08-07:00 INFO - Cylc version: 8.2.3
2025-12-01T10:32:08-07:00 INFO - Run mode: live
2025-12-01T10:32:08-07:00 INFO - Initial point: 1
2025-12-01T10:32:08-07:00 INFO - Final point: 10
2025-12-01T10:32:08-07:00 INFO - Cold start from 1
2025-12-01T10:32:08-07:00 INFO - New flow: 1 (original flow from 1) 2025-12-01 10:32:08
2025-12-01T10:32:08-07:00 INFO - [1/1993_2000 waiting(runahead) job:00 flows:1] => waiting
2025-12-01T10:32:08-07:00 INFO - [1/1993_2000 waiting job:00 flows:1] => waiting(queued)
2025-12-01T10:32:08-07:00 INFO - [1/2013_2018 waiting(runahead) job:00 flows:1] => waiting
2025-12-01T10:32:08-07:00 INFO - [1/2013_2018 waiting job:00 flows:1] => waiting(queued)
2025-12-01T10:32:08-07:00 INFO - [1/1993_2000 waiting(queued) job:00 flows:1] => waiting
2025-12-01T10:32:08-07:00 INFO - [1/2013_2018 waiting(queued) job:00 flows:1] => waiting
2025-12-01T10:32:08-07:00 INFO - [1/2013_2018 waiting job:01 flows:1] => preparing
2025-12-01T10:32:08-07:00 INFO - [1/1993_2000 waiting job:01 flows:1] => preparing
2025-12-01T10:32:10-07:00 INFO - [1/1993_2000 preparing job:01 flows:1] submitted to derecho_pbs:pbs[3794334]
2025-12-01T10:32:10-07:00 INFO - [1/1993_2000 preparing job:01 flows:1] => submitted
2025-12-01T10:32:10-07:00 INFO - [1/1993_2000 submitted job:01 flows:1] health: submission timeout=None, polling intervals=PT15M,…
2025-12-01T10:32:10-07:00 INFO - [1/2013_2018 preparing job:01 flows:1] submitted to derecho_pbs:pbs[3794335]
2025-12-01T10:32:10-07:00 INFO - [1/2013_2018 preparing job:01 flows:1] => submitted
2025-12-01T10:32:10-07:00 INFO - [1/2013_2018 submitted job:01 flows:1] health: submission timeout=None, polling intervals=PT15M,…
2025-12-01T10:47:10-07:00 INFO - [1/1993_2000 submitted job:01 flows:1] poll now, (next in PT15M (after 2025-12-01T11:02:10-07:00))
2025-12-01T10:47:10-07:00 INFO - [1/2013_2018 submitted job:01 flows:1] poll now, (next in PT15M (after 2025-12-01T11:02:10-07:00))
2025-12-01T10:47:17-07:00 INFO - [1/1993_2000 submitted job:01 flows:1] (polled)submission failed at 2025-12-01T10:32:09-07:00
2025-12-01T10:47:17-07:00 CRITICAL - [1/1993_2000 submitted job:01 flows:1] submission failed
2025-12-01T10:47:17-07:00 INFO - [1/1993_2000 submitted job:01 flows:1] => submit-failed
2025-12-01T10:47:17-07:00 WARNING - [1/1993_2000 submit-failed job:01 flows:1] did not complete required outputs: ['submitted', 'succeeded']
2025-12-01T10:47:17-07:00 INFO - [1/2013_2018 submitted job:01 flows:1] (polled)submission failed at 2025-12-01T10:32:09-07:00
2025-12-01T10:47:17-07:00 CRITICAL - [1/2013_2018 submitted job:01 flows:1] submission failed
2025-12-01T10:47:17-07:00 INFO - [1/2013_2018 submitted job:01 flows:1] => submit-failed
2025-12-01T10:47:17-07:00 WARNING - [1/2013_2018 submit-failed job:01 flows:1] did not complete required outputs: ['submitted', 'succeeded']
2025-12-01T10:47:17-07:00 ERROR - Incomplete tasks:
* 1/1993_2000 did not complete required outputs: ['submitted', 'succeeded']
* 1/2013_2018 did not complete required outputs: ['submitted', 'succeeded']
2025-12-01T10:47:17-07:00 CRITICAL - Workflow stalled
2025-12-01T10:47:17-07:00 WARNING - PT1H stall timer starts NOW

You need to understand the optional output notation.

In your case, foo:finished is short for "foo succeeded OR failed", which necessarily implies that both outputs are optional. Consequently, foo:succeeded needs to be marked optional everywhere else it appears in the graph too:

foo:finished => bar  # means "foo:succeeded? | foo:failed? => bar"
foo? => baz  # the `?` is required here because of the previous line

Your scheduler log appears to show that your two tasks submitted successfully to PBS (as they got PBS job IDs) but then polled as submit-failed shortly after. So Cylc put them in the submit-failed state and stalled the workflow - which is exactly what it should do if tasks fail and consequently there is nothing else it can run.

If a job goes to submit-failed after an apparently successful submission, it should mean that the submitted jobs got cancelled (e.g. qdel) before they could start executing. So now the obvious question is, did that really happen, or is Cylc not properly configured for job polling (i.e. job query) to work correctly on your system?
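To see at a glance which tasks received a batch job ID and which state each one ended in, the scheduler log can be scanned with a short script. The following is only a rough sketch: the function name `summarize` and both regular expressions are assumptions of mine, written against the log lines quoted in this thread, and are not part of Cylc itself.

```python
import re

# Assumed line shapes, copied from the scheduler log quoted above:
#   ... [1/1993_2000 preparing job:01 flows:1] submitted to derecho_pbs:pbs[3794334]
#   ... [1/1993_2000 submitted job:01 flows:1] => submit-failed
SUBMITTED = re.compile(
    r"\[(\S+) preparing job:\d+ flows:\d+\] submitted to (\S+?)\[(\d+)\]"
)
STATE_CHANGE = re.compile(r"\[(\S+) [^\]]*\] => (\S+)$")


def summarize(log_text):
    """Map each task ID to its batch job ID and its last logged state."""
    summary = {}
    for line in log_text.splitlines():
        line = line.rstrip()
        entry = None
        m = SUBMITTED.search(line)
        if m:
            task, _platform, job_id = m.groups()
            entry = summary.setdefault(task, {"job_id": None, "final_state": None})
            entry["job_id"] = job_id
            continue
        m = STATE_CHANGE.search(line)
        if m:
            task, state = m.groups()
            entry = summary.setdefault(task, {"job_id": None, "final_state": None})
            entry["final_state"] = state
    return summary


# Three lines taken verbatim from the log in this thread.
SAMPLE = """\
2025-12-01T10:32:10-07:00 INFO - [1/1993_2000 preparing job:01 flows:1] submitted to derecho_pbs:pbs[3794334]
2025-12-01T10:32:10-07:00 INFO - [1/1993_2000 preparing job:01 flows:1] => submitted
2025-12-01T10:47:17-07:00 INFO - [1/1993_2000 submitted job:01 flows:1] => submit-failed
"""

for task, info in summarize(SAMPLE).items():
    print(task, info["job_id"], info["final_state"])
# prints: 1/1993_2000 3794334 submit-failed
```

A task that shows both a job ID and a final state of submit-failed is the case described above. The job ID can then be looked up with the batch system's own tools (on PBS Professional, `qstat -x <jobid>` reports finished jobs, provided job history is enabled on the site) to check whether the job really was cancelled or actually ran.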

Are there other users on your platform successfully using Cylc? (Which would indicate it has been installed and configured correctly).

The [[[job]]] config section is long deprecated (use the global platforms config instead), but in any case you have no settings in it, so it won't do anything.