Modify or add arguments to the submit command?

Is there a way to modify or add arguments to the submit command (e.g. qsub for PBS)?

In section 11.1.10 you have:

Or separate the items:

-l select=28
-l ncpus=36
-l mpiprocs=18
-l ompthreads=2
-l walltime=12:00:00

However, on my system this results in an error written to job-activity.log:

[jobs-submit cmd] cylc jobs-submit -- /glade/u/home/jedwards/cylc-run/cylctest/log/job 1/run/01

[jobs-submit ret_code] 5

[jobs-submit out] 2019-07-11T14:41:13-06:00|1/run/01|5|None

2019-07-11T14:41:13-06:00 [STDERR] qsub: “-lresource=” cannot be used with “select” or “place”, resource is: ncpus

Looking at the example https://cylc.github.io/doc/built-sphinx/task-job-submission.html#customjobsubmissionmethods
I think that there is an error:
BATCH_SYSTEM_HANDLER should be BATCH_SYS_HANDLER

Hi,

I think this is a bug in the documentation. If you would like to send a pull request to fix it, the offending file is https://github.com/cylc/cylc-doc/blob/master/src/task-job-submission.rst.

Cheers
Bruno

Hi Jim,

qsub: “-lresource=” cannot be used with “select” or “place”, resource is: ncpus

I don’t have access to PBS at the moment, but does that just suggest that PBS (or perhaps just your version of PBS?) doesn’t like use of ncpus and select at the same time? If so, presumably you can just use appropriate directives rather than copying our example ones? But maybe a PBS user can comment.

Is there a way to modify or add arguments to the submit command (e.g. qsub for PBS)?

To modify the submit command, you would derive a new batch system handler from the PBS one - just as in the example that you’ve found in the user guide, but give a new value for the SUBMIT_CMD_TMPL (submit command template) class variable (that you can see in lib/cylc/batch_sys_handlers/pbs.py).
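
A derived handler along those lines might look like the sketch below. It is only an illustration, not something tested against a real PBS system. The PBSHandler class is a local stand-in so that the snippet runs without Cylc installed, the default template value is an assumption to check against lib/cylc/batch_sys_handlers/pbs.py, and WalltimePBSHandler and the job path are made-up names.

```python
# Stand-in for cylc.batch_sys_handlers.pbs.PBSHandler so this sketch runs
# without Cylc installed. In a real handler module you would instead write:
#     from cylc.batch_sys_handlers.pbs import PBSHandler
class PBSHandler(object):
    DIRECTIVE_PREFIX = "#PBS "
    # Assumed default value; check lib/cylc/batch_sys_handlers/pbs.py.
    SUBMIT_CMD_TMPL = "qsub '%(job)s'"


class WalltimePBSHandler(PBSHandler):
    """Hypothetical handler that passes walltime on the qsub command line."""
    SUBMIT_CMD_TMPL = "qsub -l walltime=12:00:00 '%(job)s'"


# Module-level name that the custom handler module is expected to define.
BATCH_SYS_HANDLER = WalltimePBSHandler()

# Expand the template with a job file path (path made up for illustration).
command = BATCH_SYS_HANDLER.SUBMIT_CMD_TMPL % {"job": "/path/to/job"}
print(command)
```

Only the class variable changes; everything else is inherited from the PBS handler.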

Hilary

p.s. I’ve put up a fix for the doc typo: https://github.com/cylc/cylc-doc/pull/35

The statement
-l select=28:ncpus=36:mpiprocs=18:ompthreads=2

must all be on one line, while -l walltime=12:00:00 cannot be on that line. The solution I found was to specify the walltime on the qsub command line by using the batch submit command template variable in suite.rc.

Good spotting, I forgot that we still had that capability - definitely easier than deriving a new batch system handler!

https://cylc.github.io/doc/built-sphinx-single/index.html#overriding-the-job-submission-command

Hi again @Jim,

Regarding the PBS error that you mentioned above, ideally you should not have to modify the qsub command template here. Is there anything we need to change in our documentation or PBS batch system support to make it “just work” out of the box in your environment? (The directives you quoted from the Cylc User Guide are just a random selection of examples, I think, not necessarily meant to be taken literally as a group, i.e. some may be mutually incompatible.) If so, please give an explicit example of what causes the error, and say what PBS version you’re using. (Is it just the use of walltime specified separately in the suite definition?)

Regarding the PBS walltime directive, if you use Cylc’s built-in task execution time limit instead, the walltime directive will be generated automatically for you. As an additional benefit, Cylc will automatically poll jobs that exceed the time limit and yet still seem to be running (e.g. to detect if the job completion message could not be sent back due to a network outage or whatever).

Refs:

Re PBS Walltime:
I considered using the ‘execution time limit’ but then I would need to write a translator from HH:MM:SS to the time format used in that variable.
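
Such a translator can be quite small. The sketch below assumes the execution time limit setting takes an ISO 8601 duration such as PT12H; the function name hms_to_iso8601 is made up for illustration.

```python
import re


def hms_to_iso8601(walltime):
    """Convert 'HH:MM:SS' (e.g. '12:00:00') to an ISO 8601 duration string."""
    match = re.match(r"^(\d+):([0-5]?\d):([0-5]?\d)$", walltime)
    if match is None:
        raise ValueError("expected HH:MM:SS, got %r" % walltime)
    hours, minutes, seconds = (int(part) for part in match.groups())
    duration = "PT"
    if hours:
        duration += "%dH" % hours
    if minutes:
        duration += "%dM" % minutes
    # Always emit at least one component, so '00:00:00' becomes 'PT0S'.
    if seconds or duration == "PT":
        duration += "%dS" % seconds
    return duration


print(hms_to_iso8601("12:00:00"))  # PT12H
print(hms_to_iso8601("00:30:15"))  # PT30M15S
```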

Re PBS submit error:
Our PBS seems to require
-l select=1:ncpus=36:mpiprocs=36:ompthreads=1
all to be on one line, but
-l walltime=HH:MM:SS
to be listed separately. However, as noted in the Cylc documentation, if I add both of these ‘-l’ options only the second one is honored.
According to the PBS documentation for our system, there are several PBS options that must be listed with select, and a host of others that must not be: https://www2.cisl.ucar.edu/resources/computational-systems/cheyenne/running-jobs/submitting-jobs-pbs

I think you just need the directives (the bit before the equals sign) to be different. So, this should work:

[[[directives]]]
-l select = 1:ncpus=36:mpiprocs=36:ompthreads=1
-l walltime = HH:MM:SS

So should this:

[[[directives]]]
-l = select=1:ncpus=36:mpiprocs=36:ompthreads=1
-l walltime = HH:MM:SS

But this won’t work, because both directives have the same name (-l), so only the second one is kept:

[[[directives]]]
-l = select=1:ncpus=36:mpiprocs=36:ompthreads=1
-l = walltime=HH:MM:SS
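
The difference between the three cases can be modelled in a few lines of Python. This is a simplified sketch, not Cylc's actual code. It assumes that directives are stored by name, so a repeated name replaces the earlier value, and that a name containing a space is joined to its value with "=". The function name format_pbs_directives and the 12:00:00 walltime are made up for illustration.

```python
from collections import OrderedDict


def format_pbs_directives(items):
    """Render (name, value) directive pairs as '#PBS ...' job script lines."""
    directives = OrderedDict()
    for key, value in items:
        # A repeated name replaces the earlier value, as in a config section.
        directives[key] = value
    lines = []
    for key, value in directives.items():
        if value and " " in key:
            # Name such as '-l walltime': join name and value with '='.
            lines.append("#PBS %s=%s" % (key, value))
        elif value:
            # Name such as '-l': join name and value with a space.
            lines.append("#PBS %s %s" % (key, value))
        else:
            # Flag with no value.
            lines.append("#PBS %s" % key)
    return lines


select = "1:ncpus=36:mpiprocs=36:ompthreads=1"
distinct = format_pbs_directives(
    [("-l select", select), ("-l walltime", "12:00:00")])
mixed = format_pbs_directives(
    [("-l", "select=" + select), ("-l walltime", "12:00:00")])
clash = format_pbs_directives(
    [("-l", "select=" + select), ("-l", "walltime=12:00:00")])

print(distinct)  # two lines: select and walltime
print(mixed)     # the same two lines
print(clash)     # one line: only walltime survives
```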