Killing rose tasks

Hiya,

Some more hand-holding questions from me: I need to run some (3rd-party) monitoring/post-processing tasks in parallel with the main run task. The monitoring tasks handle run restart logic for funny input; the post-processing tasks handle run output that needs ~continuous post-processing.
When the main run task completes, I want these monitoring/post-processing tasks to stop.
I need to manage this within cylc, as these monitoring/post-processing tasks are unaware of the status of the main run task - they just run permanently/dumbly.

Various questions:

  1. These to-be-killed tasks will already be running, so I believe suicide triggers are irrelevant here, as this isn’t about graph manipulation, right?
  2. All tasks will be rose apps - is the right way to manage the job killing of the to-be-killed tasks still using cylc kill? I can’t see anything in the very handy (thanks!) rose cheat sheet which makes me think otherwise!
  3. Even more paranoid/LBYLish - I’ll be targeting (the MO’s) HPC - so PBS. I see the job killing link above says cylc kill is for “supported batch systems”. Anything I should worry about here?
  4. Finally, I guess I want to be paranoid about only killing the to-be-killed tasks in the current cycle: I don’t think my cycles should overlap, but I might be wrong! Is there a canonical way of targeting named-task-but-only-from-current-cycle?
    • I think I can probably work something out from Hilary’s good suggestion to look at cylc kill --help, and from the nice examples there (thanks!). However I think I’m at risk of coding something clunky, where there may be an elegant way of combining cylc kill with various cylc/rose env vars, or a judicious rose date call. Hence asking about canonical patterns for this!

Thanks, Edmund

Hi Edmund

  1. Suicide triggers don’t kill running tasks so won’t help you here.
    However, once you’ve killed the task it will be in a failed state so, with Cylc 7, you’ll need to use a suicide trigger to remove it and allow the workflow to complete (NB: won’t be an issue with Cylc 8!).

  2. What you’re running in the task is irrelevant - use cylc kill.

  3. I’m not sure why the user guide says “For supported batch systems” - cylc kill works on all submission methods supported by Cylc.

  4. The examples in the help for cylc kill look reasonable to me.
    https://cylc.github.io/cylc-doc/stable/html/appendices/command-ref.html#kill
    All you really need is the cycle point ($CYLC_TASK_CYCLE_POINT).
    https://cylc.github.io/cylc-doc/stable/html/suite-config.html#task-job-script-variables
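
For anyone wanting to see point 4 spelled out, here is a minimal sketch of a "kill only in the current cycle" call for Cylc 7. The helper names `task_id_for_cycle` and `kill_in_current_cycle`, the `DRY_RUN` switch and the task name `watchdog` are all made up for illustration; only `cylc kill`, `CYLC_SUITE_NAME` and `CYLC_TASK_CYCLE_POINT` come from Cylc itself.

```shell
# Hypothetical helper (not part of Cylc): build the Cylc 7 task ID
# "NAME.CYCLE_POINT" for a task or family at the given cycle point.
task_id_for_cycle() {
    printf '%s.%s\n' "$1" "$2"
}

# Hypothetical helper: kill NAME at the calling task's own cycle point only.
# CYLC_SUITE_NAME and CYLC_TASK_CYCLE_POINT are set by Cylc 7 in the task job
# environment. DRY_RUN is an assumption of this sketch: when set to 1 the
# command is printed instead of being run.
kill_in_current_cycle() {
    target="$(task_id_for_cycle "$1" "${CYLC_TASK_CYCLE_POINT:?not set}")"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "cylc kill ${CYLC_SUITE_NAME:?not set} ${target}"
    else
        cylc kill "${CYLC_SUITE_NAME:?not set}" "${target}"
    fi
}
```

Inside a task's script section this would be called as `kill_in_current_cycle watchdog`. Because the cycle point is taken from the killing task's own environment, instances of `watchdog` in other cycles should be left alone.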

Thanks Dave!
For the record, in case it helps anyone facing this type of issue in future, below is the implementation I ended up with. It uses family inheritance and defines a “trigger” family for killing off the watchdog post-processing / directory-linking tasks.

  • Note: I intend to update the graph below so that the watchdog tasks are started by a trigger that better says what I mean (on model_run:start), rather than by the current graph’s “start all run tasks off in parallel”.

graph = """
    [...] =>
    model_precursor => RUN_TRG
    model_run => model_end_kill_mgmt
    # Suicide trigger: remove the killed (failed) RUN_MGMT_TRG members
    model_end_kill_mgmt => !RUN_MGMT_TRG
    [...]
"""
[...]
    # Families for family triggering
    [[TRG]]
        [[[meta]]]
            title = Trigger family
            description = Family to trigger related tasks in graphs
            help = """*_TRG members are used to trigger related tasks in graphs
    Useful for simplifying graph, to group logically-related but non-interdependent
    tasks which can be run/handled in parallel."""
[...]
    [[RUN_MGMT_TRG]]
        inherit = TRG
        [[[meta]]]
            title = Trigger for run management tasks
[...]
    [[model_run]]
        inherit = RUN_VIZ, RUN_TSK, RUN_TRG
        [[[meta]]]
            title = Runs the model

    [[model_run_postprocess]]
        inherit = RUN_MGMT_VIZ, RUN_MGMT_POSTPROC_TSK, RUN_TRG, RUN_MGMT_TRG
        [[[meta]]]
            title = Runs model tool to post-process model output

    [[model_run_link_restartdir]]
        inherit = RUN_MGMT_VIZ, RUN_MGMT_RESTART_TSK, RUN_TRG, RUN_MGMT_TRG
        [[[meta]]]
            title = Runs model tool to link most recent restart directory

    [[model_end_kill_mgmt]]
        inherit = RUN_END_VIZ, RUN_END_TSK
        # Kill all RUN_MGMT_TRG members, but only at this task's own cycle point
        script = cylc kill "${CYLC_SUITE_NAME}" "RUN_MGMT_TRG.${CYLC_TASK_CYCLE_POINT}"
        [[[meta]]]
            title = Kills run management tasks
            description = """Kills the following run management tasks,
            which would otherwise run indefinitely:
            * model_run_postprocess
            * model_run_link_restartdir
            """