Cylc xtrigger question

Rosalyn_Hatcher · October 7, 2025, 10:38am

Hi,

I’ve been experimenting with xtriggers to detect the arrival of a data file. The polling xtrigger function correctly detects the incoming data file and returns its contents, which is a directory path. Is the result then only broadcast to the first dependent/downstream task in the workflow rather than to all of them?

The workflow (toy example below) has several downstream tasks which all need the information from the xtrigger function (a directory path to be processed). When run, the poll_dataset_dir variable is only available to task1. Is there a way to make it available to task2 and tidy_up as well?

 [[xtriggers]]
     poll = poll_flags(flags_dir="/path/to/incoming", pattern="dataset-*.txt", cycle="%(point)s", sequential=True):PT10S

 [[graph]]
     P1 = """
          @poll => task1 => task2 => tidy_up
     """
 [[COMMON]]
    [[[environment]]]
       DATASET_DIR = $poll_dataset_dir

 [[task1]]
      inherit = COMMON
      platform = plat1
      script = "echo 'Running task1'; echo $DATASET_DIR; sleep 30s"

 [[task2]]
      inherit = COMMON
      platform = plat2
      script = "echo 'Running task2'; echo $DATASET_DIR; sleep 30s"

Not sure if I’m doing something wrong or misunderstanding the documentation. Was trying to avoid doing it the old way and writing the data to the filesystem.

Thanks,

Cheers, Ros.

oliver.sanders · October 7, 2025, 1:22pm

Hmmm,

From an inspection of the code, it looks like the broadcast targets only the task(s) directly downstream of the xtrigger. So if you want to share the result of an xtrigger with a task, add it as a direct dependency.

Let us know if there’s any misleading documentation (likely) and we’ll sort it out.

Note, ext-triggers (not to be mistaken for xtriggers) seem to broadcast their results to the whole cycle (External Triggers — Cylc 8.6.0 documentation).

hilary.j.oliver · October 7, 2025, 8:18pm

The documentation seems to be correct on this:

Xtrigger functions must return a flat dictionary of results to be broadcast to dependent tasks, via environment variables…

The general idea is, tasks that need the xtrigger result should depend on it, and we don’t broadcast more widely - to avoid pumping task environments with information that’s not needed.

So you can either make more tasks depend on the xtrigger, or have the dependent tasks distribute the result more widely. The former is preferable IMO - real dependence should be reflected by dependencies in the graph.
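
For reference, here is a minimal sketch of what a file-polling xtrigger function of this kind might look like. Ros's actual poll_flags implementation is not shown in the thread, so the body is an assumption: it takes the first flag file matching the pattern and returns its contents under a hypothetical dataset_dir key. The cycle argument is accepted but unused, and the sequential argument is left out of the sketch.

```python
from pathlib import Path
from typing import Dict, Tuple


def poll_flags(flags_dir: str, pattern: str, cycle: str) -> Tuple[bool, Dict[str, str]]:
    """Hypothetical file-polling xtrigger (not Ros's real implementation).

    Returns (satisfied, results), where results is a flat dictionary.
    """
    # Look for flag files such as dataset-001.txt in the incoming directory.
    flag_files = sorted(Path(flags_dir).glob(pattern))
    if not flag_files:
        # Nothing has arrived yet: not satisfied, nothing to broadcast.
        return False, {}
    # Assumption: the flag file holds the directory path to process.
    dataset_dir = flag_files[0].read_text().strip()
    return True, {"dataset_dir": dataset_dir}
```

With the xtrigger labelled poll in the graph, the dataset_dir key would reach the dependent tasks as $poll_dataset_dir, which matches the variable used in the example workflow.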

Rosalyn_Hatcher · October 8, 2025, 1:36pm

Thanks Oliver & Hilary.

So I guess the problem I have with the above example is that task2 is dependent on task1 running first and making both directly dependent on the xtrigger doesn’t work (and probably doesn’t really make sense).

[[graph]]
P1 = """
@poll => task1 & task2
task1 => task2 => tidy_up
"""

If I’m understanding correctly, I would then need an extra local task, dependent on the xtrigger, that could do the broadcast to task1 & task2?

hilary.j.oliver · October 8, 2025, 6:47pm

Hi Ros,

No, that should be fine (let us know if it isn’t!) - it just means task2 waits on task1 and the xtrigger.

Hilary

Rosalyn_Hatcher · October 9, 2025, 8:43am

Hi Hilary,

Thanks for verifying that should work. Unfortunately, I’ve just tried it again and it’s not working.

When the xtrigger is satisfied task1 runs. When task1 has completed successfully task2 does not run. The workflow stalls with task2 waiting for the xtrigger to run again.

This is with Cylc-8.5.3

Thanks,
Ros


[scheduling]
    cycling mode = integer
    initial cycle point = 1
    final cycle point = 2

[[xtriggers]]
    poll = poll_flags(flags_dir="/path/to/incoming/data", pattern="dataset-*.txt", cycle="%(point)s", sequential=True):PT10S

[[graph]]
    P1 = """
         @poll => task1 & task2
         task1 => task2
         """
[runtime]
    [[root]]
        script = "echo 'Root script called'"
    [[COMMON]]
        [[[environment]]]
            DATASET_DIR = $poll_dataset_dir
    [[task1]]
        inherit = COMMON
        script = "echo 'Running task1'; echo $DATASET_DIR; sleep 5s"
    [[task2]]
        inherit = COMMON
        script = "echo 'Running task2'; echo $DATASET_DIR; sleep 5s"

hilary.j.oliver · October 9, 2025, 9:54pm

Hi Ros,

OK at first glance, some behaviour that I didn’t expect - but it should not be stalling your workflow.

[[xtriggers]]
    poll = poll_flags(..., cycle="%(point)s", ...):PT10S
[[graph]]
    P1 =  @poll => task1 & task2

Here, a single successful @poll call will satisfy both tasks (at one cycle point), but adding this:

"task1 => task2"

results in the xtrigger being re-called later for task2 - because the scheduler housekeeps xtrigger results once no active tasks remain that depend on them - and now task2 only gets spawned when task1 succeeds, after the old result is gone - hence the re-call.

However this should not stall your workflow so long as @poll returns the same result on re-call - can you check that’s the case please?

You can avoid the re-call by making task2 enter the active window earlier, if you like:

    dummy => task1 & task2  # spawn both into the active window immediately
...
   [[dummy]]
       run mode = skip

A plot twist - this example worked without re-calling the xtrigger prior to a (valid) bug fix in Cylc 8.5: the scheduler had been checking active tasks for all xtriggers, not just unsatisfied ones, so it would retain the result for the already-satisfied task1, not for the (future, unsatisfied) task2!

We’ll discuss on the team whether xtrigger housekeeping needs to be tweaked somehow…
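
The effect described above can be illustrated with a toy model. This is a sketch only, not Cylc's scheduler code: simulate and make_consuming_xtrigger are hypothetical names, and the model assumes that a result is forgotten as soon as no active task is still waiting on it, so each batch of tasks entering the active window costs one fresh xtrigger call.

```python
from collections import deque
from typing import Callable, Dict, List, Optional


def make_consuming_xtrigger(datasets: List[str]) -> Callable[[], Optional[str]]:
    """Build an xtrigger-like callable that hands out each dataset only once."""
    queue = deque(datasets)

    def xtrigger() -> Optional[str]:
        # None stands for "not satisfied": nothing is left to pick up.
        return queue.popleft() if queue else None

    return xtrigger


def simulate(
    batches: List[List[str]], xtrigger: Callable[[], Optional[str]]
) -> Dict[str, Optional[str]]:
    """Toy model: tasks in one batch enter the active window together."""
    received: Dict[str, Optional[str]] = {}
    for batch in batches:
        # Any earlier result has been housekept by now, because no active
        # task was still waiting on it, so the xtrigger is called again.
        result = xtrigger()
        for task in batch:
            # Every task in the batch shares the result of that single call.
            received[task] = result
    return received


# Both tasks active together: one call satisfies both.
together = simulate([["task1", "task2"]], make_consuming_xtrigger(["/data/A"]))
# task2 spawned only after task1 succeeds: the re-call finds nothing.
chained = simulate([["task1"], ["task2"]], make_consuming_xtrigger(["/data/A"]))
```

In this model, together maps both tasks to "/data/A", while chained leaves task2 with None, which corresponds to task2 waiting on an xtrigger that is no longer satisfied.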

hilary.j.oliver · October 10, 2025, 12:09am

Unfortunately a log message is being dropped that would make the re-call more obvious (it should say “commencing xtrigger call sequence…” again before task2) - I’ll get that fixed - but you will see the re-call in debug mode.

Here’s the example I used, with the built-in toy xrandom xtrigger:

[scheduling]
    cycling mode = integer
    initial cycle point = 1
    final cycle point = 1
    [[xtriggers]]
        poll = xrandom(percent=100, _="%(point)s"):PT10S
    [[graph]]
        P1 = """
            @poll => task1 & task2
            task1 => task2 
            # dummy => task1 & task2  # spawn both early
        """
[runtime]
    [[dummy]]
        run mode = skip
    [[POLLED]]
        script = "echo ${poll_COLOR}, ${poll_SIZE}"
    [[task1, task2]]
        inherit = POLLED

hilary.j.oliver · October 10, 2025, 12:55am

Follow-up:

github.com/cylc/cylc-flow

Fix small xtrigger housekeeping bug.

master ← hjoliver:xtrigger-housekeeping-fix

opened 12:18AM - 10 Oct 25 UTC

hjoliver

+5 -0

Found while investigating [this forum post](https://cylc.discourse.group/t/cylc-…xtrigger-question/1255/6). The "next call" dict in `xtrigger_mgr` is not being housekept, so if the same xtrigger comes back into the pool later the "commencing xtrigger call" message will not be logged again.

github.com/cylc/cylc-flow

xtrigger housekeeping tweak?

opened 12:51AM - 10 Oct 25 UTC

hjoliver

bug?

This [forum post](https://cylc.discourse.group/t/cylc-xtrigger-question/1255/6) …highlighted a "feature" of xtrigger housekeeping that could be improved (and *maybe* it is a bug, on rethink).

We claim, *all tasks that depend on the same xtrigger will be satisfied by a single successful call*. But that's not quite true: we housekeep (forget) xtrigger results if there are no active tasks that depend on them. If another dependent task comes along after that, the scheduler will call the same xtrigger again.

If an xtrigger function can be expected to stay satisfied if called again, the only consequences of this are:

- a small inefficiency (technically re-calling the function should not be necessary)
- an unnecessary delay (of the xtrigger call interval) for the later task

But this could stall the workflow if the xtrigger does not stay satisfied. E.g. consider an xtrigger that checks for presence of a file and moves it somewhere else if found - that doesn't sound unreasonable 🤔

To change this, I think we'd have to go to the DB every time an unsatisfied xtrigger shows up in the pool, to see if it had been satisfied in the past.

Rosalyn_Hatcher · October 14, 2025, 6:53pm

Hi Hilary,

Yes, @poll will return a different result on re-call, as the xtrigger is being used to detect incoming datasets, so each dataset will only be picked up once. So it looks like we will have to either add the dummy task to avoid the re-call or make the dependent task distribute the result more widely.

Thanks for all your help with this.

Cheers,
Ros
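
One way to read the second option (the dependent task distributing the result more widely) is for task1 to re-broadcast the value itself with the cylc broadcast command. The sketch below only builds the command line; broadcast_command is a hypothetical helper, and the choice of target tasks and variable name is an assumption based on the toy workflow above.

```python
from typing import List


def broadcast_command(
    workflow_id: str,
    cycle_point: str,
    namespaces: List[str],
    name: str,
    value: str,
) -> List[str]:
    """Build a `cylc broadcast` command line (hypothetical helper)."""
    cmd = ["cylc", "broadcast", workflow_id, "--point", cycle_point]
    for namespace in namespaces:
        # One --namespace option per task that should see the variable.
        cmd += ["--namespace", namespace]
    # Set the environment variable for the targeted tasks at this cycle point.
    cmd += ["--set", f"[environment]{name}={value}"]
    return cmd
```

Inside task1 this could be run with subprocess.run(cmd, check=True), taking the workflow ID and cycle point from $CYLC_WORKFLOW_ID and $CYLC_TASK_CYCLE_POINT and the value from $poll_dataset_dir. With that approach task2 and tidy_up would no longer need to depend on the xtrigger directly.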
