[scheduling]
    [[xtriggers]]
        # the :PT10M bit specifies how often Cylc should run this
        my_poller = file_poll(file='/path/to/file'): PT10M
    [[graph]]
        P1D = """
            @my_poller => do_something
        """
Note: xtriggers run on the scheduler host (the place where the Cylc workflow process lives), so this approach only works if the relevant filesystem is visible from there.
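For illustration, here is a minimal sketch of what a function behind `file_poll` could look like. It is not a built-in Cylc xtrigger and the name and argument are simply taken from the config above. It assumes the usual xtrigger convention of returning a `(satisfied, results)` tuple, and that the module is named after the function (`file_poll.py`) and placed somewhere Cylc can import it, such as the workflow's `lib/python/` directory.

```python
import os


def file_poll(file):
    """Hypothetical xtrigger: satisfied once `file` exists on the scheduler host."""
    satisfied = os.path.exists(file)
    # Assumed convention: return (satisfied, results), where the results
    # dict is only meaningful once the trigger is satisfied.
    results = {"file": file} if satisfied else {}
    return satisfied, results
```

Because the check is a plain `os.path.exists` call, it only sees paths that are mounted on the scheduler host, which is the limitation described in the note above.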
I’ve got xtriggers to handle remote/local file polling. I always meant to contribute them to the cylc-xtriggers repository, but I haven’t made them Cylc 8/Python 3 compatible, so I haven’t shared them outside of my organisation. I could probably attach them here if you wanted them. The code is a bit complex because I give people the option to:
- poll for remote or local files
- make sure the appropriate number of files exist, if they come in sporadically
- not start polling until some time period after a cycle point
- finish polling some time after a cycle point, to allow things to continue running without data
- finish polling some time after the first poll was attempted
- check that the file age is recent enough for their purposes (sometimes filenames don’t change, the contents just get updated)
- avoid actioning the same file multiple times (for on-demand models which get triggered by a file arrival)
- skip polling if a desired previous task has not reached a chosen status
- apply string replacements to a filename, to insert zero-padded or non-padded date information
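To make a few of these options concrete, here is a rough Python 3 sketch of the file-count, file-age and filename-substitution checks for local files only. It is not the code described above: the function names, parameters and the `{dt:...}` template syntax are all assumptions made for illustration, and it assumes cycle points formatted like `20240105T0000Z`.

```python
import glob
import os
import time
from datetime import datetime


def expand_filename(template, cycle_point):
    """Insert date fields from a cycle point string into a filename template.

    `{dt:%d}` gives a zero-padded day, `{dt.day}` a non-padded one.
    """
    # Assumption: cycle points look like 20240105T0000Z.
    when = datetime.strptime(cycle_point, "%Y%m%dT%H%MZ")
    return template.format(dt=when)


def files_ready(pattern, min_count=1, max_age=None, now=None):
    """Hypothetical check: enough matching files exist and are recent enough."""
    now = time.time() if now is None else now
    matches = sorted(glob.glob(pattern))
    if max_age is not None:
        # Keep only files modified within the last `max_age` seconds.
        matches = [f for f in matches if now - os.path.getmtime(f) <= max_age]
    satisfied = len(matches) >= min_count
    results = {"files": " ".join(matches)} if satisfied else {}
    return satisfied, results
```

For example, `expand_filename("obs_{dt:%Y%m%d}_{dt.day}.nc", "20240105T0000Z")` yields a name containing both the padded date and the non-padded day. Remote polling, polling windows and task-status checks are left out to keep the sketch short.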
I also have a file_contains polling xtrigger that checks the contents of a file for a regex match and/or a number of lines. I think it works, but I haven’t got around to using it because I’ve been distracted. It did work once upon a time, at least.
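As a rough idea of what such a check involves, here is a minimal Python 3 sketch. It is an illustration only, not the actual xtrigger: the parameter names `pattern` and `min_lines` are assumptions, and treating the line count as a minimum is one possible interpretation.

```python
import re


def file_contains(file, pattern=None, min_lines=None):
    """Hypothetical check: file has a line matching `pattern` and/or enough lines."""
    try:
        with open(file) as handle:
            lines = handle.read().splitlines()
    except OSError:
        # A missing or unreadable file just means "not satisfied yet".
        return False, {}
    if min_lines is not None and len(lines) < min_lines:
        return False, {}
    if pattern is not None and not any(re.search(pattern, line) for line in lines):
        return False, {}
    return True, {"file": file, "line_count": str(len(lines))}
```

Returning `False` rather than raising on a missing file keeps the poller quiet until the file turns up.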
They are Python 2 and only tested with Cylc 7.8.4. I guarantee they won’t work with Cylc 8/Python 3. If someone could help with that, update the documentation to include descriptions, and figure out some way to test them properly (I do have some very limited pytest tests, which aren’t added in this change as they don’t add much value), I would be happy to see them move into cylc-flow itself. I just don’t have time to figure out good end-to-end automated testing for them, which I feel is necessary given the complexity.