Using integer cycle point as an index to a list of parameters

stewells · November 30, 2023, 9:26am

Hi,

I am just getting started using cycle points, and I am trying to cycle over a workflow where each iteration is assigned a specific variable to be used by the tasks (specifically a date as a string). I can run a single iteration by defining one of the Cylc parameters to be the date string, and then referring to this within the tasks. However, I would like to be able to provide a list of dates that could be cycled over one after the other (not all together). Can I use integer cycling to index the list of dates and pick one for each cycle? Or perhaps there is a better alternative to achieve the same outcome?
Many thanks,

MetRonnie · November 30, 2023, 11:36am

I am not sure what you are trying to describe here, but here are a couple of possible pointers for help (I am assuming you are using Cylc 8):

- The datetime cycling tutorials, if you haven’t already gone through them.
- You can use the CYLC_TASK_CYCLE_POINT environment variable in task runtime scripts. This is the current cycle point of the task at runtime. See the Cylc documentation for the full list of environment variables.
- You can also use the isodatetime command to fiddle about with datetimes in the task script. Try isodatetime --help for info.

Do you have a workflow file you can share so we can better understand?

wxtim · November 30, 2023, 11:44am

If you want to iterate over a list of dates I’d use date-time cycling, not integer. I’m guessing that your workflow might look a bit like:

[scheduling]
    initial cycle point = 1066
    stop after cycle point = 1086
    [[graph]]
        # Create a dependency on the previous task
        P1Y = my_task[-P1Y] => my_task
[runtime]
    [[my_task]]
        script = """
            # Demo of how weird a format isodatetime can output.
            TIME_POINT=$(isodatetime "$CYLC_TASK_CYCLE_POINT" --print-format "YEAR=%Y MONTH=%m DAY=%d")
            # Run your script:
            my_script --date="${TIME_POINT}"
        """

Is there any particular reason why the tasks need to be sequential? Does one produce data for the next, or are you just wanting to avoid overwhelming a computer with too many tasks?

stewells · November 30, 2023, 11:59am

Thanks for the quick response. This looks like it would do what I’m after, thank you!

I was originally thinking along the lines of having

[cylc]
   [[parameters]]
      originTime = '20180812', '20180912',

and then each iteration of the workflow would use the next originTime in the list (using the integer cycle point as a way to pick the right originTime to apply in the tasks). However, the dates are invariably consecutive/regular, so using your above suggestion works nicely.
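For reference, the indexing idea described above could be sketched in a task script roughly as follows. This is only an illustration under stated assumptions: the workflow uses integer cycling starting at 1, ORIGIN_TIMES and pick_origin_time are hypothetical names, and the two dates are the ones from the example above.

```shell
# Hypothetical list of origin times, one per integer cycle point (assumed to start at 1).
ORIGIN_TIMES="20180812 20180912"

# Print the origin time for a 1-based integer cycle point; fail if it is out of range.
pick_origin_time() {
    index=0
    for origin_time in $ORIGIN_TIMES; do
        index=$((index + 1))
        if [ "$index" -eq "$1" ]; then
            echo "$origin_time"
            return 0
        fi
    done
    echo "No origin time defined for cycle point $1" >&2
    return 1
}

# Cylc sets CYLC_TASK_CYCLE_POINT at runtime; default to 1 so the sketch runs standalone.
ORIGIN_TIME=$(pick_origin_time "${CYLC_TASK_CYCLE_POINT:-1}")
echo "originTime=${ORIGIN_TIME}"
```

With regularly spaced dates this lookup is unnecessary, since the datetime cycle point already carries the date.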

Each cycle currently needs to be sequential to prevent too many simultaneous attempts to access a database by some of the tasks. This is why I haven’t used the above parameter setup and then defined the graph as, e.g.,

A<originTime> => B<originTime> => C

However, if I understand correctly, a careful choice of when the next cycle is triggered (ie as soon as the task requiring the database is complete) could streamline the cycling?
Many thanks,

wxtim · November 30, 2023, 12:59pm

Direct answer

However, if I understand correctly, a careful choice of when the next cycle is triggered (ie as soon as the task requiring the database is complete) could streamline the cycling?

Yes

Detailed explanation

If you had something like:

database_heavy[-P1Y] => database_heavy
first_task => database_heavy => something_else

You would end up with only one database_heavy task at a time.

Alternative way of doing it: Queues

Alternatively (if “too many simultaneous attempts to access a database” > 1), you could do something like

[scheduling]
    [[queues]]
        [[[database_use]]]
            limit = 2
            members = database_heavy

Unrelated point: Cycle point format

It also occurs to me to mention that you can set the cycle point format (see the Workflow Configuration page of the Cylc 8.2.2 documentation), which might save you some messing around with isodatetime.
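As a small illustration of avoiding isodatetime for simple cases: if the cycle point has the form 20180812T0000Z (assumed here to be the format in use), the date part can be pulled out with plain shell parameter expansion. The function name cycle_point_date is hypothetical, and the fallback value is only there so the sketch runs outside a workflow.

```shell
# Print the date portion (CCYYMMDD) of a cycle point such as 20180812T0000Z.
cycle_point_date() {
    # Strip everything from the first "T" onwards.
    echo "${1%%T*}"
}

# Cylc sets CYLC_TASK_CYCLE_POINT at runtime; the default is an assumed example value.
ORIGIN_TIME=$(cycle_point_date "${CYLC_TASK_CYCLE_POINT:-20180812T0000Z}")
echo "originTime=${ORIGIN_TIME}"
```

If the cycle point format is changed in the workflow configuration, the expansion would need adjusting or may not be needed at all.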

Whatever you do, datetime cycling is pretty close to Cylc’s raison d’être, and using parameters to iterate over datetimes is a bit of an anti-pattern.

stewells · November 30, 2023, 2:13pm

Super, thank you so much for your help.
