Using integer cycle point as an index to a list of parameters


I am just getting started with cycle points, and I am trying to cycle over a workflow in which each iteration is assigned a specific variable for the tasks to use (specifically, a date as a string). I can run a single iteration by defining one of the Cylc parameters to be the date string and then referring to it within the tasks. However, I would like to provide a list of dates to be cycled over one after the other (not all together). Can I use integer cycling to index the list of dates and pick one for each cycle? Or is there perhaps a better alternative to achieve the same outcome?
Many thanks,

I am not sure what you are trying to describe here, but here are a couple of possible pointers for help (I am assuming you are using Cylc 8):

  • The datetime cycling tutorials if you haven’t already gone through them
  • You can use the CYLC_TASK_CYCLE_POINT environment variable in task runtime scripts. This is the current cycle point of the task at runtime. See here for the full list of environment variables
    • You can also use the isodatetime command to fiddle about with datetimes in the task script. Try isodatetime --help for info.
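
As a rough sketch of the last two pointers, a task can read its own cycle point and reformat or shift it with isodatetime. The task name my_task and the one-day offset are only illustrations, not taken from your workflow:

    [runtime]
        [[my_task]]
            script = """
                # Cycle point of this task instance, reformatted as YYYYMMDD.
                POINT=$(isodatetime "$CYLC_TASK_CYCLE_POINT" --print-format=%Y%m%d)
                # The same point shifted forward by one day (illustrative offset).
                NEXT_DAY=$(isodatetime "$CYLC_TASK_CYCLE_POINT" --offset=P1D --print-format=%Y%m%d)
                echo "point=${POINT} next_day=${NEXT_DAY}"
            """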

Do you have a workflow file you can share so we can better understand?

If you want to iterate over a list of dates I’d use date-time cycling, not integer. I’m guessing that your workflow might look a bit like:

    [scheduling]
        initial cycle point = 1066
        stop after cycle point = 1086
        [[graph]]
            # Create a dependency on the previous task
            P1Y = my_task[-P1Y] => my_task

    [runtime]
        [[my_task]]
            script = """
                # Demo of how weird a format isodatetime could output.
                TIME_POINT=$(isodatetime "$CYLC_TASK_CYCLE_POINT" --print-format "YEAR=%Y MONTH=%m DAY=%d")
                # Run your script:
                my_script --date="${TIME_POINT}"
            """

Is there any particular reason why the tasks need to be sequential? Does one produce data for the next, or are you just wanting to avoid overwhelming a computer with too many tasks?

Thanks for the quick response. This looks like it would do what I’m after, thank you!

I was originally thinking along the lines of having

      originTime = '20180812', '20180912',

and then each iteration of the workflow would use the next originTime in the list (using the integer cycle point as a way to pick the right originTime to apply in the tasks). However, the dates are invariably consecutive/regular, so using your above suggestion works nicely.
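
For reference, the integer-indexing idea could look roughly like the sketch below. It assumes integer cycling starting at 1, a two-entry bash array mirroring the originTime list, and my_script as a stand-in for the real command. It is only an illustration of the idea, not a recommended pattern:

    [scheduling]
        cycling mode = integer
        initial cycle point = 1
        final cycle point = 2
        [[graph]]
            # Run the cycles one after the other.
            P1 = my_task[-P1] => my_task

    [runtime]
        [[my_task]]
            script = """
                # Assumed list of dates, one per cycle.
                ORIGIN_TIMES=(20180812 20180912)
                # Cycle points start at 1 here, bash arrays at 0.
                ORIGIN_TIME="${ORIGIN_TIMES[CYLC_TASK_CYCLE_POINT - 1]}"
                my_script --date="${ORIGIN_TIME}"
            """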

Each cycle currently needs to be sequential to prevent too many simultaneous attempts to access a database by some of the tasks. This is why I haven’t used the above parameter setup and then defined the graph as, e.g.:

    A<originTime> => B<originTime> => C

However, if I understand correctly, a careful choice of when the next cycle is triggered (i.e. as soon as the task requiring the database is complete) could streamline the cycling?
Many thanks,


Direct answer

“However, if I understand correctly, a careful choice of when the next cycle is triggered (i.e. as soon as the task requiring the database is complete) could streamline the cycling?”

Yes.

Detailed explanation

If you had something like:

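    database_heavy[-P1Y] => database_heavy
    first_task => database_heavy => something_else

You would end up with only one database_heavy task running at a time, because each cycle's database_heavy waits for the previous cycle's database_heavy to finish. The other tasks have no inter-cycle dependency, so they are not held back in the same way.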
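
Put into a full graph section, that might look like the following sketch. The cycle points are borrowed from the earlier example and the yearly P1Y interval is an assumption, so substitute whatever matches your dates:

    [scheduling]
        initial cycle point = 1066
        stop after cycle point = 1086
        [[graph]]
            P1Y = """
                # Within a cycle: the database task sits between the other two.
                first_task => database_heavy => something_else
                # Across cycles: only database_heavy waits for its predecessor.
                database_heavy[-P1Y] => database_heavy
            """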

Alternative way of doing it: Queues

Alternatively, if the database can cope with more than one task at a time (i.e. “too many simultaneous attempts to access a database” is greater than 1), you could do something like:

            limit = 2
            members = database_heavy
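
In a complete flow.cylc these two settings sit in a named queue under [scheduling][[queues]]. A minimal sketch, where the queue name database_queue is made up for illustration:

    [scheduling]
        [[queues]]
            # Hypothetical queue name; any valid name should do.
            [[[database_queue]]]
                # At most two member tasks are submitted at once.
                limit = 2
                members = database_heavy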

Unrelated point: Cycle point format

It also occurs to me to mention that you can set the cycle point format (see the Workflow Configuration page of the Cylc 8.2.2 documentation), which might save you some messing around with isodatetime.
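
As a sketch of what that could look like for date strings such as 20180812: the CCYYMMDD format, the monthly P1M interval, and the my_script call are assumptions based on the examples earlier in the thread, so adjust them to your case.

    [scheduler]
        # Cycle points are then written as e.g. 20180812.
        cycle point format = CCYYMMDD

    [scheduling]
        initial cycle point = 20180812
        stop after cycle point = 20180912
        [[graph]]
            P1M = my_task[-P1M] => my_task

    [runtime]
        [[my_task]]
            # The cycle point should already be in the wanted format.
            script = my_script --date="${CYLC_TASK_CYCLE_POINT}"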

Whatever you do, datetime cycling is pretty close to Cylc’s raison d’être, and using parameters to iterate over datetimes is a bit of an anti-pattern.

Super, thank you so much for your help.
