Date Time Cycling Recurrence Syntax - Start Date / Stop Date / Increment?

Hi there,

I would like to ask whether there is any support for a date-time cycling syntax that would resemble a

Start Date / Stop Date / Increment

definition in Cylc 8.3.4. I have read the manual on the Cycling syntax rules and I see that there are currently listed syntaxes following Format 3

Recurrence Limit / Start Date / Increment

and Format 4

Recurrence Limit / Increment / Stop Date

that could produce the same effect, but for a specific use case I am hoping to use the first syntax listed above. I think this could help simplify some workflow templating where I wish to run an offline data assimilation case study in which the experiment goes through three distinct periods:

  • First forecast, cold start.
  • Warmup period in which 6-hourly short-range forecasts are generated with DA cycling.
  • Extended forecast period in which 6-hourly short range forecasts are generated with DA cycling and, on certain cycles, extended forecasts are generated using a restart run from the 6-hourly short range forecast using additional tasks / dependencies.

I can see how I can define each of the periods above by setting Format 3 or Format 4 parameters appropriately using the number of planned recurrences for each period, but for template simplicity in this use case I’d prefer to only set the following parameters:

  • First forecast date
  • Warmup start date
  • Warmup stop date = Extended forecast period start date
  • Extended forecast period stop date

Is this feasible currently?

Thanks for your consideration.

Cheers,
Colin

Hi,

Unfortunately, ISO8601 does not provide an R[INT]/START/END/INTERVAL format. That would make the solution much cleaner, but it is still possible without it, just a bit awkward.

For my solution I’ve used a couple of tricks to make this work:

  • Cycle arithmetic:
    • Cylc can perform some basic maths, e.g. if you write 2000+P1Y, Cylc will evaluate this as 2001.
    • Note, ^ and $ reference the initial cycle point and final cycle point respectively.
  • Exclusions:
    • Cylc can exclude specific dates or recurrences from a recurrence, e.g. P1Y ! 2000 will run every year, except 2000.
    • Use exclusions sparingly; there is a small performance impact.

#!Jinja2

{% set first_forecast = '2000' %}
{% set warmup_period = 'P2Y' %}
{% set last_forecast = '2006' %}

{% set base_interval = 'P1Y' %}

[scheduler]
    allow implicit tasks = True

[scheduling]
    initial cycle point = {{ first_forecast }}
    final cycle point = {{ last_forecast }}
    [[graph]]
        # spinup
        R1/^ = spinup => pre => forecast => post

        # warmup period
        R/{{ base_interval }}/^+{{ warmup_period }} ! ^ = """
            forecast[-{{ base_interval }}] => pre & da => forecast => post
            forecast[-{{ base_interval }}] => forecast
        """

        # transition into extended forecast
        R1/^+{{ warmup_period }}+{{ base_interval }} = """
            forecast[-{{ base_interval }}] => forecast_extended
        """

        # extended forecast
        R/^+{{ warmup_period }}+{{ base_interval }}/{{ base_interval }} = """
            pre & da => forecast_extended => post
        """
        R/^+{{ warmup_period }}+{{ base_interval }}+{{ base_interval }}/{{ base_interval }} = """
            forecast_extended[-{{ base_interval }}] => forecast_extended
        """

        # teardown
        R1/$ = """
            post[-{{ base_interval }}] => teardown
        """

Note, I’ve combined the warmup start date and warmup stop date inputs into warmup_period to ensure the warmup cycling interval always follows immediately after the first forecast.
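
If you would rather keep date inputs in the template, the period can be derived from them. Here is a minimal sketch in plain Python (standard library only), assuming UTC and whole-hour spacing. The name period_between is made up for illustration and is not part of Cylc:

```python
from datetime import datetime, timedelta


def period_between(start, stop):
    """Return the gap between two datetimes as an ISO8601 duration in hours.

    Hypothetical helper: assumes both values are in the same time zone
    and that the gap is a whole number of hours.
    """
    delta = stop - start
    if delta < timedelta(0):
        raise ValueError("stop must not be earlier than start")
    hours, remainder = divmod(delta, timedelta(hours=1))
    if remainder:
        raise ValueError("gap is not a whole number of hours")
    return f"PT{hours}H"


# First forecast at 2020-01-01T00, warmup stops at 2020-01-03T00.
print(period_between(datetime(2020, 1, 1, 0), datetime(2020, 1, 3, 0)))  # PT48H
```

The resulting string could then be used as the value of warmup_period in the workflow above, for example by computing it in whatever script writes out the template variables.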


Alternative Solution

There is another type of solution, which I’ll include here for completeness…

Cylc is perfectly capable of implementing the R/START/STOP/INCREMENT format you want; it’s just that ISO8601 does not provide a syntax for it. We can, however, implement it ourselves.

Here’s the maths bit; put this in lib/python/generate.py:

from metomi.isodatetime.parsers import TimePointParser, DurationParser


# Treat time points that do not specify a time zone as UTC.
TPP = TimePointParser(assumed_time_zone=(0, 0))
DP = DurationParser()


def generate(start, stop, duration, stop_inclusive=False):
    """Generate cycles between start and stop.

    This implements the recurrence format: R/start/stop/interval

    Note, this assumes UTC.

    This approach can be used to pre-generate cycles in Cylc workflows.
    Be aware that explicitly generating cycle points in this way is less
    efficient than allowing Cylc to generate them on the fly, so use this
    approach sparingly as it may cause Cylc to use more CPU than you might
    like.

    Args:
        start: The start cycle as an ISO8601 date (e.g. 2000 or 20000101T00)
        stop: The stop cycle as an ISO8601 date (e.g. 2000 or 20000101T00)
        duration: The cycling interval as an ISO8601 duration (e.g. P1Y)
        stop_inclusive: If True, then "stop" will be included in the results.

    Yields:
        ISO8601 datetimes.

    """
    start = TPP.parse(start)
    stop = TPP.parse(stop)
    duration = DP.parse(duration)

    pointer = start
    if stop_inclusive:
        while pointer <= stop:
            yield pointer
            pointer = pointer + duration
    else:
        while pointer < stop:
            yield pointer
            pointer = pointer + duration

Then use it in the flow.cylc file like so:

#!Jinja2

{% set first_forecast = '2000' %}
{% set warmup_start = '2002' %}
{% set warmup_stop = '2004' %}
{% set last_forecast = '2006' %}

{% set base_interval = 'P1Y' %}

{% from "generate" import generate %}

[scheduler]
    allow implicit tasks = True

[scheduling]
    initial cycle point = {{ first_forecast }}
    final cycle point = {{ last_forecast }}
    [[graph]]
        # spinup
        R1/^ = spinup

        # cold start
        {% for date in generate(first_forecast, warmup_start, base_interval) %}
            R1/{{ date }} = """
                spinup[^] => pre => forecast => post
                forecast[-{{ base_interval }}] => forecast
            """
        {% endfor %}

        # warmup period
        {% for date in generate(warmup_start, warmup_stop, base_interval, stop_inclusive=True) %}
            R1/{{ date }}-{{ base_interval }} = """
                spinup[^] => pre & da => forecast => post
                forecast[-{{ base_interval }}] => forecast
            """
        {% endfor %}

        R1/{{ warmup_stop }} = forecast[-{{ base_interval }}] => forecast_extended

        # extended forecast period
        {% for date in generate(warmup_stop, last_forecast, base_interval) %}
            R1/{{ date }} = """
                pre & da => forecast_extended => post
            {% if loop.index > 1 %}
                forecast_extended[-{{ base_interval }}] => forecast_extended
            {% endif %}
            """
        {% endfor %}


        # teardown
        R1/$ = """
            post[-{{ base_interval }}] => teardown
        """

This allows you to express the cycling the way you want to. However, there is a performance impact because we are pre-generating the cycles upfront, whereas Cylc would normally generate them on demand as the workflow runs.

As long as the number of cycles is relatively small (<~200) this pattern is fine. But if you want daily cycling over several years, you might find that Cylc becomes a bit sluggish.

This is a reasonable pattern for implementing short sequences, e.g. those warmup cycles (as the number of these is quite small), but for long-running sequences, you’re best off using plain Cylc solutions.
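
For a long sequence, one plain-Cylc option is to work out the recurrence limit from the start and stop dates yourself and emit a Format 3 recurrence (limit / start / increment), so that Cylc still generates the cycles on demand. Here is a rough sketch using only the Python standard library. The names count_cycles and to_format3 are hypothetical, and it assumes UTC and whole-hour increments:

```python
from datetime import datetime, timedelta


def count_cycles(start, stop, step, stop_inclusive=True):
    """Count the points start, start + step, ... that fall before stop.

    With stop_inclusive=True a point landing exactly on stop is counted.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    if stop < start:
        return 0
    whole_steps, remainder = divmod(stop - start, step)
    count = whole_steps + 1
    if not stop_inclusive and not remainder:
        count -= 1  # the last point lands exactly on stop, so drop it
    return count


def to_format3(start, stop, step_hours, stop_inclusive=True):
    """Build an R<limit>/<start>/<increment> string (UTC assumed)."""
    limit = count_cycles(
        start, stop, timedelta(hours=step_hours), stop_inclusive
    )
    if limit < 1:
        raise ValueError("no cycles between start and stop")
    return f"R{limit}/{start:%Y%m%dT%H%M}Z/PT{step_hours}H"


# 6-hourly cycles from 2020-01-01T06 up to and including 2020-01-03T00.
print(to_format3(datetime(2020, 1, 1, 6), datetime(2020, 1, 3, 0), 6))
# R8/20200101T0600Z/PT6H
```

count_cycles on its own is also a quick way to check whether a planned sequence stays within the rough limit mentioned above before choosing the pre-generated approach.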


Long-term Solution

ISO8601 is great for many things, but it has several limitations that make some solutions rather awkward or force us to use Jinja2.

There is an alternative format we could use called RRULE. This is much more powerful and supports the start/stop/interval pattern you require. However, the syntax is quite a handful (although some people have attempted to simplify this).

I’ve investigated developing RRULE integration for Cylc in the past with reasonable success. It’s perfectly possible (although it would not be able to support Cylc’s alternative calendars). Maybe it will arrive as an experimental feature one day, but there will be some bridges to cross before we can provide production-ready support for this.

@oliver.sanders, thanks so much for the rapid and detailed reply, these are great examples to see how the more advanced recurrence syntax and Jinja2 can be utilized to achieve a fairly clean / adaptable case study template. Cheers!