Proposed removal of scheduler checkpointing capability

The Cylc scheduler program records and continually updates its state - what tasks are currently active and what state they are in, etc. - in the “run database”. If you stop the run, or the scheduler gets killed (e.g. if the host VM goes down), this is the “latest state” snapshot that we can initialize the scheduler with to do a restart.

Cylc 7 also supports arbitrary named checkpoints via the cylc checkpoint command, so you can restart at predetermined earlier points in the workflow instead of the latest state.
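
To make the distinction concrete, here is a toy sketch of "latest state" versus named checkpoints. It is only an illustrative model: the table names, column names, and helper functions (`open_db`, `record`, `checkpoint`, `load`) are invented for this example and are not Cylc's actual run-database schema or API.

```python
import sqlite3

# Toy model only: table and column names are invented for illustration
# and do not reflect Cylc's real run database.
LATEST = 0  # checkpoint id 0 holds the continually updated "latest state"


def open_db():
    """Create an in-memory database with the two toy tables."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE task_states (checkpoint_id INTEGER, name TEXT, status TEXT)"
    )
    db.execute("CREATE TABLE checkpoints (id INTEGER PRIMARY KEY, label TEXT)")
    return db


def record(db, name, status):
    """Overwrite a task's entry in the latest-state snapshot."""
    db.execute(
        "DELETE FROM task_states WHERE checkpoint_id = ? AND name = ?",
        (LATEST, name),
    )
    db.execute("INSERT INTO task_states VALUES (?, ?, ?)", (LATEST, name, status))


def checkpoint(db, label):
    """Copy the latest state under a new named checkpoint and return its id."""
    cp_id = db.execute(
        "INSERT INTO checkpoints (label) VALUES (?)", (label,)
    ).lastrowid
    db.execute(
        "INSERT INTO task_states "
        "SELECT ?, name, status FROM task_states WHERE checkpoint_id = ?",
        (cp_id, LATEST),
    )
    return cp_id


def load(db, checkpoint_id=LATEST):
    """Return {task name: status} for the latest state or a named checkpoint."""
    rows = db.execute(
        "SELECT name, status FROM task_states WHERE checkpoint_id = ?",
        (checkpoint_id,),
    )
    return dict(rows.fetchall())
```

In this model, a normal restart corresponds to `load(db)`, while restarting from a named checkpoint corresponds to `load(db, cp_id)` with an id returned earlier by `checkpoint`.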

However, we are proposing to remove named checkpointing from Cylc 8. As far as we know it is not widely used, and Cylc 8 has something we call “reflow” that should be much better (you can trigger a new “flow” at any point in the graph, with no DB checkpoint needed).

So: if you rely on checkpointing in Cylc 7, please let us know (on this forum) how you use it, so that we can be sure Cylc 8 will support your needs without checkpointing.

Regards,
Hilary

To confirm, this is just for the named checkpoints? Not the checkpointing automatic/recovery capability? We very much use the automatic checkpointing for recovery but not the named checkpoints.

Yes, this is just for named checkpoints created with the cylc checkpoint command. The database would continue to preserve the current state of the workflow, allowing for restarts and crash recovery.


(OK great, as pretty much expected, no one has chimed in in favor of named checkpoints … we’ll update you on the decision in due course.)