Proposed removal of scheduler checkpointing capability

hilary.j.oliver · October 14, 2020, 9:33pm

The Cylc scheduler program records and continually updates its state - what tasks are currently active and what state they are in, etc. - in the “run database”. If you stop the run, or the scheduler gets killed (e.g. if the host VM goes down), this is the “latest state” snapshot that we can initialize the scheduler with to do a restart.

Cylc 7 also supports arbitrary named checkpoints via the cylc checkpoint command, so you can restart at predetermined earlier points in the workflow instead of the latest state.
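To make the distinction concrete, here is a minimal sketch of a "latest state" snapshot versus named checkpoints, using an in-memory SQLite database. This is only an illustration, not Cylc's actual run database schema: the table names, column names, and helper functions (`open_db`, `record_state`, `take_checkpoint`, `load_state`) are all hypothetical.

```python
import sqlite3

# Hypothetical convention: checkpoint id 0 always holds the latest state.
LATEST = 0


def open_db(path=":memory:"):
    """Create a toy run database (illustrative schema, not Cylc's)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE checkpoints (id INTEGER PRIMARY KEY, label TEXT)"
    )
    conn.execute(
        "CREATE TABLE task_states ("
        " checkpoint_id INTEGER, name TEXT, cycle TEXT, status TEXT,"
        " PRIMARY KEY (checkpoint_id, name, cycle))"
    )
    return conn


def record_state(conn, name, cycle, status):
    """Continually overwrite the latest-state snapshot as tasks change."""
    conn.execute(
        "INSERT OR REPLACE INTO task_states VALUES (?, ?, ?, ?)",
        (LATEST, name, cycle, status),
    )


def take_checkpoint(conn, label):
    """Copy the latest state into a new, named checkpoint; return its id."""
    cur = conn.execute("INSERT INTO checkpoints (label) VALUES (?)", (label,))
    cp_id = cur.lastrowid
    conn.execute(
        "INSERT INTO task_states"
        " SELECT ?, name, cycle, status FROM task_states"
        " WHERE checkpoint_id = ?",
        (cp_id, LATEST),
    )
    return cp_id


def load_state(conn, checkpoint_id=LATEST):
    """Return the task states to restart from (latest state by default)."""
    rows = conn.execute(
        "SELECT name, cycle, status FROM task_states WHERE checkpoint_id = ?",
        (checkpoint_id,),
    )
    return {(name, cycle): status for name, cycle, status in rows}


conn = open_db()
record_state(conn, "foo", "2020", "running")
cp = take_checkpoint(conn, "before-bar")
record_state(conn, "foo", "2020", "succeeded")
record_state(conn, "bar", "2020", "running")

latest = load_state(conn)        # what a normal restart would use
earlier = load_state(conn, cp)   # what a restart from the named checkpoint would use
```

In this sketch a normal restart reads the rows stored under `LATEST`, while a restart from a named checkpoint reads the frozen copy made when the checkpoint was taken.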

However, we are proposing to remove named checkpointing from Cylc 8. As far as we know it is not widely used, and Cylc 8 has something we call “reflow” that should be much better (you can trigger a new “flow” at any point in the graph, with no DB checkpoint needed).

So: if you rely on checkpointing in Cylc 7, please let us know (on this forum) how you use it, so that we can be sure Cylc 8 will support your needs without checkpointing.

Regards,
Hilary

russb · October 15, 2020, 3:48pm

To confirm, this is just for the named checkpoints? Not the automatic checkpointing/recovery capability? We very much use the automatic checkpointing for recovery but not the named checkpoints.

oliver.sanders · October 15, 2020, 4:15pm

Yes, this is just for named checkpoints created with the cylc checkpoint command. The database would continue to preserve the current state of the workflow, allowing for restarts and crash recovery.

hilary.j.oliver · October 22, 2020, 9:56pm

(OK great, as pretty much expected no one has chimed in in favor of named checkpoints … we’ll update you on the decision in due course.)
