Job scripts appearing in wrong suite directory

I’ve got several long-running climate suites using cylc 7.8.1. Occasionally a job script from one suite will appear in another suite’s log directory. For example:

%  ls -l ~/cylc-run/u-bj594/log/job/
total 16
drwxr-x---+ 3 xxxxxxx xxx 4096 Jun 23 17:18 10040101T0000Z
drwxr-x---+ 5 xxxxxxx xxx 4096 Jun 24 08:44 19070701T0000Z
drwxr-x---+ 7 xxxxxxx xxx 4096 Jun 24 11:01 19080101T0000Z
drwxr-x---+ 3 xxxxxxx xxx 4096 Jun 24 11:01 19080701T0000Z

The 10040101T0000Z directory belongs to suite u-bj595, not u-bj594:

% head  cylc-run/u-bj594/log/job/10040101T0000Z/filemove/01/job 
#!/bin/bash -l
#
# ++++ THIS IS A CYLC TASK JOB SCRIPT ++++
# Suite: u-bj595
# Task: filemove.10040101T0000Z

The identical job script also appears at the same time in the u-bj595 directory and the job runs without a problem, so this appears to be simply an extra copy. The suite log files don’t show anything strange.

I only noticed this because rose-prune wasn’t cleaning up these log directories, as their cycle points were out of sequence for the suite.

Any ideas on what extra logging or debugging I should try?

Our site uses ssh communication.

Martin

Hi Martin,

I can’t really see how that could have happened.

  1. Was there any incorrect setting that put the wrong suite name in the environment?
  2. Some weird file system problem?

A quick find/grep for obvious strings (such as the suite name in the job script header) may help. Otherwise, you could also switch on --debug mode for the suite.
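
As a rough illustration of that find/grep idea, here is a minimal Python sketch that compares the "# Suite:" header of each job script with the suite directory it sits in. It assumes the layout shown in the listing above (<suite>/log/job/<cycle>/<task>/<submit>/job) and flat suite names directly under the run directory; the function name find_misplaced_jobs is made up for this example and is not part of cylc or rose.

```python
from itertools import islice
from pathlib import Path


def find_misplaced_jobs(run_root):
    """Yield (job_file, dir_suite, header_suite) for each job script whose
    '# Suite:' header names a different suite than its directory.

    Assumes <run_root>/<suite>/log/job/<cycle>/<task>/<submit>/job and
    flat (non-hierarchical) suite names.
    """
    run_root = Path(run_root).expanduser()
    for job_file in sorted(run_root.glob("*/log/job/*/*/*/job")):
        # Skip symlinked submit directories (e.g. a "latest" alias, if
        # one exists) so the same job script is not reported twice.
        if job_file.parent.is_symlink():
            continue
        dir_suite = job_file.relative_to(run_root).parts[0]
        header_suite = None
        with job_file.open(errors="replace") as handle:
            # The header sits near the top; only read the first 20 lines.
            for line in islice(handle, 20):
                if line.startswith("# Suite:"):
                    header_suite = line.split(":", 1)[1].strip()
                    break
        if header_suite is not None and header_suite != dir_suite:
            yield job_file, dir_suite, header_suite
```

Looping over find_misplaced_jobs("~/cylc-run") and printing each tuple would list every stray job script together with the suite that actually generated it. Running it periodically (say, from cron) might help narrow down when the extra copies appear.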

Matt

Hi Martin,

I’m struggling to see how that could happen too. Does it always happen to the same task in the suite, or does it seem to occur randomly?

Hilary

It’s random and infrequent. It’s not always the same task or the same suite. I had another instance overnight (this time with all suites running cylc 7.8.3), so the rate is perhaps once in 100 cycles (with 5 tasks per cycle, so roughly one job in 500).

I’ll try turning on debug mode.