I have two ostensibly comparable workflows, one at Cylc 7 and one at Cylc 8.
In one task they use the same conda environment on the same host to run a simple Python script that generates some regridder files, but I get the following at Cylc 7…
[snip]
File "src/netCDF4/_netCDF4.pyx", line 2464, in netCDF4._netCDF4.Dataset.__init__
File "src/netCDF4/_netCDF4.pyx", line 2027, in netCDF4._netCDF4._ensure_nc_success
PermissionError: [Errno 13] Permission denied: '/home/williamsjh/cylc-run/u-cy032/share/data/regridders/regridder_face_to_a.nc'
[FAIL] generate_regridders.sh <<'__STDIN__'
The Cylc task simply runs a script called generate_regridders.sh.
I am able to run this fine from the command line using Python, and even from the shell script itself if run manually, so I'm out of ideas as to how to get past this very odd error!
I’m not sure that I fully have my head round what’s going on, but my next port of call might be to add python --version and printenv to my script and compare the environment and the Python version.
I’d also double-check the permissions on the file in question: does the script write that file however you run it, or does it write a different file in your manual tests?
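To make that comparison concrete, here is a minimal sketch of the kind of diagnostic you could drop into the top of generate_regridders.py before the environments are compared (the choice of variable prefixes to print is just an example):

```python
# Print enough interpreter/environment state to diff between a manual
# run and a Cylc-submitted run of the same script.
import os
import sys

def report_environment():
    print("python executable:", sys.executable)
    print("python version:", sys.version.split()[0])
    # Variables most likely to differ between an interactive shell and a
    # batch job (CONDA_* shows which conda environment was activated).
    for key in sorted(os.environ):
        if key.startswith(("CONDA", "PATH", "PYTHON")):
            print(f"{key}={os.environ[key]}")

if __name__ == "__main__":
    report_environment()
```

Redirecting this output to a file from both run methods and diffing the two should show quickly whether the task job really sees the same Python and environment.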
I'm using a pre-script to activate a conda environment, and these are the scripts…
> cat app/generate_regridders/bin/generate_regridders.sh
#!/usr/bin/bash -l
set -eu
# Run script to generate regridders
generate_regridders.py --um-resolution="$UM_RESOLUTION" \
    --lf-mesh="$LFRIC_MESH_FILE" \
    --regridder-dir="$ROSE_DATA/regridders"
> head generate_regridders.py
#!/usr/bin/env python
"""Script to deal with saving and loading regridders."""
import argparse
import itertools
from pathlib import Path
import warnings
import iris
.
.
.
I have actually just fixed this: it turns out it was because I was submitting it to our heavy compute host rather than the ‘ancillary’ nodes, which do data processing and related work. I'd be lying if I said I understood why, since they share a file system. I'll take this up with @hilary.j.oliver locally at some point. To be honest I really should have been trying the ancillary nodes to begin with, but it would be good to understand this for future reference!
Whilst I don’t know the details of your setup, this is likely to be set by a combination of task settings at Cylc 7 ([remote]host and [job]batch system), whereas at Cylc 8 a single setting (platform) does it all[1]. If you are running the same workflow without changes, then Cylc 8 will be trying to work out the platform based on your Cylc 7 settings.
It may be worth having a look at cylc config --platforms to see what your site settings are.
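For reference, a Cylc 8 site-level platform definition in global.cylc looks something like the following (the platform name, hosts, and job runner here are made up, not your site's actual settings):

```
[platforms]
    [[ancil_nodes]]
        hosts = ancil01, ancil02
        job runner = pbs
```

Tasks that set platform = ancil_nodes would then submit to one of those hosts via that job runner.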
If you can afford the time and energy of maintaining different workflows[2], use the platform setting.
[1] Passes responsibility to the person setting up your site config.
[2] You could use Jinja2 to switch the config:
#!jinja2
{% from 'cylc.flow' import LOG %}
{% if CYLC_VERSION[0] == "8" %}
{% do LOG.info("This is Cylc 8") %}
{% else %}
{% do LOG.info("This is Cylc 7") %}
{% endif %}
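The same switch can select the actual task settings rather than just logging; a sketch with hypothetical platform and host names:

```
#!jinja2
[runtime]
    [[generate_regridders]]
{% if CYLC_VERSION[0] == "8" %}
        platform = ancil_nodes
{% else %}
        [[[remote]]]
            host = ancil01
{% endif %}
```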
Whilst I don’t know your setup, sometimes disks are mounted on different nodes but only as read-only. For example, on our HPC, if you submit to a compute node and aprun a command, it cannot write to the NFS HOME area, because HOME is mounted as a read-only filesystem in that case. However, if you submit to other nodes (or have not yet aprun’d the command), you can write to the NFS HOME area.
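A quick way to probe this from a job on the suspect node is to try actually creating a file in the target directory, since permission bits alone can be misleading for read-only network mounts; a minimal sketch:

```python
import tempfile

def is_writable(directory):
    """Return True if this node can really create files in `directory`.

    Creating (and immediately removing) a scratch file catches read-only
    mounts that the permission bits alone would not reveal.
    """
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            pass  # file is created on entry and deleted on exit
        return True
    except OSError:
        return False
```

Running this once from a job on the compute host and once on the ancillary nodes, against the regridder output directory, would show whether the mount really differs between them.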
Thanks @wxtim, I've had a look at the platform info and it doesn't show anything revealing. I'm beginning to think it's somehow related to what @TomC discusses in his later comment (thanks Tom!).
Thankfully this isn't a big issue, because I can just run this on the working platform!