Best practice when using conda in workflow, and a mystery

Kia ora! I’ve run into a pesky problem when running the following workflow:

#!Jinja2
{% set CYCLE_START_DATE = '20240801' %}
{% set CYCLE_END_DATE = '20240802' %}
[scheduling]
    initial cycle point = {{CYCLE_START_DATE}}
    final cycle point = {{CYCLE_END_DATE}}
    [[graph]]
        P1D = """
            a
        """
[runtime]
    [[a]]
        platform = belenos
        script = """
        set -x
        export MODULEPATH=/home/ext/mr/spsy/mercator/modulefiles:$MODULEPATH
        module -s purge
        module -s load gcc/9.2.0 intel/2018.5.274 intelmpi/2018.5.274 phdf5/1.8.18  netcdf_par/4.7.1_V2 xios-trunk_rev2134
        conda deactivate
        conda activate glo12_ease
        env | grep -i PYTHON
        mpiexec.hydra -n 128 -ppn '32(x4)' noobs_mpi -f /home/ext/mr/smer/turekg/cylc-run/glo12_cylc/run5/work/20201001T0000Z/obsopr/paraminput/FREE/ -cycle 20201003 -L 3  -OO 1 --typ SLA SST SIC VPT VPS -nxios 42
        """

I was getting a "no module found" error because it appeared to be looking for the module in the cylc env rather than the glo12_ease one. We realized that the cylc environment in fact appears before the glo12_ease one in the path.
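
A quick way to see this kind of ordering is to print the search path one entry per line, numbered in lookup order. This is a minimal sketch; the helper name `show_path_order` is made up for illustration and the helper is not part of Cylc or conda:

```shell
# Hypothetical helper: print the entries of a colon-separated path list
# in lookup order, one per line, numbered. Defaults to the current $PATH.
show_path_order() {
    printf '%s\n' "${1:-$PATH}" | tr ':' '\n' | awk '{ printf "%d %s\n", NR, $0 }'
}
```

Calling `show_path_order` in the task script just before and just after `conda activate glo12_ease` shows whether the cylc environment's `bin` directory sits ahead of the glo12_ease one.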

Adding the deactivate before the activate solved the issue, but:

  1. Other scripts also use the glo12_ease env, but we have not experienced this issue with them.
  2. My colleague does not have this issue. In her case the path order of the environments is correct. (She is using cylc-flow 3.8.3, me 3.8.4. We are both going to update to 3.8.5 just in case.)
  3. Both of us initialize conda in our ~/.bashrc.

In general, what is the “best practice” when dealing with conda envs within workflows (given that cylc itself lives in its own conda environment)?

Hi,

In general, what is the “best practice” when dealing with conda envs within workflows (given that cylc itself lives in its own conda environment)?

Cylc is usually deployed in a Conda environment. However, this will not have any impact on you or your workflows, because our recommended setup does not activate the Cylc environment when running the Cylc command (more info).

So you can use Conda environments in Cylc workflows just as you would use them normally. No special treatment is required. For example:

[runtime]
  [[foo]]
    script = """
      conda activate foo
      foo
    """
  [[bar]]
    script = """
      conda run -n bar bar
    """

There is one caveat where conda activate commands can sometimes fail (inside or outside of Cylc) due to shell activation scripts. If this happens to you, see this troubleshooting entry.



My colleague does not have this issue. In her case the path order of the environments is correct. (She is using cylc-flow 3.8.3, me 3.8.4. We are both going to update to 3.8.5 just in case.)

This issue will not be solved by newer versions of Cylc. I suspect there is something different in how your user accounts are set up. Are any environments being activated in shell profile files (e.g. .bashrc, .bash_profile, .profile)?

Both of us initialize conda in our ~/.bashrc

Worth making sure that Conda isn’t auto-activating the base environment.
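
For reference, a sketch of how this can be checked and turned off with the standard `conda config` command. It assumes the setting is called `auto_activate_base` in your conda version; if that key is not recognized, `conda config --describe` lists the settings your version supports.

```shell
# Skip quietly on machines where conda is not on the PATH.
if command -v conda >/dev/null 2>&1; then
    # Show the current value of the setting.
    conda config --show auto_activate_base
    # Stop new shells from activating "base" automatically
    # (this writes the setting to the user's ~/.condarc).
    conda config --set auto_activate_base false
fi
```

The change only affects shells started afterwards, so log in again or open a new shell before re-running the task.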


Yep, it was the auto activate of base. I was going to try that hypothesis next myself.
As always a big thanx

That one hits a lot of people.

I’ve switched to micromamba, an alternative Conda client implementation that does not require a base environment, partly because of this sort of thing.
