Best practice when using conda in workflow, and a mystery

gturek · October 28, 2024, 10:19am

Kia ora! I’ve run into a pesky problem when running the following workflow:

#!Jinja2
{% set CYCLE_START_DATE = '20240801' %}
{% set CYCLE_END_DATE = '20240802' %}
[scheduling]
    initial cycle point = {{CYCLE_START_DATE}}
    final cycle point = {{CYCLE_END_DATE}}
    [[graph]]
        P1D = """
            a
        """
[runtime]
    [[a]]
        platform = belenos
        script = """
        set -x
        export MODULEPATH=/home/ext/mr/spsy/mercator/modulefiles:$MODULEPATH
        module -s purge
        module -s load gcc/9.2.0 intel/2018.5.274 intelmpi/2018.5.274 phdf5/1.8.18  netcdf_par/4.7.1_V2 xios-trunk_rev2134
        conda deactivate
        conda activate glo12_ease
        env | grep -i PYTHON
        mpiexec.hydra -n 128 -ppn '32(x4)' noobs_mpi -f /home/ext/mr/smer/turekg/cylc-run/glo12_cylc/run5/work/20201001T0000Z/obsopr/paraminput/FREE/ -cycle 20201003 -L 3  -OO 1 --typ SLA SST SIC VPT VPS -nxios 42
        """

I was getting a “no module found” error because it appeared to be looking for the module in the cylc env rather than the glo12_ease one. We realized that the cylc environment in fact appears before the glo12_ease one in the path order.
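A quick way to confirm this kind of ordering problem from inside the task script is to ask which of the two environment directories comes first on PATH. The following is only a sketch: `first_on_path` is a hypothetical helper, and the two directories in the usage line are placeholders, not the real install locations.

```shell
# Hypothetical helper: print whichever of two directories appears first in a
# colon-separated PATH string (defaults to the current $PATH).
# Returns 1 if neither directory is present.
first_on_path() {
    dir_a=$1
    dir_b=$2
    path_string=${3:-$PATH}
    old_ifs=$IFS
    IFS=:
    for entry in $path_string; do
        if [ "$entry" = "$dir_a" ] || [ "$entry" = "$dir_b" ]; then
            IFS=$old_ifs
            printf '%s\n' "$entry"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Placeholder paths (assumptions): substitute the real env bin directories.
first_on_path /path/to/envs/cylc/bin /path/to/envs/glo12_ease/bin \
    || echo "neither directory is on PATH"
```

Running `command -v python` right after `conda activate glo12_ease` gives the same information from the other direction: it shows which interpreter the task will actually pick up.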

Adding the deactivate before the activate solved the issue, but:

1) Other scripts also use the glo12_ease env, but we have not experienced this issue with them.
2) My colleague does not have this issue. In her case the path order of the environments is correct. (She is using cylc-flow 3.8.3, me 3.8.4. We are both going to update to 3.8.5 just in case.)
3) Both of us initialize conda in our ~/.bashrc.

In general, what is the “best practice” when dealing with conda envs within workflows (given that cylc itself lives in its own conda environment)?

oliver.sanders · October 29, 2024, 9:35am

Hi,

In general, what is the “best practice” when dealing with conda envs within workflows (given that cylc itself lives in its own conda environment)?

Cylc is usually deployed in a Conda environment; however, this will not have any impact on you or your workflows, because our recommended setup does not activate the Cylc environment when running the Cylc command (more info).

So you can use Conda environments in Cylc workflows just as you would use them normally. No special treatment is required. E.g.:

[runtime]
  [[foo]]
    script = """
      conda activate foo
      foo
    """
  [[bar]]
    script = """
      conda run -n bar bar
    """

One caveat: conda activate commands can sometimes fail (inside or outside of Cylc) due to shell activation scripts. If this happens to you, see this troubleshooting entry.

My colleague does not have this issue. In her case the path order of the environments is correct. (She is using cylc-flow 3.8.3, me 3.8.4. We are both going to update to 3.8.5 just in case)

This issue will not be solved by newer versions of Cylc. I suspect there is something different in how your user accounts are set up. Are any environments being activated in shell profile files (e.g. .bashrc, .bash_profile, .profile)?

Both of us initialize conda in our ~/.bashrc

Worth making sure that Conda isn’t auto-activating the base environment.
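One way to check this is to look at the variables that `conda activate` sets, before the task script activates anything itself. This is a minimal sketch; `report_active_conda_env` is a hypothetical helper name, and it only reads `CONDA_DEFAULT_ENV` and `CONDA_SHLVL`.

```shell
# Report whether a conda environment is already active in this shell.
# CONDA_DEFAULT_ENV and CONDA_SHLVL are set by `conda activate`.
report_active_conda_env() {
    if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
        printf 'active conda env: %s (CONDA_SHLVL=%s)\n' \
            "$CONDA_DEFAULT_ENV" "${CONDA_SHLVL:-unset}"
    else
        printf 'no conda env active\n'
    fi
}

report_active_conda_env
```

If this reports `base` at the very start of a task script, the shell profile is most likely auto-activating it. The usual way to switch that off is `conda config --set auto_activate_base false`, although the setting name may differ between conda versions, so check the documentation for the one you have installed.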

gturek · October 29, 2024, 10:02am

Yep, it was the auto activate of base. I was going to try that hypothesis next myself.
As always a big thanx

oliver.sanders · October 29, 2024, 10:04am

That one hits a lot of people.

I’ve switched to micromamba, an alternative Conda client implementation that does not require a base environment, partly because of this sort of thing.
