I’m running the Cylc hub and all user UI servers on one VM, with the actual Cylc workflows running on other VMs.
The hub VM is x86_64 with 8 CPUs, 32 GB of memory and 8 GB of swap. It runs the hub, sudospawners and hubapps as user processes, as well as fairly normal OS stuff (including auditd, kauditd, fapolicyd).
We keep running out of swap and the VM system administrators are blaming python/Cylc. This is not entirely unreasonable: on equivalent servers where I’m running only the hub and have no users, we never run out of swap. Here’s a snapshot from top on the hub VM.
Tasks: 328 total, 3 running, 325 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.6 us, 0.9 sy, 0.0 ni, 96.8 id, 0.5 wa, 0.2 hi, 0.2 si, 0.0 st
MiB Mem : 31928.6 total, 282.6 free, 30663.4 used, 982.6 buff/cache
MiB Swap: 8192.0 total, 413.4 free, 7778.6 used. 829.0 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15531 user1 20 0 10.5g 4.1g 8888 S 3.6 13.1 230:01.01 python
23034 user2 20 0 6192940 4.4g 7296 S 3.0 14.0 53:24.01 python
1430 root 16 -4 75128 1060 772 D 2.6 0.0 110:40.08 auditd
2721 user3 20 0 11.4g 6.3g 0 S 1.7 20.3 86:52.76 python
3451 user4 20 0 2777116 253700 0 S 1.3 0.8 53:02.96 python
26112 user5 20 0 1792924 129116 0 S 1.3 0.4 21:53.31 python
1472 fapolic+ 6 -14 312252 41004 9128 S 0.7 0.1 33:40.86 fapolicyd
2807 user6 20 0 2872276 444260 0 S 0.7 1.4 55:30.52 python
11427 user7 20 0 1863620 189160 2152 S 0.7 0.6 17:25.85 python
111515 user8 20 0 5940704 1.5g 10784 S 0.7 4.8 45:05.81 python
69 root 20 0 0 0 0 R 0.3 0.0 13:47.13 kauditd
...
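To put a number on how much of that memory the per-user python processes hold, here is a minimal Python sketch that totals the RES column for the rows visible above. It assumes top's default scaling (a bare number is KiB, and the m/g/t suffixes are MiB/GiB/TiB) and it only sees the rows pasted here, not the ones cut off by the "...". The names `TOP_ROWS`, `res_to_kib` and `python_res_by_user` are made up for this illustration.

```python
# Process rows copied from the top snapshot above (truncated list, so partial).
TOP_ROWS = """\
15531 user1 20 0 10.5g 4.1g 8888 S 3.6 13.1 230:01.01 python
23034 user2 20 0 6192940 4.4g 7296 S 3.0 14.0 53:24.01 python
1430 root 16 -4 75128 1060 772 D 2.6 0.0 110:40.08 auditd
2721 user3 20 0 11.4g 6.3g 0 S 1.7 20.3 86:52.76 python
3451 user4 20 0 2777116 253700 0 S 1.3 0.8 53:02.96 python
26112 user5 20 0 1792924 129116 0 S 1.3 0.4 21:53.31 python
1472 fapolic+ 6 -14 312252 41004 9128 S 0.7 0.1 33:40.86 fapolicyd
2807 user6 20 0 2872276 444260 0 S 0.7 1.4 55:30.52 python
11427 user7 20 0 1863620 189160 2152 S 0.7 0.6 17:25.85 python
111515 user8 20 0 5940704 1.5g 10784 S 0.7 4.8 45:05.81 python
69 root 20 0 0 0 0 R 0.3 0.0 13:47.13 kauditd
"""

# Assumption: top's default memory scaling, where suffixes are binary units.
UNIT_KIB = {"m": 1024, "g": 1024**2, "t": 1024**3}


def res_to_kib(field):
    """Convert a top memory field such as '4.1g' or '253700' to KiB."""
    suffix = field[-1].lower()
    if suffix in UNIT_KIB:
        return float(field[:-1]) * UNIT_KIB[suffix]
    return float(field)


def python_res_by_user(top_text):
    """Sum resident memory (KiB) of processes named 'python', keyed by user."""
    usage = {}
    for line in top_text.splitlines():
        cols = line.split()
        # A process row has 12 columns and starts with a numeric PID.
        if len(cols) < 12 or not cols[0].isdigit():
            continue
        if cols[11] != "python":
            continue
        user = cols[1]
        usage[user] = usage.get(user, 0.0) + res_to_kib(cols[5])
    return usage


usage = python_res_by_user(TOP_ROWS)
total_gib = sum(usage.values()) / 1024**2
for user, kib in sorted(usage.items(), key=lambda item: -item[1]):
    print(f"{user:8s} {kib / 1024**2:6.2f} GiB")
print(f"total    {total_gib:6.2f} GiB across {len(usage)} python processes")
```

For these rows the total comes to roughly 17.3 GiB resident across eight python processes, three of which account for most of it. RES does not count pages that have already been swapped out, so the real footprint of these processes is probably higher.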
Is this to be expected? We had a smaller VM (6 CPUs, 20 GB memory, 4 GB swap) where users ran cylc gui, cylc gscan and workflows for Cylc 7, and we never had the same problems with swap, but we also weren’t running a mini webserver for each user. Here’s an example snapshot from top for one of our Cylc 7 hosts.
Tasks: 370 total, 3 running, 366 sleeping, 0 stopped, 1 zombie
%Cpu(s): 24.0 us, 7.1 sy, 0.0 ni, 59.3 id, 9.0 wa, 0.0 hi, 0.6 si, 0.0 st
KiB Mem : 20393736 total, 718976 free, 7174120 used, 12500640 buff/cache
KiB Swap: 4194300 total, 4095228 free, 99072 used. 11847216 avail Mem
Has anyone done any load testing? Given that I would like to host ~70 users on this host, how much memory should I expect to need? If it makes a difference, assume an average of 20 Cylc workflows per user.
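For a sense of scale, here is a naive back-of-the-envelope sketch in Python that scales the resident sizes of the eight python processes visible in the Cylc 8 snapshot up to 70 UI servers. It assumes memory grows linearly with the number of UI servers and that those eight processes are representative, neither of which I have verified, so treat the output as a range to argue about rather than a sizing answer. The names `RES_GIB` and `estimate_total_gib` are invented for this example.

```python
# Resident sizes (GiB) of the eight python processes visible in the snapshot.
# Values shown by top in KiB are converted with 1 GiB = 1024**2 KiB.
RES_GIB = [
    4.1,
    4.4,
    6.3,
    253700 / 1024**2,
    129116 / 1024**2,
    444260 / 1024**2,
    189160 / 1024**2,
    1.5,
]


def estimate_total_gib(n_servers, per_server_gib):
    """Linear estimate: every UI server is assumed to use the same memory."""
    return n_servers * per_server_gib


N_USERS = 70  # target number of users, one UI server each

per_server = {
    "min": min(RES_GIB),
    "mean": sum(RES_GIB) / len(RES_GIB),
    "max": max(RES_GIB),
}
estimates = {
    label: estimate_total_gib(N_USERS, gib) for label, gib in per_server.items()
}

for label, total in estimates.items():
    print(f"{label:4s}: {per_server[label]:5.2f} GiB per server, "
          f"{total:6.1f} GiB for {N_USERS} servers")
```

The spread between the smallest and largest process is about a factor of fifty, so the estimate is dominated by whichever per-server figure turns out to be typical. That is really what I am hoping load testing can tell me.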
On a different host, I expect to have many users (possibly as many as 100) who use the hub but have no workflows of their own, looking at UI servers owned by a smaller set of users (~10-15). That makes up to 115 UI servers, of which only about 15 would need to generate GraphQL etc. The smaller set of users will probably own at most 20 workflows each. I would welcome any suggestions on what level of provisioning I might need for that host.