Long-time Cylc user here, previously at the Met Office. I’m now at a new organisation and working on my first proper Cylc installation (I’ve only ever used it as an end user, or run it on my personal laptop where everything worked out of the box).
I’m running into an issue where the TUI shows ‘loading…’ indefinitely and the GUI only shows workflows after they have completed. I’ve done quite a bit of digging and would appreciate fresh eyes.
Environment
Cylc 8.6.3, installed via conda
Single-server setup (all components on one machine)
Linux
Symptoms
Workflows run correctly and complete successfully
TUI shows the workflow as ‘running’ but displays ‘loading…’ instead of live task status
The uiserver is running (files exist under ~/.cylc/uiserver/) but it does not seem to be able to connect to the scheduler despite the port being open. The False in register_workflow suggests it finds the workflow but cannot establish a live connection.
Is there a known issue with uiserver ↔ scheduler authentication in 8.6.3 on a fresh single-server conda install? Could there be missing uiserver-side certificates that are not generated automatically? Is there any additional configuration which I need to add somewhere?
Happy to provide any additional logs or config output. Thanks in advance! Björn
I haven’t heard of these symptoms before; single-server setups are usually straightforward. However, this is what I would expect if network communication were broken for some reason.
Is there a known issue with uiserver ↔ scheduler authentication in 8.6.3 on a fresh single-server conda install?
No.
Could there be missing uiserver-side certificates that are not generated automatically?
No (HTTPS certs are required for secure Cylc Hub use, but that doesn’t affect workflows).
Is there any additional configuration which I need to add somewhere?
Shouldn’t be.
It might be worth running a command against the workflow using the --debug option, e.g:
If everything is working as expected, we should see:
DEBUG - zmq:send and DEBUG - zmq:recv messages in the output of the cylc ping command.
DEBUG - ALLOWED (CURVE) and DEBUG - ZAP reply messages in the workflow log.
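As a rough aid for checking those messages, here is a minimal Python sketch that reports which of the marker strings listed above occur in some captured text. The function name find_markers, and the idea of feeding it saved command output or a saved log file, are assumptions made for illustration; only the four marker strings come from this thread.

```python
from typing import Iterable

# Marker strings quoted in this thread; nothing else about the log format is assumed.
PING_MARKERS = ("zmq:send", "zmq:recv")
LOG_MARKERS = ("ALLOWED (CURVE)", "ZAP reply")


def find_markers(lines: Iterable[str], markers: Iterable[str]) -> dict[str, bool]:
    """Return {marker: True/False} for whether each marker occurs in any line."""
    found = {marker: False for marker in markers}
    for line in lines:
        for marker in found:
            if marker in line:
                found[marker] = True
    return found
```

For example, with the workflow log saved to a file of your choosing, find_markers(open(path), LOG_MARKERS) shows at a glance which of the expected messages are missing.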
The scheduler port is open and reachable (confirmed with nc)
Note that each Cylc scheduler uses two ports, which are selected at random from the range configured in the global.cylc file:
It might be the case that some ports within the range are ok, but others are blocked.
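One way to look for partially blocked ranges is to try every port in turn. The sketch below is a standard-library approximation, not anything Cylc provides: for each port it binds a listener, connects to it through the given hostname and sends one byte. The names port_round_trip and failing_ports are hypothetical. A port that is already in use will also be reported as failing, so this is best run while no workflows are active.

```python
import socket
from typing import Iterable


def port_round_trip(host: str, port: int, timeout: float = 2.0) -> bool:
    """Bind a listener on `port`, connect to it via `host`, and pass one byte across."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
            server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            server.settimeout(timeout)
            server.bind(("", port))  # all IPv4 interfaces; port 0 lets the OS choose
            server.listen(1)
            actual_port = server.getsockname()[1]
            with socket.create_connection((host, actual_port), timeout=timeout) as client:
                conn, _ = server.accept()
                with conn:
                    conn.settimeout(timeout)
                    client.sendall(b"x")
                    return conn.recv(1) == b"x"
    except OSError:
        # Covers bind failures (port in use or not permitted), refused or
        # timed-out connections, and hostname resolution errors.
        return False


def failing_ports(host: str, ports: Iterable[int]) -> list[int]:
    """Return the ports for which the round trip did not succeed."""
    return [port for port in ports if not port_round_trip(host, port)]
```

Calling failing_ports(socket.gethostname(), range(start, stop)) with the bounds from your own global.cylc lists any ports worth raising with IT. Because client and server are on the same machine, this only exercises host-level rules, not an external firewall.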
~/.cylc/flow/global.cylc is configured with:
If the hostname is in /etc/hosts and the hostname command is returning the expected value, then you shouldn’t need to configure [[host self-identification]].
There’s a chance (albeit a very small one) that this could be causing issues if localhost isn’t in /etc/hosts or is routing in a strange way for some reason.
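To check how names resolve on the server, a small standard-library sketch like the following may help. It prints nothing by itself and only looks up IPv4 addresses, which is a simplification; the function names resolve and report are hypothetical, and the script does not read any Cylc configuration.

```python
import socket


def resolve(name: str) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses `name` resolves to ([] on failure)."""
    try:
        infos = socket.getaddrinfo(
            name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
        )
    except OSError:  # includes socket.gaierror for names that cannot be resolved
        return []
    return sorted({info[4][0] for info in infos})


def report() -> dict[str, list[str]]:
    """Resolve localhost, the short hostname and the fully qualified name."""
    names = ("localhost", socket.gethostname(), socket.getfqdn())
    return {name: resolve(name) for name in names}
```

Running print(report()) should show localhost mapping to 127.0.0.1 and the machine's hostname mapping to an address that belongs to this server; an empty list or an unexpected address would be worth investigating.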
If the above doesn’t help, here’s a simple test which might help to identify any network communication problems.
These two Python files implement a “hello-world” TCP networking example using ZMQ:
server.py
import zmq
import time
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5555")
while True:
    # Wait for next request from client
    message = socket.recv()
    print(f"Received request: {message}")
    # Do some 'work'
    time.sleep(1)
    # Send reply back to client
    socket.send_string("World")
client.py
import zmq
context = zmq.Context()
# Socket to talk to server
print("Connecting to hello world server...")
socket = context.socket(zmq.REQ)
socket.connect("tcp://localhost:5555")
# Do 10 requests, waiting each time for a response
for request in range(10):
    print(f"Sending request {request} ...")
    socket.send_string("Hello")
    # Get the reply.
    message = socket.recv()
    print(f"Received reply {request} [ {message} ]")
Run the test using two terminals, here’s a sample of the expected output:
terminal 1
$ python server.py
Received request: b'Hello'
Received request: b'Hello'
Received request: b'Hello'
terminal 2
$ python client.py
Connecting to hello world server...
Sending request 0 ...
Received reply 0 [ b'World' ]
Sending request 1 ...
Received reply 1 [ b'World' ]
Sending request 2 ...
Received reply 2 [ b'World' ]
Sending request 3 ...
Try changing the hostname and port in client.py to the values Cylc is using (e.g., look at the top of the log of a stopped workflow).
My example workflow is here: ctn05_clock-triggers/flow.cylc. It worked fine when I originally put it together as training material on a personal laptop, so hopefully it should run here as well.
From your suggestion, I understand that I should be able to remove my ~/.cylc/flow/global.cylc file entirely, and then check with our IT department whether the ports 43001–43101 are accessible.
When I run the server.py and client.py test programs, the output suggests that some security policy or a problem with the local ZeroMQ installation might be interfering:
./server.py
import-im6.q16: attempt to perform an operation not allowed by the security policy `PS' @ error/constitute.c/IsCoderAuthorized/426.
import-im6.q16: attempt to perform an operation not allowed by the security policy `PS' @ error/constitute.c/IsCoderAuthorized/426.
./server.py: line 4: syntax error near unexpected token `('
./server.py: line 4: `context = zmq.Context()'
Are there any additional things I should ask my IT department to check, or anything else in my installation that might cause this?
or anything else in my installation that might cause this?
If that hello-world example doesn’t work, it’s definitely a network issue. Once that example works, Cylc should be ok.
./server.py: line 4: syntax error near unexpected token `('
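Make sure you’re calling that script with Python (i.e., python server.py). Run directly as ./server.py, the file is interpreted by the shell rather than by Python, which is why its import lines end up invoking ImageMagick’s import command. You can activate the Cylc environment to pick up the installation of ZMQ it includes.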
import-im6.q16: attempt to perform an operation not allowed by the security policy `PS' @ error/constitute.c/IsCoderAuthorized/426.
I wouldn’t be surprised if security policy is the cause here. Firewalls and blocked ports are a common issue; if you’re using Security-Enhanced Linux (SELinux), that might also be a factor.
Good catch. Calling the scripts as “python server.py” and “python client.py” prints the hello world messages in the two terminals. So that part is working at least.