Unclear on how Cylc 8 components work together

I’m trying to understand how a Cylc 8 distributed deployment would look, with monitoring workstations being outside of the platform networks and how a JupyterHub, if we need one, fits into the picture.

Currently all our platforms could be considered external to our monitoring workstations, and although we have direct access via ssh to the platforms, we have to use functional accounts that we share and typically sudo into. We are able to set up passwordless SSH via authorized_keys, but I don’t see any mechanism in the Cylc 8 platform config to support an alternate user for authentication.

I see that JupyterHub, if so configured, allows non-owner users to manage other users’ workflows, but where does the hub need to run? On the same network as the workflows? One of our clusters has an explicit “no web services” rule, so it can’t run there and we’d need to set it up elsewhere. Could it then communicate via SSH, or would we need to open the workflow port range in the cluster firewall? I’m not sure I’m expressing the issue properly…

For multi-user installations:

  • Users authenticate with JupyterHub, via the appropriate authentication plugin
  • JupyterHub, as the only privileged part of the Cylc system, must be able to spawn Cylc UI Servers on target back end user accounts. There are various “spawners” available for spawning local and remote servers by various means. Custom spawners might be needed in some cases.
  • The hub passes user credentials to the UI Server, which handles authorization - i.e., is the authenticated user allowed to perform a requested action on the account owner’s workflows? If so, the authenticated user will be logged with the action, for traceability.
  • Authorization is configured at site and user level - users can delegate authority for workflow actions to other users and groups, within the bounds set at site level
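To make the site/user split a bit more concrete, here is a rough sketch of what the two levels can look like in the UI Server's jupyter_config.py files. It follows my reading of the cylc-uiserver authorization settings (site_authorization and user_authorization), so check it against the docs for your installed version. The account names func_account and alice and the group ops are made up for illustration.

```python
# Site-level config (maintained by the site admin).
# ASSUMPTION: "func_account", "alice" and "ops" are hypothetical names.
# `c` is the config object Jupyter provides when it loads this file.
c.CylcUIServer.site_authorization = {
    "*": {  # for every UI Server owner
        "*": {  # any authenticated user
            "default": "READ",  # read-only unless the owner grants more
        },
    },
    "func_account": {  # for the shared functional account's UI Server
        "*": {
            "default": "READ",
            "limit": ["READ", "CONTROL"],  # the most the owner may grant
        },
    },
}

# User-level config (in the owner's ~/.cylc/uiserver/jupyter_config.py).
# Grants are only effective within the "limit" set at site level.
c.CylcUIServer.user_authorization = {
    "*": ["READ"],
    "group:ops": ["READ", "CONTROL"],
    "alice": ["READ", "CONTROL"],
}
```

The idea is that the site file sets defaults and upper bounds, and each account owner then decides who gets what inside those bounds.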

Initially, you should be able to start (as we have) with no hub: like Cylc 7 everything (i.e., schedulers and UI Servers) runs as the user, and users start their own UI Server with the cylc gui command (instead of spawning them via the hub).

What you don’t get without the hub is the authorization, to see and interact with other users’ workflows. But you still have your direct access to the functional accounts via sudo, of course.

The hub can run anywhere, so long as it has the right access to the back end, and the right spawner, to be able to spawn UI Servers and proxy network traffic.

You should be able to use ssh port forwarding. I’m not quite sure what implications there are, if any, for the hub, but for the GUI it is easy enough (copied from our local docs):

First open an ssh tunnel, so that a given port on your local machine (e.g. your laptop) maps to the Cylc UI Server’s port on the HPC. On your local machine, type:

$ ssh -N -L PORT:localhost:PORT HOST

where PORT is a valid port number and HOST is a host on the HPC. You will need to know the range of allowed ports (e.g. 1024-49151). Choose any number in this range, but pick one that other users are unlikely to choose, to avoid clashes. (Note that the -N option opens the connection without starting a remote shell.)
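If you would rather not pick a number by hand, one option is to derive a port from your username so it stays the same between sessions and is unlikely to collide with a colleague's. This is only a sketch: the function names are mine, the range is the example range quoted above, and the "free" check only looks at your local machine, so the port could still be taken on the HPC side.

```python
import getpass
import socket
import zlib

# Example range from the text above; substitute your site's allowed range.
PORT_MIN, PORT_MAX = 1024, 49151


def candidate_port(username: str, offset: int = 0) -> int:
    """Map a username to a stable port inside [PORT_MIN, PORT_MAX]."""
    span = PORT_MAX - PORT_MIN + 1
    # crc32 is stable across runs, unlike the built-in hash() for strings.
    return PORT_MIN + (zlib.crc32(username.encode("utf-8")) + offset) % span


def is_free_locally(port: int) -> bool:
    """Return True if 127.0.0.1:port can be bound on this machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind(("127.0.0.1", port))
        except OSError:
            return False
    return True


def pick_port(username: str, attempts: int = 20) -> int:
    """Try the username-derived port, then its neighbours, until one is free."""
    for offset in range(attempts):
        port = candidate_port(username, offset)
        if is_free_locally(port):
            return port
    raise RuntimeError("no free port found; try a different range")


if __name__ == "__main__":
    port = pick_port(getpass.getuser())
    # HOST is left as a placeholder, as in the instructions above.
    print(f"ssh -N -L {port}:localhost:{port} HOST")
```

Running it prints the ssh command with the chosen port filled in; use the same number for c.ServerApp.port on the HPC.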

Then ssh to the host:

$ ssh HOST

and add the following to $HOME/.cylc/uiserver/jupyter_config.py on the HOST.

c.ServerApp.open_browser=False
c.ServerApp.port=PORT

where PORT and HOST match the values you selected when opening the ssh tunnel.

You’re now ready to fire up the web graphical interface:

$ cylc gui

Just copy the URL that looks like

http://127.0.0.1:PORT/cylc?token=TOKEN

into your web browser. (PORT should match the value chosen above.)
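If the page does not load, a quick first check is whether anything is listening on your local end of the tunnel. A minimal sketch (the function name is mine, not part of Cylc): it only confirms that 127.0.0.1:PORT accepts a TCP connection, which tells you the ssh tunnel process is running, not that the UI Server on the HPC is up behind it.

```python
import socket


def tunnel_is_up(port: int, host: str = "127.0.0.1", timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
```

For example, tunnel_is_up(PORT) returning False usually means the ssh -N -L command has exited or was started with a different port.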

Ok, I think the OpenSSH forwarding will do the trick to hold us over until we can learn more about JupyterHub. I was able to get it working as you described, thanks!

This has become more urgent as newer distributions are finally dropping Python 2 support and PyGTK as a precompiled package is becoming scarce.
