
Cylc 8 architecture security model and design decisions

Jacinta Richardson edited this page Feb 21, 2020 · 7 revisions

Cylc 8 architecture

The Cylc 8 architecture involves the following components:

  • Proxy
  • Hub
  • UI Server
  • Workflow hosts
  • Job hosts
  • ZeroMQ

Proxy

A configurable HTTP proxy that provides access to the UI Servers.

Hub

Currently an unmodified Jupyter Hub, the hub exists for the following purposes:

  • Authenticating users and identifying their roles/permissions
  • Re-authenticating users where applicable
  • Spawning UI servers belonging to specific users

UI Server

A custom UI server, inspired by Jupyter Notebook, that runs with the permissions of a regular system user. It provides the HTML+ web UI to the user's workflows. UI Servers may be located on the same host as the Hub or on other hosts. One UI Server exists per user. The UI Server:

  • Lists workflows
  • Allows interaction with workflows owned by the UI Server's owner (stop, start, hold, edit triggers, etc.). Both the UI Server owner and anyone authenticated with a role that permits the interaction may perform it.
  • Provides access to workflow logs
  • Provides 'rose edit' functionality, to allow editing of workflow parameters.

Workflow host

Host and file system where the workflow files have been installed, and where cylc runs the workflows. A UI Server may have workflows across multiple hosts, but each workflow is only on one host.

A workflow is the same as a "cylc suite" and performs as defined in the workflow's suite.rc.

Job host

Host and file system where a workflow's jobs run. A workflow may have jobs across multiple hosts, including background jobs that run on the workflow host itself.

ZeroMQ

ZeroMQ provides reliable communication between a workflow and its jobs, and between a workflow and its UI Server. Because messages are queued, they are robust against network hiccoughs.
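As an illustration only, the kind of exchange described above might look like the following sketch using the pyzmq bindings. The socket pattern (request-reply), the endpoint, the message format, and the function names serve_once and send_command are all assumptions made for this example; this page does not say which ZeroMQ patterns Cylc 8 actually uses.

```python
import threading

import zmq  # pyzmq; assumed to be installed


def serve_once(context, endpoint, ready):
    """Hypothetical workflow side: answer one request, then exit."""
    sock = context.socket(zmq.REP)
    sock.bind(endpoint)
    ready.set()  # tell the caller the endpoint is bound
    message = sock.recv_json()
    sock.send_json({"ack": message["command"]})
    sock.close()


def send_command(context, endpoint, command, timeout_ms=2000):
    """Hypothetical UI Server side: send one command and wait for a reply.

    Returns the decoded reply, or None if nothing arrives in time.
    """
    sock = context.socket(zmq.REQ)
    sock.setsockopt(zmq.LINGER, 0)  # do not block on close if unanswered
    sock.connect(endpoint)
    sock.send_json({"command": command})
    # poll() returns a non-zero event mask when a reply is ready to read
    reply = sock.recv_json() if sock.poll(timeout_ms, zmq.POLLIN) else None
    sock.close()
    return reply
```

The timeout on the requesting side is a design choice for the sketch: ZeroMQ queues outgoing messages and reconnects by itself, so the caller only needs to decide how long it is willing to wait for an answer.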

Architectural considerations

Two primary principles lie behind the decisions made in this architecture:

  1. Workflows have to be able to run, and submit their tasks.
  2. Users have to be able to find, start, stop, and edit all of the workflows they have permission to interact with from a single location.

In every case, tried-and-proven technologies have been preferred over custom work, and non-privileged actions have been preferred over privileged actions. Intra-workflow permissions rely on UNIX file system permissions: the UI Server acts on a workflow as its user, workflows run only as their user, and jobs run only as their user. Only files which have the execute bit set for the user can be executed, only files and directories which have the write bit set can be written to, and so forth. Inter-workflow permissions rely on authentication at the hub and authorization at the UI Server.
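To make the file-permission rule concrete, here is a minimal sketch that reads the owner's permission bits for a path using only the Python standard library. The function name owner_permissions is hypothetical and is not part of Cylc; it only illustrates the UNIX mode bits that the operating system already enforces.

```python
import os
import stat


def owner_permissions(path):
    """Return the owner's read/write/execute bits for `path`.

    Illustrative helper only: the kernel enforces these bits itself,
    so nothing in the architecture needs to call code like this.
    """
    mode = os.stat(path).st_mode
    return {
        "read": bool(mode & stat.S_IRUSR),
        "write": bool(mode & stat.S_IWUSR),
        "execute": bool(mode & stat.S_IXUSR),
    }
```

For example, a job script with mode 0640 reports execute as False for its owner, so the workflow's user cannot run it directly until the execute bit is set.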

Component security

User connection to proxy/hub

By default

Hub

Because the Hub is an unmodified Jupyter Hub, the Jupyter Hub Security Overview is generally relevant.

Authentication is performed by a Jupyter Hub authentication plugin that connects to the organisation's host or site identity management, e.g. PAM, LDAP, OAuth (GitHub and Google accounts), etc. See Jupyter's Authenticators page for more detail.

With one partial exception, different users' UI Servers are fully independent of each other and cannot share HTML fragments or code. Unlike Jupyter notebooks, the HTML from a UI Server is not generated by users, and all user input displayed by the UI Server (such as workflow and task names) is HTML-escaped before display.
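As a small illustration of the escaping described above, the sketch below uses html.escape from the Python standard library. The function render_task_label and the markup it emits are hypothetical; the actual UI Server may escape user-supplied names by other means.

```python
from html import escape


def render_task_label(workflow_name, task_name):
    """Build an HTML fragment from user-supplied names (hypothetical markup).

    Both names are escaped, so any markup they contain is shown as
    literal text rather than interpreted by the browser.
    """
    return '<span class="task">{}/{}</span>'.format(
        escape(workflow_name), escape(task_name)
    )
```

With this, a task named <b>x</b> is displayed literally, because the fragment contains &lt;b&gt;x&lt;/b&gt; instead of a live tag.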

The partial exception

Questions:

  • how are the workflow files actually deployed onto the workflow server?
  • if a workflow is started manually, but in a way equivalent to how the UI Server starts it, does the "contact" file have to be registered with the UI Server in some way, or will the UI Server just scan the equivalent of ~/cylc-run/*/ looking for contact files? (Is this how it will find stopped suites? Can we therefore just delete/move old ones when we don't want those suites to show up as existing and stopped?)
  • how are command-line level interactions managed?