alan-turing-institute
diff --git a/.github/workflows/docs.yaml b/.github/workflows/docs.yaml
Lines changed: 6 additions & 3 deletions
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
Lines changed: 3 additions & 0 deletions
diff --git a/docs/.gitignore b/docs/.gitignore
Lines changed: 2 additions & 0 deletions
diff --git a/docs/abbreviations.yml b/docs/abbreviations.yml
Lines changed: 4 additions & 0 deletions
diff --git a/docs/architecture/architecture.md b/docs/architecture/architecture.md
Lines changed: 51 additions & 0 deletions
diff --git a/docs/architecture/defence_in_depth.md b/docs/architecture/defence_in_depth.md
diff --git a/docs/architecture/introduction.md b/docs/architecture/introduction.md
Lines changed: 75 additions & 0 deletions
diff --git a/docs/architecture/lifecycle.md b/docs/architecture/lifecycle.md
Lines changed: 20 additions & 0 deletions
diff --git a/docs/architecture/responsibility_and_5_safes.md b/docs/architecture/responsibility_and_5_safes.md
Lines changed: 5 additions & 0 deletions
diff --git a/docs/architecture/roles.md b/docs/architecture/roles.md
Lines changed: 32 additions & 0 deletions
@@ -9,6 +9,9 @@ on:
       - main
   workflow_dispatch:
 
+env:
+  BASE_URL: /${{ github.event.repository.name }}
+
 permissions:
   contents: read
   pages: write
@@ -26,15 +29,15 @@ jobs:
         uses: actions/checkout@v4
       - name: Install dependencies
         working-directory: ./docs
-        run: pip install -r requirements.txt
+        run: npm ci
       - name: Build docs
         working-directory: ./docs
-        run: mkdocs build
+        run: npm run build
       - name: Upload artifact
         if: github.event_name != 'pull_request'
         uses: actions/upload-pages-artifact@v4
         with:
-          path: ./docs/site/
+          path: ./docs/_build/html
           name: github-pages
   deploy:
     needs: build
 
@@ -7,8 +7,11 @@ repos:
       - id: check-yaml
         args: [--allow-multiple-documents]
       - id: end-of-file-fixer
+        exclude_types: [svg]
+        exclude: '\.excalidraw$'
       - id: mixed-line-ending
       - id: trailing-whitespace
+        exclude_types: [svg]
   - repo: https://github.com/psf/black
     rev: 22.10.0
     hooks:
 
@@ -0,0 +1,2 @@
+# MyST build outputs
+_build
@@ -0,0 +1,4 @@
+version: 1
+project:
+  abbreviations:
+    TRE: Trusted Research Environment
@@ -0,0 +1,51 @@
+# Architecture
+
+## Tenancy
+
+![](../static/FRIDGE_tenancy.svg)
+
+- For researchers
+    - transparent proxy to fridge API
+- Security
+    - We feel the likelihood of "container breakout" is too high to let pod configuration be the _only_ barrier to privileged host access
+        - Arrived at the "two cluster" tenancy
+        - The isolated network is heavily restricted. The _only_ outbound connection possible is to the container repository, to fetch container images
+            - No other "external" connection (_e.g._ internet)
+            - No access to other machines, VMs, nodes, in the target infrastructure/cloud
+    - Separation of "admin" and "researcher" areas
+        - different proxies
+        - possible to isolate the TRE operators from TRE researchers
+    - Defence in depth
+        - at K8s
+            - PSS
+            - Network (Cilium)
+            - RBAC
+            - Data-at-rest encryption (protects from bad actors, compromise at infrastructure provider)
+        - at infrastructure
+            - network (vnet) isolation (out of band!)
+        - Compromising the host is very unlikely
+            - And even if it does happen, and privilege escalation occurs, there is no access to other networks
+            - No access to data from other projects
+            - The worst that can happen is a researcher trashes their own environment
+    - Connection between TRE and FRIDGE
+        - pub/private key
+
+## FRIDGE internal
+
+- Update figure to cover access/isolated clusters, remove unused components
+- Access cluster
+    - user stuff
+        - kube proxy (sshd)
+        - FRIDGE proxy (sshd)
+        - container repository (harbor)
+    - others
+        - Network policy (cilium)
+- Isolated cluster
+    - user stuff
+        - FRIDGE API (FastAPI)
+        - workflow manager (Argo Workflows)
+        - job namespace
+        - object storage (minio)
+        - block storage (longhorn/CSI driver PVCs)
+    - others
+        - Network policy (cilium)
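Reviewer sketch (not part of the diff): the tenancy notes above describe an isolated network whose only permitted outbound connection is to the container repository. The snippet below is a minimal model of that default-deny rule, written only to make the intent concrete. The host name `harbor.access-cluster.example` and the function `egress_allowed` are hypothetical; according to the notes, real enforcement would sit in the Cilium network policy and the infrastructure-level network isolation, not in application code.

```python
# Hypothetical allow-list: the only destination the isolated cluster may reach.
# The host name is an illustrative assumption, not taken from the FRIDGE docs.
ALLOWED_EGRESS_HOSTS = frozenset({"harbor.access-cluster.example"})


def egress_allowed(host: str) -> bool:
    """Return True only for destinations on the allow-list (default deny)."""
    # Normalise case and a trailing dot so equivalent names compare equal.
    normalised = host.strip().lower().rstrip(".")
    return normalised in ALLOWED_EGRESS_HOSTS


if __name__ == "__main__":
    for destination in ("harbor.access-cluster.example", "pypi.org", "10.0.0.12"):
        verdict = "allow" if egress_allowed(destination) else "deny"
        print(f"{destination}: {verdict}")
```

Everything not explicitly listed is denied, which matches the "no other external connection" and "no access to other machines" bullets.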
@@ -0,0 +1,75 @@
+---
+short_title: Introduction
+---
+# Architecture and Governance
+
+This section details the architecture and governance of a FRIDGE deployment.
+It explains the basic concepts underpinning FRIDGE, details of the governance roles and processes, and the technical architecture.
+
+## Executive Summary
+
+TREs can be constrained by the computing resources available to them.
+This could hinder research.
+FRIDGE enables the use of computing power from external resources, such as cloud or HPC, in an existing TRE.
+
+The approach to solving this problem is to extend the governance boundary of an existing TRE to the external resource.
+This is achieved by provisioning a secure enclave on the external infrastructure, into which the FRIDGE satellite TRE is deployed.
+In effect, a portion of the external system is borrowed by the TRE and brought under the control of the TRE and its existing governance and administrators.
+
+The advantage of this approach is that, because the FRIDGE deployment can formally be considered part of an existing TRE, there is no need to involve the external infrastructure provider in data sharing agreements or to rewrite existing agreements with data owners.
+
+## Key Concepts
+
+There are a number of concepts central to FRIDGE that are important to understand.
+
+:::{glossary}
+Trusted Research Environment
+: A secure computational environment designed for conducting research on sensitive data.
+  TREs must balance an appropriate level of security for the data being processed,
+  and usability to enable researchers to work effectively and efficiently.
+
+  FRIDGE assumes an existing TRE, with its own security controls and governance.
+  FRIDGE places few requirements on that TRE other than establishing a connection to a FRIDGE instance.
+
+Satellite TRE
+: An adjunct to an existing TRE that provides extra functionality or computational resources.
+  A satellite TRE is not a full TRE itself, and requires a home TRE to operate.
+
+  FRIDGE is an example of a satellite TRE.
+
+Roles
+: The operation of a FRIDGE instance depends on people, most likely split between multiple organisations.
+  Their responsibilities and access form a key part of FRIDGE's security and governance.
+  As such, we have defined [roles](#arch-roles) for FRIDGE and outlined their scope and purpose.
+  This scheme helps clarify who is responsible for infrastructure or processes irrespective of which organisation they belong to or what their job title is.
+
+Governance Boundary Extension
+: The integration of resources owned by an external organisation into the control and governance of an existing TRE.
+  Asserting the TRE's governance over the satellite TRE is essential for the operation of FRIDGE and for avoiding complex data sharing agreements between the {term}`Data Owner`, {term}`TRE Operator Organisation`, and {term}`FRIDGE Hosting Organisation`.
+  This arrangement is enabled by FRIDGE's technical and governance security controls.
+
+TRE Tenancy
+: The components that form the secure enclave into which the satellite TRE can be deployed.
+  The tenancy must ensure that data, processes and network traffic inside are opaque to:
+
+  1. the host system
+  1. other tenancies.
+
+Shared Responsibility
+: As the FRIDGE infrastructure is split between organisations, no single organisation takes sole responsibility for a FRIDGE instance and its operation.
+  The shared responsibility model defines the boundaries of each organisation's responsibility.
+  For efficient operation it is important that the {term}`TRE Operator Organisation` retains responsibility for data processing.
+  However, it is not possible to remove all responsibility from the {term}`FRIDGE Hosting Organisation`, for example in deploying a secure {term}`TRE Tenancy`.
+
+  The general principle of FRIDGE is to minimise the responsibility of the {term}`FRIDGE Hosting Organisation`.
+
+Defence in Depth
+: The use of multiple layers of security control so that if one control is broken or circumvented, the overall system remains secure.
+  The worst-case scenario is the unauthorised egress of data from a FRIDGE instance.
+  In FRIDGE we aim for no single point of failure and have ensured that no single failure would, on its own, lead to that outcome.
+
+Five Safes
+: The [Five Safes Framework](https://ukdataservice.ac.uk/help/secure-lab/what-is-the-five-safes-framework/) sets out a method for understanding the range of factors affecting data security in research.
+  It can be used as a guide to designing or assessing TREs.
+  The framework splits data security into five components: safe data, safe projects, safe people, safe settings, and safe outputs.
+:::
@@ -0,0 +1,20 @@
+---
+short_title: Lifecycle
+---
+# FRIDGE Lifecycle and Data Flow
+
+- Diagram of data flow
+    - immutable inputs
+    - working space backed by encrypted performant block storage
+    - write-only outputs deposited in bucket
+    - Workflow templates
+- Security
+    - Buckets for input and outputs
+    - Different permissions based on RBAC
+    - input
+        - API: RW
+        - Service AC (researchers): RO
+    - egress
+        - API: RW
+        - Service AC (researchers): WO
+    - inputs are immutable
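Reviewer sketch (not part of the diff): the bucket permissions listed above can be read as a small access matrix. The snippet below is one way to model it, under the assumption that "RW", "RO" and "WO" mean read-write, read-only and write-only. The principal names `api` and `researcher`, the bucket names `input` and `egress`, and the helper `is_allowed` are hypothetical labels chosen for illustration; the notes say the real permissions are applied through RBAC.

```python
# Access matrix taken from the lifecycle notes: (bucket, principal) -> actions.
# Bucket and principal names are illustrative labels, not real identifiers.
PERMISSIONS = {
    ("input", "api"): frozenset({"read", "write"}),   # API: RW
    ("input", "researcher"): frozenset({"read"}),     # researchers: RO
    ("egress", "api"): frozenset({"read", "write"}),  # API: RW
    ("egress", "researcher"): frozenset({"write"}),   # researchers: WO
}


def is_allowed(bucket: str, principal: str, action: str) -> bool:
    """Return True if the principal may perform the action on the bucket.

    Unknown buckets or principals receive no permissions (default deny).
    """
    return action in PERMISSIONS.get((bucket, principal), frozenset())


if __name__ == "__main__":
    for bucket, principal in sorted(PERMISSIONS):
        for action in ("read", "write"):
            verdict = "allow" if is_allowed(bucket, principal, action) else "deny"
            print(f"{bucket:7} {principal:11} {action:6} {verdict}")
```

In this model researchers cannot modify inputs and cannot read back what they deposit in the egress bucket, which is how the "immutable inputs" and "write-only outputs" bullets appear from the researcher side.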
@@ -0,0 +1,5 @@
+---
+short_title: Responsibility Mapping
+---
+
+# Responsibility and 5 Safes
@@ -0,0 +1,32 @@
+(arch-roles)=
+# Roles
+
+:::{glossary}
+Data Owner
+: …
+
+TRE Operator Organisation
+: …
+
+TRE Administrators
+: Members of {term}`TRE Operator Organisation`
+
+FRIDGE Hosting Organisation
+: …
+
+Hosting Provider Administrators
+: Members of {term}`FRIDGE Hosting Organisation`
+
+Resource Allocators
+: …
+
+Principal Investigators
+: …
+
+Safe Researchers
+: …
+
+Job Submitters
+: A subset of {term}`Safe Researchers`.
+  These Researchers are users of the TRE who are permitted to dispatch jobs to the FRIDGE instance.
+:::