301 changes: 301 additions & 0 deletions docs/docs/references/container-sources.md
@tylergraff
The tested changes that let nebari-config.yaml exhaustively override every Nebari container image were never merged, for the following reasons:

  • A more concise approach to mirroring images is available through containerd config overrides/imports, which does not require specifying the mirrored name of each individual container in nebari-config.yaml.
  • There was discussion of a possible migration from Helm to kustomize and kustomization files, which could render the Terraform/Helm override method of mirroring obsolete.

Instead, we mirror container images by pointing the containerd configs of the EKS nodes at default mirrors in private registries (e.g. ECR, GitLab), applied as overrides/imports.
The enabling PR for this approach was PR#2668, which added the ability to run a pre_bootstrap_command on nodes.

The following config options are examples of mirroring container images by customizing containerd on the k8s node:

# Set ECR as default container registry mirror
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      min_nodes: 1
      max_nodes: 1
      gpu: false
      single_subnet: false
      permissions_boundary:
      launch_template:
        pre_bootstrap_command: |
            #!/bin/bash
            # Verify that IP forwarding is enabled for worker nodes, as is required for containerd
            if [[ $(sysctl net.ipv4.ip_forward | grep "net.ipv4.ip_forward = 1") ]]; then echo "net.ipv4.ip_forward is on"; else sysctl -w net.ipv4.ip_forward=1; fi
            # Set ECR as default container registry mirror
            mkdir -p /etc/containerd/certs.d/_default
            ECR_TOKEN="$(aws ecr get-login-password --region us-east-1)"
            BASIC_AUTH="$(echo -n "AWS:$ECR_TOKEN" | base64 -w 0)"
            cat <<-EOT > /etc/containerd/certs.d/_default/hosts.toml
            [host."https://xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com"]
              capabilities = ["pull", "resolve"]
              [host."https://xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com".header]
                authorization = "Basic $BASIC_AUTH"
            EOT


# Set GitLab CR as default container registry mirror in hosts.toml; 
# must have override_path set if project/group names don't match upstream container
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      min_nodes: 1
      max_nodes: 1
      gpu: false
      single_subnet: false
      permissions_boundary:
      launch_template:
        pre_bootstrap_command: |
            #!/bin/bash
            # Verify that IP forwarding is enabled for worker nodes, as is required for containerd
            if [[ $(sysctl net.ipv4.ip_forward | grep "net.ipv4.ip_forward = 1") ]]; then echo "net.ipv4.ip_forward is on"; else sysctl -w net.ipv4.ip_forward=1; fi
            # Set default container registry mirror in hosts.toml; must have override_path set if project/group names don't match upstream container
            CONTAINER_REGISTRY_URL="gitlab-registry.link.net"
            CONTAINER_REGISTRY_USERNAME="project_2744_bot_xxxxxxxxxxxxxx"
            CONTAINER_REGISTRY_TOKEN="xxxxxxxxxxx"
            CONTAINER_REGISTRY_GROUP=as-nebari
            CONTAINER_REGISTRY_PROJECT=nebari-test
            mkdir -p /etc/containerd/certs.d/_default
            cat <<-EOT > /etc/containerd/certs.d/_default/hosts.toml
            [host."https://$CONTAINER_REGISTRY_URL/v2/$CONTAINER_REGISTRY_GROUP/$CONTAINER_REGISTRY_PROJECT"]
              override_path = true
              capabilities = ["pull", "resolve"]
            EOT
            # Set containerd registry config auth in config.d .toml import dir
            mkdir -p /etc/containerd/config.d
            cat <<EOT | sudo tee /etc/containerd/config.d/config-import.toml
            version = 2
            [plugins."io.containerd.grpc.v1.cri".registry]
              config_path = "/etc/containerd/certs.d:/etc/docker/certs.d"
              [plugins."io.containerd.grpc.v1.cri".registry.auths]
              [plugins."io.containerd.grpc.v1.cri".registry.configs]
                [plugins."io.containerd.grpc.v1.cri".registry.configs."$CONTAINER_REGISTRY_URL".auth]
                  username = "$CONTAINER_REGISTRY_USERNAME"
                  password = "$CONTAINER_REGISTRY_TOKEN"
            EOT


# Set GitLab CR as default container registry mirror in hosts.toml; 
# must have override_path set if project/group names don't match upstream container
# Also add/set GitLab Client SSL/TLS Certificate for Containerd
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      min_nodes: 1
      max_nodes: 1
      gpu: false
      single_subnet: false
      permissions_boundary:
      launch_template:
        pre_bootstrap_command: |
            #!/bin/bash
            # Verify that IP forwarding is enabled for worker nodes, as is required for containerd
            if [[ $(sysctl net.ipv4.ip_forward | grep "net.ipv4.ip_forward = 1") ]]; then echo "net.ipv4.ip_forward is on"; else sysctl -w net.ipv4.ip_forward=1; fi
            # Set default container registry mirror in hosts.toml; must have override_path set if project/group names don't match upstream container
            CONTAINER_REGISTRY_URL="gitlab-registry.link.net"
            CONTAINER_REGISTRY_USERNAME="project_2744_bot_xxxxxxxxxxxxxx"
            CONTAINER_REGISTRY_TOKEN="xxxxxxxxxxx"
            CONTAINER_REGISTRY_GROUP=as-nebari
            CONTAINER_REGISTRY_PROJECT=nebari-test
            mkdir -p /etc/containerd/certs.d/_default
            cat <<-EOT > /etc/containerd/certs.d/_default/hosts.toml
            [host."https://$CONTAINER_REGISTRY_URL/v2/$CONTAINER_REGISTRY_GROUP/$CONTAINER_REGISTRY_PROJECT"]
              override_path = true
              capabilities = ["pull", "resolve"]
              client = ["/etc/containerd/certs.d/$CONTAINER_REGISTRY_URL/client.pem"]
            EOT
            # Set containerd registry config auth in config.d .toml import dir
            mkdir -p /etc/containerd/config.d
            cat <<EOT | sudo tee /etc/containerd/config.d/config-import.toml
            version = 2
            [plugins."io.containerd.grpc.v1.cri".registry]
              config_path = "/etc/containerd/certs.d:/etc/docker/certs.d"
              [plugins."io.containerd.grpc.v1.cri".registry.auths]
              [plugins."io.containerd.grpc.v1.cri".registry.configs]
                [plugins."io.containerd.grpc.v1.cri".registry.configs."$CONTAINER_REGISTRY_URL".auth]
                  username = "$CONTAINER_REGISTRY_USERNAME"
                  password = "$CONTAINER_REGISTRY_TOKEN"
            EOT
            # Add client key/cert to containerd
            mkdir -p /etc/containerd/certs.d/$CONTAINER_REGISTRY_URL
            cat <<-EOT >> /etc/containerd/certs.d/$CONTAINER_REGISTRY_URL/client.pem
            -----BEGIN CERTIFICATE-----
            XzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzZx
            ZxyzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzXz
            -----END CERTIFICATE-----
            -----BEGIN PRIVATE KEY-----
            XzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzZx
            ZxyzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzXz
            -----END PRIVATE KEY-----
            EOT
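
The three examples above repeat the same hosts.toml generation step. As a minimal sketch, that step could be factored into a helper and dry-run against a temporary directory before it goes into pre_bootstrap_command. The function name `write_default_mirror` and its arguments are hypothetical and not part of Nebari or containerd; the mirror URL reuses the placeholder values from the GitLab example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a containerd hosts.toml that makes a mirror
# the default registry host.
# $1: certs.d directory (on a node this would be /etc/containerd/certs.d)
# $2: mirror URL
# $3: "true" to emit override_path (optional, defaults to "false")
write_default_mirror() {
  local certs_dir="$1" mirror_url="$2" override_path="${3:-false}"
  mkdir -p "$certs_dir/_default"
  {
    printf '[host."%s"]\n' "$mirror_url"
    if [ "$override_path" = "true" ]; then
      printf '  override_path = true\n'
    fi
    printf '  capabilities = ["pull", "resolve"]\n'
  } > "$certs_dir/_default/hosts.toml"
}

# Dry run against a temporary directory instead of /etc/containerd/certs.d.
demo_dir="$(mktemp -d)"
write_default_mirror "$demo_dir" "https://gitlab-registry.link.net/v2/as-nebari/nebari-test" true
cat "$demo_dir/_default/hosts.toml"
```

Writing to a scratch directory first makes it possible to inspect the generated file without touching a node's containerd configuration.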

@@ -0,0 +1,301 @@
## Deploying and Running Nebari from a Private Container Repository

Nebari deploys and runs FOSS components as containers running in Kubernetes.
By default, Nebari sources each container from the container's respective public repository, typically `docker.io` or `quay.io`.
This introduces supply-chain concerns for security-focused customers.

One solution to these supply-chain concerns is to deploy Nebari from private locally-mirrored containers:

- Create a controlled private container repository (e.g. ECR or GitLab Container Repo)
- Mirror all containers used by Nebari into this private container repository
- Use the `overrides` mechanism in `nebari-config.yaml` to specify the mirrored container sources

Deploying Nebari in this fashion removes a significant part of the supply-chain attack surface, but it requires identifying every container that Nebari uses.
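
The mirroring step can be sketched as a small script. This is an illustration under stated assumptions, not a Nebari tool: the helper names `mirror_ref` and `print_mirror_commands` are hypothetical, the default value of `LOCAL_REPO` is a placeholder, and the sketch assumes the mirrored name keeps the full upstream reference (registry host included) under `[LOCAL_REPO]`, as most entries in the configuration below do. The script only prints the `docker` commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder registry; substitute your own private registry host/path.
LOCAL_REPO="${LOCAL_REPO:-registry.example.internal/nebari}"

# Map an upstream image reference to its mirrored name by prefixing LOCAL_REPO.
mirror_ref() {
  printf '%s/%s\n' "$LOCAL_REPO" "$1"
}

# Print (do not run) the docker commands that would mirror one image.
print_mirror_commands() {
  local src="$1" dst
  dst="$(mirror_ref "$src")"
  printf 'docker pull %s\n' "$src"
  printf 'docker tag %s %s\n' "$src" "$dst"
  printf 'docker push %s\n' "$dst"
}

# A few of the images listed in the configuration below.
for image in \
  quay.io/nebari/nebari-jupyterhub:2024.5.1 \
  quay.io/nebari/nebari-jupyterlab:2024.5.1 \
  quay.io/nebari/nebari-dask-worker:2024.5.1
do
  print_mirror_commands "$image"
done
```

Some registries require the target repository to exist before the first push, so the printed commands may need a preceding repository-creation step.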

The following configuration enumerates all container images used by Nebari 2024-9-1 and demonstrates how to source them from a private repo denoted by the string `[LOCAL_REPO]`.
The commented-out elements document the original public sources from which the container images are to be mirrored.

### Nebari 2024-9-1 Containers

```
default_images:
#jupyterhub: quay.io/nebari/nebari-jupyterhub:2024.5.1
jupyterhub: [LOCAL_REPO]/quay.io/nebari/nebari-jupyterhub:2024.5.1
#jupyterlab: quay.io/nebari/nebari-jupyterlab:2024.5.1
jupyterlab: [LOCAL_REPO]/quay.io/nebari/nebari-jupyterlab:2024.5.1
#dask_worker: quay.io/nebari/nebari-dask-worker:2024.5.1
dask_worker: [LOCAL_REPO]/quay.io/nebari/nebari-dask-worker:2024.5.1

security:
keycloak:
overrides:
image:
# Keycloak image repository
#repository: quay.io/keycloak/keycloak # default
repository: [LOCAL_REPO]/quay.io/keycloak/keycloak
# Overrides the Keycloak image tag whose default is the chart version
#tag: "15.0.2" # default
tag: ""

# This container is used at deploy-time to download keycloak-metrics-spi
extraInitContainers: |
- command:
- sh
- -c
- | wget --no-check-certificate https://github.com/aerogear/keycloak-metrics-spi/releases/download/2.5.3/keycloak-metrics-spi-2.5.3.jar -P /data/ &&
export SHA256SUM=9b3f52f842a66dadf5ff3cc3a729b8e49042d32f84510a5d73d41a2e39f29a96 &&
if ! (echo "$SHA256SUM /data/keycloak-metrics-spi-2.5.3.jar" | sha256sum -c)
then
echo "Error: Checksum not verified" && exit 1
else
chown 1000:1000 /data/keycloak-metrics-spi-2.5.3.jar &&
chmod 777 /data/keycloak-metrics-spi-2.5.3.jar
fi
image: [LOCAL_REPO]/alpine:latest
name: initialize-spi-metrics-jar
pgchecker:
image:
# repository: docker.io/busybox
repository: [LOCAL_REPO]/docker.io/busybox
tag: 1.32
postgresql:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: bitnami/postgresql
repository: docker.io/bitnami/postgresql
tag: 11.11.0-debian-10-r31
digest: ""

cluster_autoscaler:
overrides:
image:
#repository: k8s.gcr.io/autoscaling/cluster-autoscaler
repository: [LOCAL_REPO]/k8s.gcr.io/autoscaling/cluster-autoscaler
tag: v1.23.0

ingress:
traefik-image:
image: [LOCAL_REPO]/traefik
tag: 2.9.1

conda_store:
image: [LOCAL_REPO]/quansight/conda-store-server
image_tag: 2024.3.1

conda_store:
nfs_server_image: [LOCAL_REPO]/gcr.io/google_containers/volume-nfs
nfs_server_image_tag: "0.8"
overrides:
minio:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: bitnami/minio
repository: docker.io/bitnami/minio
tag: 2021.4.22-debian-10-r0
postgresql:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: bitnami/postgresql
repository: docker.io/bitnami/postgresql
tag: 11.14.0-debian-10-r17
digest: ""
redis:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: bitnami/redis
repository: docker.io/bitnami/redis
tag: 7.0.4-debian-11-r4
digest: ""

argo_workflows:
overrides:
controller:
image:
#registry: quay.io
registry: [LOCAL_REPO]
#repository: argoproj/workflow-controller
repository: quay.io/argoproj/workflow-controller
tag: ""
server:
image:
#registry: quay.io
registry: [LOCAL_REPO]
#repository: argoproj/argocli
repository: quay.io/argoproj/argocli
tag: "v3.4.4"
nebari_workflow_controller:
enabled: true
image: [LOCAL_REPO]/quay.io/nebari/nebari-workflow-controller
image_tag: 2024.5.1

monitoring:
overrides:
prometheus:
alertmanager:
alertmanagerSpec:
## Image of Alertmanager
image:
#registry: quay.io
#repository: prometheus/alertmanager
registry: [LOCAL_REPO]
repository: quay.io/prometheus/alertmanager
tag: v0.27.0
sha: ""
grafana:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: grafana/grafana
repository: docker.io/grafana/grafana
tag: ""
sha: ""
pullPolicy: IfNotPresent
sidecar:
image:
#registry: quay.io
registry: [LOCAL_REPO]
#repository: kiwigrid/k8s-sidecar
repository: quay.io/kiwigrid/k8s-sidecar
tag: 1.26.1
sha: ""
prometheusOperator:
image:
#registry: quay.io
registry: [LOCAL_REPO]
#repository: prometheus-operator/prometheus-operator
repository: quay.io/prometheus-operator/prometheus-operator
tag: ""
sha: ""
prometheusConfigReloader:
image:
#registry: quay.io
registry: [LOCAL_REPO]
#repository: prometheus-operator/prometheus-config-reloader
repository: quay.io/prometheus-operator/prometheus-config-reloader
tag: ""
sha: ""
kube-state-metrics:
image:
#registry: registry.k8s.io
registry: [LOCAL_REPO]
#repository: kube-state-metrics/kube-state-metrics
repository: registry.k8s.io/kube-state-metrics/kube-state-metrics
tag: ""
sha: ""
pullPolicy: IfNotPresent
prometheus-node-exporter:
image:
#registry: quay.io
registry: [LOCAL_REPO]
#repository: prometheus/node-exporter
repository: quay.io/prometheus/node-exporter
tag: ""
pullPolicy: IfNotPresent
digest: ""
prometheus:
prometheusSpec:
image:
#registry: quay.io
registry: [LOCAL_REPO]
#repository: prometheus/prometheus
repository: quay.io/prometheus/prometheus
tag: v2.51.2
sha: ""
loki:
loki:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: grafana/loki
repository: docker.io/grafana/loki
tag: null
lokiCanary:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: grafana/loki-canary
repository: docker.io/grafana/loki-canary
tag: null
gateway:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: nginxinc/nginx-unprivileged
repository: docker.io/nginxinc/nginx-unprivileged
tag: 1.24-alpine
sidecar:
image:
#repository: kiwigrid/k8s-sidecar
repository: [LOCAL_REPO]/kiwigrid/k8s-sidecar
tag: 1.24.3
promtail:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: grafana/promtail
repository: docker.io/grafana/promtail
tag: null
minio:
image:
#registry: docker.io
registry: [LOCAL_REPO]
#repository: bitnami/minio
repository: docker.io/bitnami/minio
tag: 2021.4.22-debian-10-r0

jupyterhub:
#volume_mount_init_image: "busybox:1.31"
volume_mount_init_image: [LOCAL_REPO]/busybox:1.31
proxy:
chp:
image:
#name: quay.io/jupyterhub/configurable-http-proxy
name: [LOCAL_REPO]/quay.io/jupyterhub/configurable-http-proxy
tag: 4.6.1
scheduling:
userScheduler:
enabled: true
image:
#name: registry.k8s.io/kube-scheduler
name: [LOCAL_REPO]/registry.k8s.io/kube-scheduler
tag: "v1.28.10"
singleuser:
networkTools:
image:
#name: quay.io/jupyterhub/k8s-network-tools
name: [LOCAL_REPO]/quay.io/jupyterhub/k8s-network-tools
tag: 4.0.0-0.dev.git.6548.h9b2dfe22
prePuller:
pause:
image:
#name: registry.k8s.io/pause
name: [LOCAL_REPO]/registry.k8s.io/pause
tag: "3.10"
jupyterhub_ssh:
jupyterhub_ssh_image:
name: [LOCAL_REPO]/quay.io/jupyterhub-ssh/ssh
tag: 0.0.1-0.dev.git.136.ha610981
jupyterhub_sftp_image:
name: [LOCAL_REPO]/quay.io/jupyterhub-ssh/sftp
tag: 0.0.1-0.dev.git.142.h402a3d6

dask_gateway:
dask_gateway_image:
#name: ghcr.io/dask/dask-gateway-server
name: [LOCAL_REPO]/ghcr.io/dask/dask-gateway-server
tag: "2022.4.0"
dask_controller_image:
#name: ghcr.io/dask/dask-gateway-server
name: [LOCAL_REPO]/ghcr.io/dask/dask-gateway-server
tag: "2022.4.0"

forward_auth:
traefik_forwardauth_image:
#name: maxisme/traefik-forward-auth
name: [LOCAL_REPO]/maxisme/traefik-forward-auth
tag: "sha-a98e568"
```
61 changes: 61 additions & 0 deletions docs/docs/references/enhanced-security.md
@joneszc joneszc Oct 28, 2024

@tylergraff
The following feature (lines 9-15), although tested, was never merged:

amazon_web_services:
  ec2_keypair_name: [example_keypair_name] # Name, not ARN

The following feature, amazon_web_services.extra_ssl_certificates (lines 17-28), was tested but not merged:

  extra_ssl_certificates: |
    -----BEGIN CERTIFICATE-----
    MIIF...<snip>...ABCD
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    MIIF...<snip>...EF01
    -----END CERTIFICATE-----

Instead, the same result can be achieved since PR#2668 as follows:

# Add client certificate to CA trust on node
amazon_web_services:
  node_groups:
    general:
      instance: m5.2xlarge
      min_nodes: 1
      max_nodes: 1
      gpu: false
      single_subnet: false
      permissions_boundary:
      launch_template:
        pre_bootstrap_command: |
            #!/bin/bash
            cat <<-EOT >> /etc/pki/ca-trust/source/anchors/client.pem
            -----BEGIN CERTIFICATE-----
            XzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzZx
            ZxyzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxxzxzxzxzxzxzxzxzxzxxzxzXz
            -----END CERTIFICATE-----
            EOT
            sudo update-ca-trust extract

Also, the private EKS endpoint configuration feature (lines 30-36) was implemented in PR#2618, but it must be configured as follows, with a string value that is one of [public, private, public_and_private]:

amazon_web_services:
  eks_endpoint_access: private
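
For anyone converting the two boolean flags from the draft doc to the single string, the mapping could be sketched as below. The mapping is an assumption inferred from the option names, not taken from PR#2618, and the helper name `eks_endpoint_access_value` is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: map the two boolean flags from the draft doc to the
# single eks_endpoint_access string (assumed mapping, inferred from names).
# $1: private access (true/false), $2: public access (true/false)
eks_endpoint_access_value() {
  case "$1:$2" in
    true:false) echo "private" ;;
    false:true) echo "public" ;;
    true:true)  echo "public_and_private" ;;
    *)
      echo "no eks_endpoint_access value for private=$1 public=$2" >&2
      return 1
      ;;
  esac
}

# The draft doc used private=true, public=false.
printf 'amazon_web_services:\n  eks_endpoint_access: %s\n' \
  "$(eks_endpoint_access_value true false)"
```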

@@ -0,0 +1,61 @@
## Nebari Security Considerations

The security of _AWS Nebari_ deployments can be enhanced through the following deployment configuration options in `nebari-config.yaml`:

- **Explicit definition of container sources**
This option allows for the use of locally mirrored, security-hardened, or otherwise customized container images in place of the containers used by default.
See: [container-sources](container-sources.md)

- **Definition of an ssh key that can access EKS hosts**
EKS hosts by default cannot be accessed via ssh. This configuration item allows ssh access into EKS hosts, which can be useful for troubleshooting or external monitoring and auditing purposes.

```
amazon_web_services:
@tylergraff
The following feature (lines 9-15), although tested, was never merged:

amazon_web_services:
  ec2_keypair_name: [example_keypair_name] # Name, not ARN

Contributor Author

fixed!

ec2_keypair_name: [example_keypair_name] # Name, not ARN
```

- **Installation of custom SSL certificate(s) into EKS hosts**
Install private certificates, such as those used by in-line content inspection engines that re-encrypt traffic.

```
extra_ssl_certificates: |
-----BEGIN CERTIFICATE-----
MIIF...<snip>...ABCD
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIF...<snip>...EF01
-----END CERTIFICATE-----
```

- **Private EKS endpoint configuration**
Mirrors the corresponding AWS console option, which routes all EKS traffic within the VPC.

```
eks_endpoint_private_access: true
eks_endpoint_public_access: false
```

- **Deploy into existing subnets**
Instructs Nebari to deploy into existing subnets rather than creating new ones.

```
existing_subnet_ids:
- subnet-0123456789abcdef
- subnet-abcdef0123456789
existing_security_group_id: sg-0123456789abcdef
ingress:
terraform_overrides:
load-balancer-annotations:
service.beta.kubernetes.io/aws-load-balancer-internal: "true"
@tylergraff
I think a note should be added here to clarify that the load balancer scheme should be set to internal only when Nebari is pointed at private subnets.

# Ensure the subnet IDs are also set below
service.beta.kubernetes.io/aws-load-balancer-subnets: "subnet-0123456789abcdef,subnet-abcdef0123456789"
```
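
Because the same subnet IDs appear both in `existing_subnet_ids` and in the load-balancer annotation, one way to keep them in sync is to generate both fragments from a single list. This is a minimal sketch; the helper name `join_subnet_ids` is hypothetical and the subnet IDs are the placeholders from the example above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Join subnet IDs with commas, the format used by the
# aws-load-balancer-subnets annotation in the example above.
join_subnet_ids() {
  local IFS=,
  printf '%s\n' "$*"
}

subnet_ids=(subnet-0123456789abcdef subnet-abcdef0123456789)

# Emit both YAML fragments from the same list.
printf 'existing_subnet_ids:\n'
printf -- '- %s\n' "${subnet_ids[@]}"
printf 'service.beta.kubernetes.io/aws-load-balancer-subnets: "%s"\n' \
  "$(join_subnet_ids "${subnet_ids[@]}")"
```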

- **Use existing SSL certificate**
Instructs Nebari to use the SSL certificate specified by `[k8s-custom-secret-name]`.

```
certificate:
type: existing
secret_name: [k8s-custom-secret-name]
```
4 changes: 3 additions & 1 deletion docs/docs/references/index.mdx
@@ -11,6 +11,8 @@ import {useCurrentSidebarCategory} from '@docusaurus/theme-common';
 />
 </div>

-Nitty-gritty technical descriptions of how Nebari works.
+Technical descriptions of how Nebari works.

+- [Enhanced Security](enhanced-security.md) - Nebari security configuration guide
+- [Local Container Repo](container-sources.md) - Deploying Nebari from a Local Container Repo
 <DocCardList items={useCurrentSidebarCategory().items}/>