
Install CUDA Toolkit 12.8

Installation link: https://developer.nvidia.com/cuda-12-8-0-download-archive

Base Installer

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-ubuntu2404.pin
sudo mv cuda-ubuntu2404.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda-repo-ubuntu2404-12-8-local_12.8.0-570.86.10-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2404-12-8-local_12.8.0-570.86.10-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2404-12-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8
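As a quick sanity check, nvcc should report release 12.8 once the toolkit is installed. A small guarded sketch (assuming the toolkit's default /usr/local/cuda layout, and falling back gracefully if the "Add to Path" step has not been done yet):

```shell
# Sanity check: nvcc should report "release 12.8" once the toolkit is installed
# and on the PATH; otherwise try the default install location.
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version | grep release
elif [ -x /usr/local/cuda/bin/nvcc ]; then
  /usr/local/cuda/bin/nvcc --version | grep release
else
  echo "nvcc not found -- toolkit not installed or not on PATH yet"
fi
```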

Driver Installer

sudo apt-get install -y cuda-drivers

Add to Path

CUDA_PATH=/usr/local/cuda
CUDA_PATH_LINE="export PATH=\$CUDA_PATH/bin:\$PATH"
CUDA_LD_LIBRARY_PATH_LINE="export LD_LIBRARY_PATH=\$CUDA_PATH/lib64:\$LD_LIBRARY_PATH"
echo "CUDA_PATH=${CUDA_PATH}" >> ~/.bashrc
echo "$CUDA_PATH_LINE" >> ~/.bashrc
echo "$CUDA_LD_LIBRARY_PATH_LINE" >> ~/.bashrc
source ~/.bashrc
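Note that the snippet above appends to ~/.bashrc unconditionally, so re-running it adds duplicate lines. A minimal idempotent variant (same paths; `add_line` is a hypothetical helper name, not part of any tool):

```shell
# Append each export to ~/.bashrc only if an identical line is not already there.
RC="$HOME/.bashrc"
add_line() {
  grep -qxF "$1" "$RC" 2>/dev/null || echo "$1" >> "$RC"
}
add_line 'export CUDA_PATH=/usr/local/cuda'
add_line 'export PATH=$CUDA_PATH/bin:$PATH'
add_line 'export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH'
```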

Create a Kubernetes Cluster using Containerd Runtime

Source: Step-by-Step Guide to Creating a Kubernetes Cluster on Ubuntu 22.04 Using Containerd Runtime

CP: apply on the control-plane node only
W: apply on the worker node only
CP-W: apply on both control-plane and worker nodes

1. Disable Ubuntu Firewall/ CP-W

sudo ufw disable

Verify the status with:

sudo ufw status

2. Update and upgrade the system/ CP-W

sudo apt update
sudo apt -y full-upgrade

3. Enable time-sync with an NTP server/ CP-W

sudo apt install systemd-timesyncd
sudo timedatectl set-ntp true

Check status with

sudo timedatectl status

and verify that the NTP service is marked as active: "NTP service: active"

4. Turn off the swap/ CP-W

sudo swapoff -a
sudo sed -i.bak -r 's/(.+ swap .+)/#\1/' /etc/fstab

Check the status with the free -m command.

free -m

Swap values should now read 0. Check the fstab file as well; otherwise swap will be turned on again automatically on reboot.

grep swap /etc/fstab

The 'swapfile' entry should be commented out.
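The same check can be scripted: /proc/meminfo reports SwapTotal in kB, and it should read 0 once swap is fully off. A small sketch, no root required:

```shell
# SwapTotal from /proc/meminfo is 0 kB when swap is fully disabled.
swap_kb=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
if [ "$swap_kb" -eq 0 ]; then
  echo "swap is off"
else
  echo "swap still reports ${swap_kb} kB -- re-check /etc/fstab"
fi
```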

5. Configure required kernel modules/ CP-W

sudo nano /etc/modules-load.d/k8s.conf

Add the following content to the file, save and close it

overlay
br_netfilter

Load the modules above into the current session

sudo modprobe overlay
sudo modprobe br_netfilter

Check the status

lsmod | grep "overlay\|br_netfilter"
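For a scripted check, /proc/modules can be read without root (note: modules built directly into the kernel do not appear there, so a miss here is not always fatal):

```shell
# Report load status for each required module by reading /proc/modules.
for m in overlay br_netfilter; do
  if grep -qw "^${m}" /proc/modules; then
    echo "${m}: loaded"
  else
    echo "${m}: NOT loaded"
  fi
done
```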

6. Configure network parameters/ CP-W

sudo nano /etc/sysctl.d/k8s.conf

Add the following content to the file, save and close it

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

Apply the newly added network params

sudo sysctl --system
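To confirm the values took effect, the live settings can be read back from /proc/sys without root (the bridge keys only exist once br_netfilter is loaded, hence the fallback):

```shell
# Each value should print 1 once the new settings are applied.
cat /proc/sys/net/ipv4/ip_forward
cat /proc/sys/net/bridge/bridge-nf-call-iptables 2>/dev/null \
  || echo "bridge sysctls absent -- load br_netfilter first"
```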

7. Install necessary software tools to continue/ CP-W

sudo apt-get install -y apt-transport-https ca-certificates curl gpg gnupg2 software-properties-common

Now both nodes are ready to install the Kubernetes tools and runtime.

8. Install Kubernetes Tools/ CP-W

8.1 Add Kubernetes repository and keys

First, check whether the /etc/apt/keyrings directory is present on your nodes. If not, create the directory using the command below

sudo mkdir -p -m 755 /etc/apt/keyrings

8.2 Download and add the k8s repository key

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

8.3 Add the Kubernetes repository in the source list

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

8.4 Update the package manager and install Kubernetes tools

sudo apt update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
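You can confirm the hold took effect with apt-mark showhold; a guarded sketch that degrades gracefully on systems without apt:

```shell
# All three packages should be listed as held so routine upgrades skip them.
if command -v apt-mark >/dev/null 2>&1; then
  apt-mark showhold | grep -E '^(kubelet|kubeadm|kubectl)$' \
    || echo "kubernetes packages are not held yet"
else
  echo "apt-mark not available on this system"
fi
```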

9. Install containerd runtime/ CP-W

9.1 Add the containerd repository key and add the repository to the source list

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu "$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

9.2 Install containerd

sudo apt update
sudo apt install -y containerd.io

9.3 Configure containerd

sudo mkdir -p /etc/containerd

Generate the default config toml file

sudo containerd config default | sudo tee /etc/containerd/config.toml

Open the generated file in any text editor and verify the following settings:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"  # <- note that this line might have been missed
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true # <- note that this could be set as false in the default configuration, please set it to true

Additionally, at the beginning of this config file, verify that "disabled_plugins" is set to an empty list ([]); if "cri" appears in that list, remove it, since Kubernetes needs the CRI plugin enabled. Save the config file, close it, and restart the containerd service

sudo systemctl restart containerd
sudo systemctl enable containerd
systemctl status containerd

9.4 Set up crictl for inspecting containers

sudo apt install cri-tools
sudo nano /etc/crictl.yaml

Add the following content to the file, save and exit

runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 2
debug: true
pull-image-on-create: false

9.5 Enable kubelet service

sudo systemctl enable kubelet

10. Initialising the Control-Plane Node/ CP

10.1 Pull necessary Kubernetes images

sudo kubeadm config images pull --cri-socket unix:///var/run/containerd/containerd.sock

10.2 Initialise the Control-Plane

sudo kubeadm init \
  --pod-network-cidr=10.244.0.0/16 \
  --cri-socket unix:///var/run/containerd/containerd.sock \
  --v=5

Do not skip this step!

 mkdir -p $HOME/.kube
 sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
 sudo chown $(id -u):$(id -g) $HOME/.kube/config

Optional: Remove taints from the node

kubectl taint nodes --all node-role.kubernetes.io/control-plane-

This enables us to schedule workloads on the node, even though it's part of the control-plane.

10.3 Add network add-on

kubectl apply -f infrastructure/install/kube-flannel.yaml

11. Join the Worker Node to the Kubernetes Control-Plane

11.1 Generate the join command/ CP

kubeadm token create --print-join-command

11.2 Run the generated command on the Worker Node/ W

11.3 Verify using the following command/ CP

kubectl get nodes -o wide

Install the NVIDIA Container Toolkit

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

Install K8s Device Plugin

1. Preparing GPU Nodes

This must be done on all GPU nodes in the cluster!

sudo nvidia-ctk runtime configure --runtime=containerd --set-as-default
sudo systemctl restart containerd

2. Enabling GPU Support in Kubernetes

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml

3. Verify by running a GPU job

kubectl apply -f infrastructure/install/gpu-pod.yaml
kubectl logs gpu-pod

Expected result:

[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

Install Helm

curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
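Verify the install (guarded, in case helm did not land on the PATH):

```shell
# helm version --short prints the installed client version, e.g. v3.x.y.
if command -v helm >/dev/null 2>&1; then
  helm version --short
else
  echo "helm not found on PATH"
fi
```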

Install KEDA

1. Add Helm repo

helm repo add kedacore https://kedacore.github.io/charts

2. Update Helm repo

helm repo update

3. Install the KEDA Helm chart

helm install keda kedacore/keda --namespace keda --create-namespace