| 1 | +--- |
| 2 | +title: Deploy the Vector Database |
| 3 | +linkTitle: 8. Deploy the Vector Database |
| 4 | +weight: 8 |
| 5 | +time: 10 minutes |
| 6 | +--- |
| 7 | + |
| 8 | +In this step, we'll deploy a vector database to the AI POD and populate it with |
| 9 | +test data. |
| 10 | + |
| 11 | +## What is a Vector Database? |
| 12 | + |
| 13 | +A vector database stores and indexes data as numerical "vector embeddings," which capture |
| 14 | +the semantic meaning of information like text or images. Unlike a traditional database, |
| 15 | +it excels at "similarity searches," finding conceptually related data points rather |
| 16 | +than exact matches. |
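
As a minimal sketch of what a similarity search does, the example below ranks a few documents by cosine similarity to a query vector. The document names and the three-dimensional embeddings are hypothetical and chosen only for illustration; real embedding models produce vectors with hundreds of dimensions or more, and a vector database such as Weaviate uses an index rather than the brute-force scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, documents, top_k=1):
    """Return the names of the top_k documents, most similar first."""
    ranked = sorted(
        documents,
        key=lambda name: cosine_similarity(query, documents[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical embeddings, for illustration only.
DOCUMENTS = {
    "gpu-troubleshooting": [0.9, 0.1, 0.0],
    "office-lunch-menu": [0.0, 0.2, 0.9],
    "gpu-driver-install": [0.8, 0.3, 0.1],
}

query_embedding = [1.0, 0.2, 0.0]
print(most_similar(query_embedding, DOCUMENTS, top_k=2))
```

The two GPU-related documents rank ahead of the unrelated one even though no exact keyword match is involved, which is the behavior the paragraph above describes.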
| 17 | + |
| 18 | +## How is a Vector Database Used? |
| 19 | + |
| 20 | +Vector databases play a key role in a pattern called |
| 21 | +Retrieval Augmented Generation (RAG), which is widely used by |
| 22 | +applications that leverage Large Language Models (LLMs). |
| 23 | + |
| 24 | +The pattern is as follows: |
| 25 | + |
| 26 | +* The end-user asks the application a question |
| 27 | +* The application takes the question and calculates a vector embedding for it |
| 28 | +* The app then performs a similarity search, looking for related documents in the vector database |
| 29 | +* The app then takes the original question and the related documents, and sends them to the LLM as context |
| 30 | +* The LLM reviews the context and returns a response to the application |
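
The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the workshop application: `embed` counts words from a made-up vocabulary instead of calling an embedding model, the document list stands in for the vector database, and `fake_llm` is a placeholder for the real LLM call.

```python
import math

# Hypothetical toy vocabulary; a real app would call an embedding model.
VOCABULARY = ["gpu", "memory", "lunch", "menu", "driver"]


def embed(text):
    """Toy embedding: count how often each vocabulary word appears."""
    words = text.lower().replace("?", "").split()
    return [float(words.count(term)) for term in VOCABULARY]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def retrieve(question, documents, top_k=1):
    """Similarity search: return the documents closest to the question."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:top_k]


def build_prompt(question, context_docs):
    """Combine the original question with the retrieved documents."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def fake_llm(prompt):
    """Placeholder for the LLM call; it only reports the prompt size."""
    return f"(LLM response based on {len(prompt)} characters of prompt)"


DOCS = [
    "Free GPU memory is reported by the driver",
    "The lunch menu changes every week",
]
question = "How much GPU memory is free?"
context = retrieve(question, DOCS)
print(fake_llm(build_prompt(question, context)))
```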
| 31 | + |
| 32 | +## Deploy a Vector Database |
| 33 | + |
| 34 | +For the workshop, we'll deploy an open-source vector database named |
| 35 | +[Weaviate](https://weaviate.io/). |
| 36 | + |
| 37 | +First, add the Weaviate Helm repository that contains the Weaviate Helm chart: |
| 38 | + |
| 39 | +``` bash |
| 40 | +helm repo add weaviate https://weaviate.github.io/weaviate-helm |
| 41 | +helm repo update |
| 42 | +``` |
| 43 | + |
| 44 | +The `weaviate/weaviate-values.yaml` file includes the configuration we'll use to deploy |
| 45 | +the Weaviate vector database. |
| 46 | + |
| 47 | +We've set the following environment variables to `true` so that Weaviate exposes |
| 48 | +metrics that we can scrape later with the Prometheus receiver: |
| 49 | + |
| 50 | +```` |
| 51 | + PROMETHEUS_MONITORING_ENABLED: true |
| 52 | + PROMETHEUS_MONITORING_GROUP: true |
| 53 | +```` |
| 54 | + |
| 55 | +Review the [Weaviate documentation](https://docs.weaviate.io/deploy/installation-guides/k8s-installation) |
| 56 | +to explore additional customization options. |
| 57 | + |
| 58 | +Let's create a new namespace: |
| 59 | + |
| 60 | +``` bash |
| 61 | +oc create namespace weaviate |
| 62 | +``` |
| 63 | + |
| 64 | +Then deploy Weaviate: |
| 65 | + |
| 66 | +``` bash |
| 67 | +helm upgrade --install \ |
| 68 | + "weaviate" \ |
| 69 | + weaviate/weaviate \ |
| 70 | + --namespace "weaviate" \ |
| 71 | + --values ./weaviate/weaviate-values.yaml |
| 72 | +``` |
| 73 | + |
| 74 | +## Capture Weaviate Metrics with Prometheus |
| 75 | + |
| 76 | +Now that Weaviate is installed in our OpenShift cluster, let's modify the |
| 77 | +OpenTelemetry collector configuration to scrape Weaviate's Prometheus |
| 78 | +metrics. |
| 79 | + |
| 80 | +To do so, let's add an additional Prometheus receiver to the `otel-collector-values.yaml` file: |
| 81 | + |
| 82 | +``` yaml |
| 83 | +  prometheus/weaviate: |
| 84 | +    config: |
| 85 | +      config: |
| 86 | +        scrape_configs: |
| 87 | +          - job_name: weaviate-metrics |
| 88 | +            scrape_interval: 10s |
| 89 | +            static_configs: |
| 90 | +              - targets: |
| 91 | +                  - '`endpoint`:2112' |
| 92 | +    rule: type == "pod" && labels["app"] == "weaviate" |
| 93 | +``` |
| 94 | + |
| 95 | +We'll need to ensure that Weaviate's metrics are added to the `filter/metrics_to_be_included` filter |
| 96 | +processor configuration as well: |
| 97 | + |
| 98 | +``` yaml |
| 99 | +  processors: |
| 100 | +    filter/metrics_to_be_included: |
| 101 | +      metrics: |
| 102 | +        # Include only metrics used in charts and detectors |
| 103 | +        include: |
| 104 | +          match_type: strict |
| 105 | +          metric_names: |
| 106 | +            - DCGM_FI_DEV_FB_FREE |
| 107 | +            - ... |
| 108 | +            - object_count |
| 109 | +            - vector_index_size |
| 110 | +            - vector_index_operations |
| 111 | +            - vector_index_tombstones |
| 112 | +            - vector_index_tombstone_cleanup_threads |
| 114 | +            - requests_total |
| 115 | +            - objects_durations_ms_sum |
| 116 | +            - objects_durations_ms_count |
| 117 | +            - batch_delete_durations_ms_sum |
| 118 | +            - batch_delete_durations_ms_count |
| 119 | +``` |
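
To illustrate why the metric names must be spelled exactly, here is a small sketch of how `match_type: strict` behaves: a metric is kept only when its name equals an entry in the list. The set below is a hypothetical subset of the include list shown above, and `is_included` is an illustrative helper, not part of the collector.

```python
# Hypothetical subset of the strict include list shown above.
INCLUDED_METRICS = {"object_count", "vector_index_size", "requests_total"}


def is_included(metric_name, included=INCLUDED_METRICS):
    """Strict matching: keep a metric only if its name matches an entry exactly."""
    return metric_name in included


for name in ["object_count", "vector_index_size_bytes", "Object_Count"]:
    print(name, is_included(name))
```

A name that differs by a suffix or by letter case is not matched, so a typo in the list silently drops that metric.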
| 120 | + |
| 121 | +We also want to add a Resource processor to the configuration file, with the following configuration: |
| 122 | + |
| 123 | +``` yaml |
| 124 | +  resource/weaviate: |
| 125 | +    attributes: |
| 126 | +      - key: weaviate.instance.id |
| 127 | +        from_attribute: service.instance.id |
| 128 | +        action: insert |
| 129 | +``` |
| 130 | + |
| 131 | +This processor takes the `service.instance.id` resource attribute on the Weaviate metrics |
| 132 | +and copies it into a new attribute called `weaviate.instance.id`. This is done so |
| 133 | +that we can more easily distinguish Weaviate metrics from other metrics that use |
| 134 | +`service.instance.id`, which is a standard OpenTelemetry attribute used in |
| 135 | +Splunk Observability Cloud. |
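
The effect of the `insert` action can be sketched as follows. This is an illustrative model of the behavior, not the collector's implementation: the `apply_insert` helper and the sample attribute values are hypothetical. The key point is that `insert` adds the new key only when it does not already exist, and leaves the source attribute in place.

```python
def apply_insert(attributes, key, from_attribute):
    """Model of the `insert` action: copy a value only if `key` is not already set."""
    result = dict(attributes)  # leave the caller's dictionary untouched
    if key not in result and from_attribute in result:
        result[key] = result[from_attribute]
    return result


# Hypothetical resource attributes on a Weaviate metric.
metric_resource = {
    "service.instance.id": "10.0.0.12:2112",
    "k8s.pod.name": "weaviate-0",
}

print(apply_insert(metric_resource, "weaviate.instance.id", "service.instance.id"))
```

Metrics that have no `service.instance.id` pass through unchanged, and an existing `weaviate.instance.id` would not be overwritten.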
| 136 | + |
| 137 | +We'll need to add this Resource processor to the metrics pipeline as well: |
| 138 | + |
| 139 | +``` yaml |
| 140 | +  service: |
| 141 | +    pipelines: |
| 142 | +      metrics/nvidia-metrics: |
| 143 | +        exporters: |
| 144 | +          - signalfx |
| 145 | +        processors: |
| 146 | +          - memory_limiter |
| 147 | +          - filter/metrics_to_be_included |
| 148 | +          - resource/weaviate |
| 149 | +          - batch |
| 150 | +          - resourcedetection |
| 151 | +          - resource |
| 152 | +        receivers: |
| 153 | +          - receiver_creator/nvidia |
| 154 | +``` |
| 155 | + |
| 156 | +Before applying the configuration changes to the collector, take a moment to compare the |
| 157 | +contents of your modified `otel-collector-values.yaml` file with the |
| 158 | +`otel-collector-values-with-weaviate.yaml` file. |
| 159 | +Update your file as needed to ensure the contents match. Remember that indentation is |
| 160 | +significant in YAML files and must be precise. |
| 161 | + |
| 162 | +Now we can update the OpenTelemetry collector configuration by running the |
| 163 | +following Helm command: |
| 164 | + |
| 165 | +``` bash |
| 166 | +helm upgrade splunk-otel-collector \ |
| 167 | + --set="clusterName=$CLUSTER_NAME" \ |
| 168 | + --set="environment=$ENVIRONMENT_NAME" \ |
| 169 | + --set="splunkObservability.accessToken=$SPLUNK_ACCESS_TOKEN" \ |
| 170 | + --set="splunkObservability.realm=$SPLUNK_REALM" \ |
| 171 | + --set="splunkPlatform.endpoint=$SPLUNK_HEC_URL" \ |
| 172 | + --set="splunkPlatform.token=$SPLUNK_HEC_TOKEN" \ |
| 173 | + --set="splunkPlatform.index=$SPLUNK_INDEX" \ |
| 174 | + -f ./otel-collector/otel-collector-values.yaml \ |
| 175 | + -n otel \ |
| 176 | + splunk-otel-collector-chart/splunk-otel-collector |
| 177 | +``` |
| 178 | + |
| 179 | +In Splunk Observability Cloud, navigate to `Infrastructure` -> `AI Frameworks` -> `Weaviate`. |
| 180 | +Filter on the `k8s.cluster.name` of interest, and ensure the navigator is populated as in the |
| 181 | +following example: |
| 182 | + |
| 183 | + |
| 184 | + |
| 185 | +## Populate the Vector Database |
| 186 | + |
| 187 | +Now that Weaviate is up and running, and we're capturing metrics from it |
| 188 | +to ensure it's healthy, let's add some data to it that we'll use in the next part |
| 189 | +of the workshop with a custom application. |
| 190 | + |
| 191 | +The application used to do this is based on the |
| 192 | +[LangChain Playbook for NeMo Retriever Text Embedding NIM](https://docs.nvidia.com/nim/nemo-retriever/text-embedding/latest/playbook.html#generate-embeddings-with-text-embedding-nim). |
| 193 | + |
| 194 | +We'll deploy a Kubernetes Job to our OpenShift cluster to load the embeddings. |
| 195 | +A Job is used rather than a Pod to ensure that this process runs only once: |
| 196 | + |
| 197 | +``` bash |
| 198 | +oc apply -f k8s-job.yaml |
| 199 | +``` |
| 200 | + |
| 201 | +> Note: to build a Docker image for the Python application that loads the embeddings |
| 202 | +> into Weaviate, we executed the following commands: |
| 203 | +> ``` bash |
| 204 | +> cd workshop/cisco-ai-pods/load-embeddings |
| 205 | +> docker build --platform linux/amd64 -t derekmitchell399/load-embeddings:1.0 . |
| 206 | +> docker push derekmitchell399/load-embeddings:1.0 |
| 207 | +> ``` |