
Commit 5ffbfa6

Merge pull request #79 from intel/update-branch
feat: fix dependencies and update run command (#242)
2 parents 64f84ce + a3d968b commit 5ffbfa6

File tree

9 files changed: +441 -158 lines changed


usecases/ai/rag-toolkit/Dockerfile

Lines changed: 0 additions & 37 deletions
This file was deleted.

usecases/ai/rag-toolkit/README.md

Lines changed: 32 additions & 40 deletions
````diff
@@ -19,59 +19,59 @@ Please ensure that you have these ports available before running the application
 | Serving | 8012 |
 
 ## Quick Start
-### Prerequisite
-If you are using this bundle without any finetuned model, you **must** follow the steps below before running the setup.
-
 ### 1. Install operating system
 Install the latest [Ubuntu* 22.04 LTS Desktop](https://releases.ubuntu.com/jammy/). Refer to [Ubuntu Desktop installation tutorial](https://ubuntu.com/tutorials/install-ubuntu-desktop#1-overview) if needed.
 
+### 2. Install GPU driver (Optional)
+If you plan to use a GPU to perform inference, please install the GPU driver according to your GPU version.
+* Intel® Arc™ A-Series Graphics: [link](https://github.com/intel/edge-developer-kit-reference-scripts/tree/main/gpu/arc/dg2)
+* Intel® Data Center GPU Flex Series: [link](https://github.com/intel/edge-developer-kit-reference-scripts/tree/main/gpu/flex/ats)
+
+### 3. Set up Docker
+Refer to [here](https://docs.docker.com/engine/install/) to set up Docker and Docker Compose.
+
 <a name="hf-token-anchor"></a>
-### 2. Create a Hugging Face account and generate an access token. For more information, please refer to [link](https://huggingface.co/docs/hub/en/security-tokens).
+### 4. Create a Hugging Face account and generate an access token. For more information, please refer to [link](https://huggingface.co/docs/hub/en/security-tokens).
 
 <a name="hf-access-anchor"></a>
-### 3. Login to your Hugging Face account and browse to [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) and click on the `Agree and access repository` button.
+### 5. Log in to your Hugging Face account, browse to [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) and click on the `Agree and access repository` button.
 
-### 4. Run the setup script
-This step will download all the dependencies needed to run the application.
+### 6. Build docker images with your preferred inference backend
+This step will download all the necessary files, so please ensure you have a working network connection.
 ```bash
-./install.sh
+# OLLAMA GPU backend
+docker compose build --build-arg INSTALL_OPTION=2
+
+# OpenVINO CPU backend (requires a Hugging Face token to download the model)
+docker compose build --build-arg INSTALL_OPTION=1 \
+  --build-arg HF_TOKEN=<your-huggingface-token>
 ```
 
-### 5. Start all the services
-Run the script to start all the services. During the first time running, the script will download some assets required to run the services, please ensure you have internet connection.
+### 7. Start the Docker containers
 ```bash
-./run.sh
+docker compose up -d
 ```
-## Docker Setup
-### Prerequisite
-1. Docker and docker compose should be setup before running the commands below. Refer to [here](https://docs.docker.com/engine/install/) to setup docker.
-1. Install necessary GPU drivers.
-    - Refer to [here](../../../gpu/arc/dg2/README.md) to setup GPU drivers
-
-
-### 1. Setup env
-Set the INSTALL_OPTION in env file.
 
-1 = VLLM (OpenVINO - CPU)
-   - Please also provide HF_TOKEN if using this option. Refer [here](#hf-token-anchor) to create a token.
-   - Ensure the hugging face token has access to Mistral 7b instruct v0.3 model. Refer [here](#hf-access-anchor) to get access to model.
-
-2 [default] = OLLAMA (SYCL LLAMA.CPP - CPU/GPU)
+## Development
+On-host installation can be done by following the steps below:
+### 1. Run the setup script
+This step will download all the dependencies needed to run the application.
 ```bash
-cp .env.template .env
+./install.sh
 ```
 
-### 2. Build docker container
+### 2. Start all the services
+Run the script to start all the services. The first time it runs, the script downloads some assets required by the services, so please ensure you have an internet connection.
 ```bash
-docker compose build
+./run.sh
 ```
-
-### 3. Start docker container
+
+## FAQ
+### Uninstall the app
 ```bash
-docker compose up -d
+./uninstall.sh
 ```
 
-## FAQ
 ### Utilize NPU in AI PC
 The Speech to Text model inference can be offloaded to the NPU device on an AI PC. Edit the `ENCODER_DEVICE` to *NPU* in `backend/config.yaml` to run the encoder model on the NPU. *Currently only the encoder model is supported on the NPU device.*
 ```
@@ -82,11 +82,6 @@ STT:
   DECODER_DEVICE: CPU
 ```
 
-### Uninstall the app
-```bash
-./uninstall.sh
-```
-
 ### Environmental variables
 You can change the port of the backend server API to route to a specific OpenAI-compatible server, as well as the serving port.
 | Environmental variable | Default Value |
@@ -98,6 +93,3 @@ You can change the port of the backend server api to route to specific OpenAI co
 ## Limitations
 1. The current speech-to-text feature only works with localhost.
 2. RAG documents will use all the documents that are uploaded.
-
-## Troubleshooting
-1. If you have error to run the applications, you can refer to the log files in the logs folder.
````
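The FAQ entry above tells readers to flip `ENCODER_DEVICE` to *NPU* in `backend/config.yaml`. As a hedged illustration, the edit can be scripted with a one-line `sed`; this assumes the file contains a flat `ENCODER_DEVICE: <device>` key as in the FAQ snippet:

```bash
# Sketch: point the STT encoder at the NPU device.
# Assumes backend/config.yaml has an "ENCODER_DEVICE: ..." line
# exactly as shown in the FAQ snippet above.
sed -i 's/ENCODER_DEVICE: .*/ENCODER_DEVICE: NPU/' backend/config.yaml
```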

usecases/ai/rag-toolkit/backend/requirements.txt

Lines changed: 5 additions & 5 deletions
```diff
@@ -9,10 +9,10 @@ numpy==1.26.4
 openai==1.39.0
 pyyaml==6.0.1
 pypdf==5.0.0
-chromadb==0.5.5
-langchain-chroma==0.1.2
-langchain==0.3.5
-langchain-community==0.3.5
+langchain==0.3.7
+langchain-chroma==0.1.4
+langchain-community==0.3.5
+chromadb==0.5.18
 huggingface_hub>=0.23.0
 botocore==1.34.88
 cached_path==1.6.3
@@ -21,4 +21,4 @@ cached_path==1.6.3
 torch==2.4.0
 torchaudio==2.4.0
 openvino==2024.3.0
-optimum[openvino,nncf]
+optimum[openvino,nncf]
```
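Since `langchain`, `langchain-chroma`, and `chromadb` are bumped together here, a quick sanity check is to install the new pins into a throwaway virtual environment and run `pip check`; a minimal sketch, assuming it is run from `usecases/ai/rag-toolkit`:

```bash
# Sketch: confirm the bumped pins resolve into a consistent dependency set.
python3.11 -m venv /tmp/req-check && source /tmp/req-check/bin/activate
pip install -r backend/requirements.txt
pip check  # reports any broken or conflicting requirement pairs
```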
usecases/ai/rag-toolkit/docker-compose.yml

Lines changed: 52 additions & 11 deletions
```diff
@@ -1,22 +1,63 @@
-# Copyright (C) 2024 Intel Corporation
-# SPDX-License-Identifier: Apache-2.0
-
 services:
   backend:
     build:
       context: .
-      dockerfile: Dockerfile
+      dockerfile: ./docker/Dockerfile
       args:
-        - INSTALL_OPTION=${INSTALL_OPTION}
-        - HF_TOKEN=${HF_TOKEN}
-    image: rag-toolkit.deployment
-    container_name: rag-toolkit.deployment
+        - INSTALL_OPTION
+        - HF_TOKEN
+    image: edge-ai-development-assistance-tool.deployment
+    container_name: edge-ai-development-assistance-tool.deployment-backend
     privileged: true
     ipc: host
     network_mode: host
+    depends_on:
+      serving:
+        condition: service_healthy
     volumes:
-      - /home/intel/edge-ui/.next
+      - app-data:/home/intel/data
     devices:
-      - /dev:/dev:rw
+      - /dev/dri:/dev/dri:rw
       - /lib/modules:/lib/modules:rw
-    command: './run.sh'
+    working_dir: /home/intel/backend
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8011/healthcheck"]
+      interval: 30s
+      timeout: 10s
+      retries: 5
+    command: ["/bin/bash", "-c", "source /home/intel/.venv/bin/activate && uvicorn app:app --host localhost --port 8011"]
+
+  serving:
+    image: edge-ai-development-assistance-tool.deployment
+    container_name: edge-ai-development-assistance-tool.deployment-serving
+    privileged: true
+    ipc: host
+    network_mode: host
+    volumes:
+      - app-data:/home/intel/data
+    devices:
+      - /dev/dri:/dev/dri:rw
+      - /lib/modules:/lib/modules:rw
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8012/v1/models"]
+      interval: 30s
+      timeout: 10s
+      retries: 5
+    command: './run-serving.sh'
+
+  frontend:
+    image: edge-ai-development-assistance-tool.deployment
+    container_name: edge-ai-development-assistance-tool.deployment-frontend
+    network_mode: host
+    depends_on:
+      backend:
+        condition: service_healthy
+    volumes:
+      - /home/intel/edge-ui/.next
+    working_dir: /home/intel/edge-ui
+    command: npm run start
+
+volumes:
+  app-data:
```
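The compose file now chains startup through health checks: `frontend` waits on `backend`, which in turn waits on `serving`. A quick sanity check after `docker compose up -d`, reusing the endpoints and container name defined above (a sketch, not part of the commit):

```bash
# Sketch: confirm the service chain came up healthy.
docker compose ps   # all three services should show as running/healthy
docker inspect --format '{{.State.Health.Status}}' \
  edge-ai-development-assistance-tool.deployment-backend
curl -f http://localhost:8011/healthcheck   # backend health endpoint
curl -f http://localhost:8012/v1/models     # serving's OpenAI-compatible model list
```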
usecases/ai/rag-toolkit/docker/Dockerfile

Lines changed: 49 additions & 0 deletions
```diff
@@ -0,0 +1,49 @@
+FROM intel/oneapi-basekit:2024.0.1-devel-ubuntu22.04
+
+USER root
+
+RUN rm /etc/apt/sources.list.d/intel-graphics.list
+RUN wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
+RUN echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | tee /etc/apt/sources.list.d/oneAPI.list
+RUN apt update && apt upgrade -y && apt install -y sudo software-properties-common pciutils
+
+# Install python 3.11
+ARG DEBIAN_FRONTEND=noninteractive
+RUN apt update && apt install -y python3.11 python3.11-dev python3.11-venv python3-pip
+RUN python3.11 -m pip install --upgrade pip
+
+# Create new user
+RUN groupadd -g 110 render
+RUN useradd -m intel
+RUN usermod -aG sudo intel
+RUN usermod -aG render intel
+# Set user to have sudo privileges
+RUN echo "intel ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
+
+# GPU Driver Installation
+RUN wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
+    gpg --yes --dearmor --output /usr/share/keyrings/intel-graphics.gpg && \
+    echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu jammy client" | \
+    tee /etc/apt/sources.list.d/intel-gpu-jammy.list && \
+    apt update && \
+    apt-get install -y --no-install-recommends libze1 intel-level-zero-gpu intel-opencl-icd clinfo libze-dev intel-ocloc
+
+USER intel
+WORKDIR /home/intel
+ENV USER=intel
+
+# Install deps
+COPY --chown=intel . .
+RUN mv docker/run-serving.sh ./
+RUN chown -R intel:intel /home/intel
+
+ARG INSTALL_OPTION=2
+ENV INSTALL_OPTION=$INSTALL_OPTION
+ARG HF_TOKEN=""
+ENV HF_TOKEN=$HF_TOKEN
+RUN ./install.sh
+# Unset the HF_TOKEN after installation
+ENV HF_TOKEN=
+
+HEALTHCHECK --interval=30s --timeout=30s --start-period=5s --retries=3 \
+    CMD wget --no-verbose -O /dev/null --tries=1 http://localhost:8011/healthcheck || exit 1
```
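This is the Dockerfile the compose `build` section references as `./docker/Dockerfile`. For a standalone build outside compose, something like the following should work from `usecases/ai/rag-toolkit`; the tag mirrors the image name in the compose file, and the arguments mirror the `ARG`s above:

```bash
# Sketch: build the image directly, selecting the inference backend.
# INSTALL_OPTION=2 (default) -> OLLAMA; INSTALL_OPTION=1 -> OpenVINO, which needs HF_TOKEN.
docker build -f docker/Dockerfile \
  --build-arg INSTALL_OPTION=1 \
  --build-arg HF_TOKEN=<your-huggingface-token> \
  -t edge-ai-development-assistance-tool.deployment .
```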
