
Commit e982b30

Merge pull request #336 from DefangLabs/linda-nounly-go
Nounly Sample

22 files changed: +7320 −0 lines

.github/workflows/deploy.yaml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
name: Deploy

on:
  push:
    branches:
      - main

jobs:
  deploy:
    environment: playground
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write

    steps:
      - name: Checkout Repo
        uses: actions/checkout@v4

      - name: Deploy
        uses: DefangLabs/[email protected]

README.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
# Mistral & vLLM

[![1-click-deploy](https://defang.io/deploy-with-defang.svg)](https://portal.defang.dev/redirect?url=https%3A%2F%2Fgithub.com%2Fnew%3Ftemplate_name%3Dsample-vllm-template%26template_owner%3DDefangSamples)

This guide demonstrates how to deploy Mistral using vLLM. You'll need a Hugging Face token to begin.

## Prerequisites

- Hugging Face token

## Steps

1. **Set the Hugging Face Token**

   First, set the Hugging Face token using the `defang config` command.

   ```bash
   defang config set --name HF_TOKEN
   ```

2. **Launch with Defang Compose**

   Run the following command to start the services:

   ```bash
   defang compose up
   ```

The provided `compose.yaml` file includes the Mistral service. It's configured to run on an AWS instance with GPU support. The file also includes a UI service built with Next.js, utilizing Vercel's AI SDK.

> **OpenAI SDK:** We use the OpenAI SDK, but set the `baseURL` to our Mistral endpoint.

> **Note:** The API route does not use a system prompt, because the Mistral model we're deploying does not currently support one. To work around this, we inject a couple of messages at the front of the conversation to provide that context (see the `ui/src/app/api/chat/route.ts` file). Other than that, the integration with the OpenAI SDK is structured as you'd expect.

> **Changing the content:** The content for the bot is set in the `ui/src/app/api/chat/route.ts` file. You can edit the prompt there to change the behaviour. You'll notice that it also pulls from `ui/src/app/docs.md` to provide content for the bot to use. You can edit this file to change its "knowledge".

## Configuration

- The Docker Compose file is ready to deploy Mistral and the UI service.
- The UI uses Next.js and Vercel's AI SDK for seamless integration.

By following these steps, you should be able to deploy Mistral along with a custom UI on AWS, using GPU capabilities for enhanced performance.

---

Title: Mistral & vLLM

Short Description: Deploy Mistral with a custom UI using vLLM.

Tags: Mistral, vLLM, AI, Nextjs, GPU, Node.js, TypeScript, JavaScript

Languages: nodejs
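The chat route the README refers to (`ui/src/app/api/chat/route.ts`) is part of this commit but not shown in this excerpt. As a rough idea of the pattern it describes, here is a minimal sketch that points the OpenAI SDK at the vLLM endpoint and injects context messages in place of a system prompt, using the (pre-4.x) Vercel AI SDK streaming helpers. The model name, environment variable, and injected messages below are assumptions for illustration, not the file's actual contents.

```ts
// Hypothetical sketch in the spirit of ui/src/app/api/chat/route.ts; not the repo's code.
import OpenAI from "openai";
import { OpenAIStream, StreamingTextResponse } from "ai";

const openai = new OpenAI({
  baseURL: process.env.OPENAI_BASE_URL, // e.g. http://mistral:8000/v1/ from compose.yaml
  apiKey: "unused", // vLLM doesn't check the key, but the SDK requires a value
});

// Mistral-7B-Instruct has no system role, so context is injected as ordinary
// user/assistant turns at the front of the conversation (per the README note).
const contextMessages = [
  { role: "user" as const, content: "You are a helpful assistant. Use this reference material:\n..." },
  { role: "assistant" as const, content: "Understood. I'll answer using that reference." },
];

export async function POST(req: Request) {
  const { messages } = await req.json();

  const response = await openai.chat.completions.create({
    model: "TheBloke/Mistral-7B-Instruct-v0.2-AWQ", // matches the compose.yaml command
    stream: true,
    messages: [...contextMessages, ...messages],
  });

  return new StreamingTextResponse(OpenAIStream(response));
}
```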

compose.yaml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
services:
  mistral:
    restart: unless-stopped
    image: ghcr.io/mistralai/mistral-src/vllm:latest
    ports:
      - mode: host
        target: 8000
    command: ["--host", "0.0.0.0", "--model", "TheBloke/Mistral-7B-Instruct-v0.2-AWQ", "--quantization", "awq", "--dtype", "auto", "--tensor-parallel-size", "1", "--gpu-memory-utilization", ".95", "--max-model-len", "8000"]
    deploy:
      resources:
        reservations:
          cpus: '2.0'
          memory: 8192M
          devices:
            - capabilities: ["gpu"]
              count: 1
    healthcheck:
      test: ["CMD", "python3", "-c", "import sys, urllib.request;urllib.request.urlopen(sys.argv[1]).read()", "http://localhost:8000/health"]
      interval: 1m
    environment:
      - HF_TOKEN
  ui:
    restart: unless-stopped
    build:
      context: ui
      dockerfile: Dockerfile
    ports:
      - mode: ingress
        target: 3000
        published: 3000
    deploy:
      resources:
        reservations:
          memory: 256M
    healthcheck:
      test: ["CMD", "wget", "--spider", "http://localhost:3000"]
      interval: 10s
      timeout: 2s
      retries: 10
    environment:
      - OPENAI_BASE_URL=http://mistral:8000/v1/
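Not part of the commit, but a quick way to sanity-check the vLLM service this compose file brings up is to hit the same `/health` endpoint the healthcheck polls and the OpenAI-compatible chat completions endpoint the UI uses. The sketch below assumes Node 18+ (global `fetch`) and the in-network `http://mistral:8000/v1/` base URL from the `ui` service's environment.

```ts
// Hypothetical smoke test for the vLLM service defined above; not part of this commit.
const base = process.env.OPENAI_BASE_URL ?? "http://mistral:8000/v1/";

async function main() {
  // Same endpoint the compose healthcheck polls.
  const health = await fetch(new URL("/health", base));
  console.log("health:", health.status);

  // vLLM exposes an OpenAI-compatible chat completions endpoint under /v1/.
  const chat = await fetch(new URL("chat/completions", base), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
      messages: [{ role: "user", content: "Say hello in one sentence." }],
    }),
  });
  const data = await chat.json();
  console.log(data.choices?.[0]?.message?.content);
}

main().catch(console.error);
```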

ui/.dockerignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
node_modules
.next

ui/.env

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
OPENAI_BASE_URL=https://mistral.myusername.defang.app/v1/

ui/.eslintrc.json

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
{
  "extends": "next/core-web-vitals"
}

ui/.gitignore

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js
.yarn/install-state.gz

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# local env files
.env*.local

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts

ui/Dockerfile

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
FROM node:20-alpine

WORKDIR /app

COPY package.json package-lock.json ./

RUN npm ci

COPY . .

RUN npm run build

EXPOSE 3000

CMD [ "npm", "run", "start" ]

ui/next.config.mjs

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
/** @type {import('next').NextConfig} */
const nextConfig = {};

export default nextConfig;
