
Commit e982b30

Merge pull request #336 from DefangLabs/linda-nounly-go
Nounly Sample

22 files changed: +7320 −0 lines

.github/workflows/deploy.yaml

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
name: Deploy

on:
  push:
    branches:
      - main

jobs:
  deploy:
    environment: playground
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write

    steps:
      - name: Checkout Repo
        uses: actions/checkout@v4

      - name: Deploy
        uses: DefangLabs/[email protected]

README.md

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
# Mistral & vLLM

[![1-click-deploy](https://defang.io/deploy-with-defang.svg)](https://portal.defang.dev/redirect?url=https%3A%2F%2Fgithub.com%2Fnew%3Ftemplate_name%3Dsample-vllm-template%26template_owner%3DDefangSamples)

This guide demonstrates how to deploy Mistral using vLLM. You'll need a Hugging Face token to begin.

## Prerequisites

- Hugging Face token

## Steps

1. **Set the Hugging Face Token**

   First, set the Hugging Face token using the `defang config` command.

   ```bash
   defang config set --name HF_TOKEN
   ```

2. **Launch with Defang Compose**

   Run the following command to start the services:

   ```bash
   defang compose up
   ```

The provided `compose.yaml` file includes the Mistral service. It's configured to run on an AWS instance with GPU support. The file also includes a UI service built with Next.js, utilizing Vercel's AI SDK.

> **OpenAI SDK:** We use the OpenAI SDK, but set the `baseURL` to our Mistral endpoint.

> **Note:** The API route does not use a system prompt, because the Mistral model we're deploying does not currently support one. To work around this, we inject a couple of messages at the front of the conversation to provide that context (see the `ui/src/app/api/chat/route.ts` file). Other than that, the integration with the OpenAI SDK is structured as you'd expect.

> **Changing the content:** The content for the bot is set in the `ui/src/app/api/chat/route.ts` file. You can edit the prompt there to change the behaviour. You'll notice that it also pulls from `ui/src/app/docs.md` to provide content for the bot to use. You can edit this file to change its "knowledge".

## Configuration

- The Docker Compose file is ready to deploy Mistral and the UI service.
- The UI uses Next.js and Vercel's AI SDK for seamless integration.

By following these steps, you should be able to deploy Mistral along with a custom UI on AWS, using GPU capabilities for enhanced performance.

---

Title: Mistral & vLLM

Short Description: Deploy Mistral with a custom UI using vLLM.

Tags: Mistral, vLLM, AI, Nextjs, GPU, Node.js, TypeScript, JavaScript

Languages: nodejs
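The chat route the README refers to (`ui/src/app/api/chat/route.ts`) is part of this commit but not shown in this excerpt. As a rough idea of the pattern it describes, here is a minimal sketch that points the OpenAI SDK at the vLLM endpoint and injects context messages in place of a system prompt, using the (pre-4.x) Vercel AI SDK streaming helpers. The model name, environment variable, and injected messages below are assumptions for illustration, not the file's actual contents.

```ts
// Hypothetical sketch in the spirit of ui/src/app/api/chat/route.ts; not the repo's code.
import OpenAI from "openai";
import { OpenAIStream, StreamingTextResponse } from "ai";

const openai = new OpenAI({
  baseURL: process.env.OPENAI_BASE_URL, // e.g. http://mistral:8000/v1/ from compose.yaml
  apiKey: "unused", // vLLM doesn't check the key, but the SDK requires a value
});

// Mistral-7B-Instruct has no system role, so context is injected as ordinary
// user/assistant turns at the front of the conversation (per the README note).
const contextMessages = [
  { role: "user" as const, content: "You are a helpful assistant. Use this reference material:\n..." },
  { role: "assistant" as const, content: "Understood. I'll answer using that reference." },
];

export async function POST(req: Request) {
  const { messages } = await req.json();

  const response = await openai.chat.completions.create({
    model: "TheBloke/Mistral-7B-Instruct-v0.2-AWQ", // matches the compose.yaml command
    stream: true,
    messages: [...contextMessages, ...messages],
  });

  return new StreamingTextResponse(OpenAIStream(response));
}
```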

compose.yaml

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
services:
  mistral:
    restart: unless-stopped
    image: ghcr.io/mistralai/mistral-src/vllm:latest
    ports:
      - mode: host
        target: 8000
    command: ["--host", "0.0.0.0", "--model", "TheBloke/Mistral-7B-Instruct-v0.2-AWQ", "--quantization", "awq", "--dtype", "auto", "--tensor-parallel-size", "1", "--gpu-memory-utilization", ".95", "--max-model-len", "8000"]
    deploy:
      resources:
        reservations:
          cpus: '2.0'
          memory: 8192M
          devices:
            - capabilities: ["gpu"]
              count: 1
    healthcheck:
      test: ["CMD", "python3", "-c", "import sys, urllib.request;urllib.request.urlopen(sys.argv[1]).read()", "http://localhost:8000/health"]
      interval: 1m
    environment:
      - HF_TOKEN
  ui:
    restart: unless-stopped
    build:
      context: ui
      dockerfile: Dockerfile
    ports:
      - mode: ingress
        target: 3000
        published: 3000
    deploy:
      resources:
        reservations:
          memory: 256M
    healthcheck:
      test: ["CMD", "wget", "--spider", "http://localhost:3000"]
      interval: 10s
      timeout: 2s
      retries: 10
    environment:
      - OPENAI_BASE_URL=http://mistral:8000/v1/
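Not part of the commit, but a quick way to sanity-check the vLLM service this compose file brings up is to hit the same `/health` endpoint the healthcheck polls and the OpenAI-compatible chat completions endpoint the UI uses. The sketch below assumes Node 18+ (global `fetch`) and the in-network `http://mistral:8000/v1/` base URL from the `ui` service's environment.

```ts
// Hypothetical smoke test for the vLLM service defined above; not part of this commit.
const base = process.env.OPENAI_BASE_URL ?? "http://mistral:8000/v1/";

async function main() {
  // Same endpoint the compose healthcheck polls.
  const health = await fetch(new URL("/health", base));
  console.log("health:", health.status);

  // vLLM exposes an OpenAI-compatible chat completions endpoint under /v1/.
  const chat = await fetch(new URL("chat/completions", base), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "TheBloke/Mistral-7B-Instruct-v0.2-AWQ",
      messages: [{ role: "user", content: "Say hello in one sentence." }],
    }),
  });
  const data = await chat.json();
  console.log(data.choices?.[0]?.message?.content);
}

main().catch(console.error);
```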

ui/.dockerignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
node_modules
.next

ui/.env

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
OPENAI_BASE_URL=https://mistral.myusername.defang.app/v1/

ui/.eslintrc.json

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
{
  "extends": "next/core-web-vitals"
}

ui/.gitignore

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
/node_modules
/.pnp
.pnp.js
.yarn/install-state.gz

# testing
/coverage

# next.js
/.next/
/out/

# production
/build

# misc
.DS_Store
*.pem

# debug
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# local env files
.env*.local

# vercel
.vercel

# typescript
*.tsbuildinfo
next-env.d.ts

ui/Dockerfile

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
FROM node:20-alpine

WORKDIR /app

COPY package.json package-lock.json ./

RUN npm ci

COPY . .

RUN npm run build

EXPOSE 3000

CMD [ "npm", "run", "start" ]

ui/next.config.mjs

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
/** @type {import('next').NextConfig} */
const nextConfig = {};

export default nextConfig;
