docker #11
Merged
4 commits
8137cd7 fix(serve): reduce server shutdown timeout from 3 seconds to 1 second… (david20571015)
77714cd fix(gen_protos): update command to use python3 and add main guard for… (david20571015)
da75100 feat(docker): add Docker files (david20571015)
f4b96a0 docs(docker): enhance documentation with overview, installation, usag… (david20571015)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,58 @@ | ||
| # Git | ||
| .git/ | ||
| .gitignore | ||
| .gitmodules | ||
|
|
||
| # Python | ||
| __pycache__/ | ||
| *.py[cod] | ||
| *$py.class | ||
| *.so | ||
| .Python | ||
| .venv/ | ||
| venv/ | ||
| *.egg-info/ | ||
| .installed.cfg | ||
| *.egg | ||
|
|
||
| # Cache | ||
| .mypy_cache/ | ||
| .pytest_cache/ | ||
| .coverage | ||
| htmlcov/ | ||
| .cache/ | ||
|
|
||
| # Build artifacts | ||
| build/ | ||
| dist/ | ||
| *.manifest | ||
| *.spec | ||
|
|
||
| # Logs | ||
| *.log | ||
| logs/ | ||
|
|
||
| # Local development files | ||
| .pytest_cache | ||
| .coverage | ||
| *.swp | ||
| .DS_Store | ||
|
|
||
| # IDE | ||
| .idea/ | ||
| .vscode/ | ||
| *.sublime-project | ||
| *.sublime-workspace | ||
|
|
||
| # Project specific | ||
| .python-version | ||
| .pre-commit-config.yaml | ||
| .github/ | ||
|
|
||
| # Environment | ||
| .env | ||
| .env.* | ||
| env/ | ||
|
|
||
| # Generated files | ||
| llm_backend/protos/ |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| FROM ghcr.io/astral-sh/uv:bookworm-slim AS builder | ||
|
|
||
| ENV UV_COMPILE_BYTECODE=1 \ | ||
| UV_LINK_MODE=copy \ | ||
| UV_PYTHON_INSTALL_DIR=/python \ | ||
| UV_PYTHON_PREFERENCE=only-managed | ||
|
|
||
| RUN uv python install 3.12 | ||
|
|
||
| WORKDIR /app | ||
|
|
||
| RUN --mount=type=cache,target=/root/.cache/uv \ | ||
| --mount=type=bind,source=uv.lock,target=uv.lock \ | ||
| --mount=type=bind,source=pyproject.toml,target=pyproject.toml \ | ||
| uv sync --frozen --no-dev --no-install-project | ||
|
|
||
| COPY . /app | ||
|
|
||
| RUN --mount=type=cache,target=/root/.cache/uv \ | ||
| uv sync --frozen --no-dev | ||
|
|
||
| FROM debian:bookworm-slim | ||
|
|
||
| COPY --from=builder --chown=python:python /python /python | ||
| COPY --from=builder --chown=app:app /app /app | ||
|
|
||
| ENV PATH="/app/.venv/bin:$PATH" | ||
|
|
||
| WORKDIR /app | ||
|
|
||
| # Generate the protos | ||
| RUN ["python3", "scripts/gen_protos.py"] | ||
|
|
||
| # Run the application | ||
| ENTRYPOINT ["python3", "scripts/serve.py", "--config", "configs/config.toml"] | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,32 +1,70 @@ | ||
| # SYNC Server LLM | ||
|
|
||
| ## Overview | ||
|
|
||
| SYNC Server LLM is a gRPC-based server that performs document retrieval and summarization. It leverages Qdrant for vector search and OpenAI models to generate summaries of retrieved content based on user-provided keywords. | ||
|
|
||
| ## Installation | ||
|
|
||
| ```shell | ||
| git clone --recurse-submodules https://github.com/NCTU-SYNC/sync-server-llm.git | ||
| cd sync-server-llm | ||
|
|
||
| uv sync --no-dev --frozen | ||
|
|
||
| uv run gen-protos | ||
| ``` | ||
|
|
||
| ## Usage | ||
|
|
||
| Please configure the `configs/config.toml` file. | ||
| The following environment variables are required (`export` them or place them in a `.env` file): | ||
| This section explains how to run the SYNC Server LLM using different methods. | ||
|
|
||
| - `OPENAI_API_KEY`: Your ChatGPT API key. | ||
| - `QDRANT_HOST`: The Qdrant host address. | ||
| - `QDRANT_PORT`: The Qdrant host port. | ||
| - `QDRANT_COLLECTION`: The Qdrant collection name. | ||
| 1. Configure the server by editing `configs/config.toml` | ||
|
|
||
| ```shell | ||
| python3 scripts/serve.py --config configs/config.toml | ||
| ``` | ||
| 2. Set up the required environment variables by adding them to a `.env` file | ||
|
|
||
| | Variable | Description | | ||
| | ------------------- | ----------------------------- | | ||
| | `OPENAI_API_KEY` | Your ChatGPT API key | | ||
| | `QDRANT_HOST` | The Qdrant host address | | ||
| | `QDRANT_PORT` | The Qdrant host REST API port | | ||
| | `QDRANT_COLLECTION` | The Qdrant collection name | | ||
|
|
||
| 3. Start the server: | ||
|
|
||
| - To run the server locally: | ||
|
|
||
| ```shell | ||
| uv run scripts/serve.py --config configs/config.toml | ||
| ``` | ||
|
|
||
| - To run the server using Docker: | ||
|
|
||
| Build the Docker image: | ||
|
|
||
| ```shell | ||
| docker build -t sync/backend-llm . | ||
| ``` | ||
|
|
||
| Run the container: | ||
|
|
||
| ```shell | ||
| docker run -p 50051:50051 \ | ||
| --env-file .env \ | ||
| -v $(pwd)/path/to/config.toml:/app/configs/config.toml \ | ||
| -v $(pwd)/path/to/hf_cache:/tmp/llama_index \ | ||
| sync/backend-llm | ||
| ``` | ||
|
|
||
| > 1. If you are using Windows, you can add `--gpus=all` to the `docker run` command. Ensure that your Docker installation supports GPU usage. | ||
| > 2. It is strongly recommended to mount the `hf_cache` directory to a persistent volume to avoid re-downloading the Hugging Face models every time the container is started. | ||
|
|
||
| ## Client Example | ||
|
|
||
| You can refer to `scripts/client.py` for an example implementation of a client: | ||
|
|
||
| ```shell | ||
| python3 scripts/client.py | ||
| uv run scripts/client.py | ||
| ``` | ||
|
|
||
| ## Features | ||
|
|
||
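
The README table above lists four environment variables that the server needs. As a minimal sketch of how a startup script might check for them before connecting to anything, the snippet below reports which ones are unset or empty. The helper name `missing_env_vars` and the `REQUIRED_VARS` tuple are assumptions made for illustration; they are not part of this PR or of `scripts/serve.py`.

```python
import os
from collections.abc import Mapping

# Variable names copied from the README table in this PR.
REQUIRED_VARS = ("OPENAI_API_KEY", "QDRANT_HOST", "QDRANT_PORT", "QDRANT_COLLECTION")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty in `env`.

    Hypothetical helper: the real server may validate its settings differently.
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an incomplete environment (values are placeholders).
example_env = {"OPENAI_API_KEY": "sk-example", "QDRANT_HOST": "localhost"}
print(missing_env_vars(example_env))  # ['QDRANT_PORT', 'QDRANT_COLLECTION']
```

Called with no argument, the function inspects the real process environment, which is what a container started with `--env-file .env` would see.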
Consider adding error handling to ensure the proto generation succeeds. If `gen_protos.py` fails, the Docker build should fail as well. This can be achieved by adding `|| exit 1` to the command.
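
Some context on the review comment above: Docker already aborts a build when a `RUN` command exits with a non-zero status, and the exec (JSON array) form used in the Dockerfile does not go through a shell, so a shell operator such as `||` would not be interpreted there. What matters is that the script itself exits non-zero when generation fails. The sketch below shows one way a script could do that. It is not the actual `scripts/gen_protos.py`, whose contents are not shown in this PR; `run_step` is a hypothetical helper, and the commands it runs here are placeholders.

```python
import subprocess
import sys


def run_step(command: list[str]) -> int:
    """Run one build step and return its exit code (0 means success).

    Hypothetical helper for illustration only.
    """
    try:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(command, check=True)
    except subprocess.CalledProcessError as error:
        print(f"step failed with exit code {error.returncode}", file=sys.stderr)
        return error.returncode
    except FileNotFoundError:
        # The executable itself could not be found.
        print(f"command not found: {command[0]}", file=sys.stderr)
        return 127
    return 0


# Placeholder commands standing in for the real proto generation call.
print(run_step([sys.executable, "-c", "pass"]))                 # 0
print(run_step([sys.executable, "-c", "raise SystemExit(3)"]))  # 3
```

In a script, passing this return value to `sys.exit(...)` under an `if __name__ == "__main__":` guard would hand the failure code to Docker, which then stops the build at that `RUN` step.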