Skip to content

Commit b42e797

Browse files
Development environment setup (#78)
* docs: add Cursor Cloud specific instructions to AGENTS.md Co-authored-by: krakenalt <krakenalt@users.noreply.github.com> * docs: update all AGENTS.md files to reflect actual codebase structure - gpt2giga/AGENTS.md: fix protocol paths (request/transformer.py, response/processor.py, attachment/attachments.py), fix config path (models/config.py), add common/ utilities table, add RequestValidationMiddleware, add models/security.py, fix router table (separate logs_router.py, add count_tokens endpoint), fix JIT search hints - AGENTS.md: fix config.py path in Quick Find Commands - examples/AGENTS.md: add anthropic/count_tokens.py, remove nonexistent weather_agent.py and structured_output_pydantic_complex.py, fix anthropic SDK dependency note (direct dep, not integrations group) - integrations/AGENTS.md: add nginx integration to Touch Points, fix config path - .github/AGENTS.md: add codeflash.yaml workflow Co-authored-by: krakenalt <krakenalt@users.noreply.github.com> * docs: add Cursor integration guide (Russian) Add integrations/cursor/README.md with step-by-step instructions for using GigaChat models in Cursor via gpt2giga proxy, including: - Local and remote server setup - Custom model configuration in Cursor settings - API key and pass-token auth modes - Available model mapping table Co-authored-by: krakenalt <krakenalt@users.noreply.github.com> * chore: update cursor README.md * docs: add Codex and Claude Code integration guides (Russian) - integrations/codex/README.md: config.toml setup with custom model provider, trusted projects, local and remote server usage - integrations/claude-code/README.md: ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY setup, shell profile tips, wrapper script - integrations/AGENTS.md: add both new integrations to Touch Points Co-authored-by: krakenalt <krakenalt@users.noreply.github.com> * chore: update integrations README.md * refactor: make blocking I/O calls async in route handlers - logs_router.py: move synchronous file I/O (open, os.path.exists, Path.read_text) to worker threads via anyio.to_thread.run_sync; cache log_viewer.html in module-level variable after first read - api_router.py: wrap blocking tiktoken.encoding_for_model() calls with anyio.to_thread.run_sync to avoid blocking the event loop Co-authored-by: krakenalt <krakenalt@users.noreply.github.com> * docs: update CHANGELOG.md and CHANGELOG_en.md for v0.1.4 Add entries for new integration guides (Cursor, Codex, Claude Code), AGENTS.md updates, and async I/O refactoring. Co-authored-by: krakenalt <krakenalt@users.noreply.github.com> * chore: bump version, update CHANGELOG.md * chore: bump version --------- Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: krakenalt <krakenalt@users.noreply.github.com>
1 parent 2a598e8 commit b42e797

File tree

15 files changed

+610
-158
lines changed

15 files changed

+610
-158
lines changed

.github/AGENTS.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ Examples:
3333
- **Docker Hub build/push**: `.github/workflows/docker_image.yaml`
3434
- **GHCR multi-python images**: `.github/workflows/publish-ghcr.yml`
3535
- **PyPI release publishing**: `.github/workflows/publish-pypi.yml`
36+
- **Codeflash optimization**: `.github/workflows/codeflash.yaml`
3637
- **PR checklist**: `.github/PULL_REQUEST_TEMPLATE.md`
3738
- **Bug report template**: `.github/ISSUE_TEMPLATE/bug_report.md`
3839

.gitignore

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@ eggs/
1616
.eggs/
1717
.local/
1818
local/
19+
examples/
1920
.ipynb_checkpoints/
2021

2122
lib/

AGENTS.md

Lines changed: 10 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -82,7 +82,7 @@ rg -n "ERROR_MAPPING|exceptions_handler" gpt2giga/
8282
rg --files -g "test_*.py" tests/
8383

8484
# Find env var usage
85-
rg -n "GPT2GIGA_|GIGACHAT_" .env.example gpt2giga/config.py
85+
rg -n "GPT2GIGA_|GIGACHAT_" .env.example gpt2giga/models/config.py
8686

8787
# Find OpenAI ↔ GigaChat transformation logic
8888
rg -n "class (RequestTransformer|ResponseProcessor|AttachmentProcessor)" gpt2giga/
@@ -101,3 +101,12 @@ uv run ruff check . && uv run ruff format --check . && uv run pytest tests/ --co
101101
- No lint warnings
102102
- PR template checklist completed
103103
- `uv.lock` updated if dependencies changed
104+
105+
## Cursor Cloud specific instructions
106+
107+
- **Service:** Single stateless FastAPI proxy (default port `8090`). No databases or auxiliary services required.
108+
- **uv must be installed first:** The VM does not ship with `uv`. The update script installs it automatically via `curl -LsSf https://astral.sh/uv/install.sh | sh`.
109+
- **Running the server:** `uv run gpt2giga` starts on `localhost:8090`. Without valid `GIGACHAT_CREDENTIALS` the proxy still boots and accepts requests, but upstream calls return an SSL/auth error — this is expected.
110+
- **Tests are fully mocked:** `uv run pytest tests/ --cov=. --cov-fail-under=80` runs all 246 tests without any external services or credentials.
111+
- **Lint/format/test commands:** See "Setup Commands" and "Definition of Done" sections above.
112+
- **Pre-commit hooks:** `uv run pre-commit install` sets up hooks (ruff check, ruff format, gitleaks). These run automatically on `git commit`.

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,18 @@
55
Формат основан на [Keep a Changelog](https://keepachangelog.com/ru/1.0.0/),
66
и проект придерживается [Семантического версионирования](https://semver.org/lang/ru/).
77

8+
## [0.1.4.post1] - 2026-02-27
9+
### Добавлено
10+
- **Интеграция Cursor**: Добавлен `integrations/cursor/README.md` — инструкция по использованию GigaChat в Cursor через кастомную модель
11+
- **Интеграция Codex**: Добавлен `integrations/codex/README.md` — настройка OpenAI Codex через `config.toml` с кастомным провайдером gpt2giga
12+
- **Интеграция Claude Code**: Добавлен `integrations/claude-code/README.md` — настройка Claude Code через `ANTHROPIC_BASE_URL`
13+
- **Документация AGENTS.md**: Обновлены все `AGENTS.md` файлы для соответствия актуальной структуре кодовой базы
14+
15+
### Изменено
16+
- **Асинхронность**: Блокирующие операции ввода-вывода в обработчиках маршрутов перенесены в рабочие потоки через `anyio.to_thread.run_sync`:
17+
- `logs_router.py` — чтение файлов логов и HTML-шаблона
18+
- `api_router.py` — инициализация `tiktoken.encoding_for_model()`
19+
820
## [0.1.4] - 2026-02-26
921

1022
### Добавлено

CHANGELOG_en.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,19 @@ All notable changes to the gpt2giga project are documented in this file.
55
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
66
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
77

8+
## [0.1.4.post1] - 2026-02-27
9+
10+
### Added
11+
- **Cursor integration**: Added `integrations/cursor/README.md` — guide for using GigaChat in Cursor as a custom model
12+
- **Codex integration**: Added `integrations/codex/README.md` — OpenAI Codex setup via `config.toml` with custom gpt2giga provider
13+
- **Claude Code integration**: Added `integrations/claude-code/README.md` — Claude Code setup via `ANTHROPIC_BASE_URL`
14+
- **AGENTS.md documentation**: Updated all `AGENTS.md` files to match the current codebase structure
15+
16+
### Changed
17+
- **Async I/O**: Moved blocking I/O operations in route handlers to worker threads via `anyio.to_thread.run_sync`:
18+
- `logs_router.py` — log file reading and HTML template loading
19+
- `api_router.py``tiktoken.encoding_for_model()` initialization
20+
821
## [0.1.4] - 2026-02-26
922

1023
### Added

examples/AGENTS.md

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -26,7 +26,6 @@
2626
| `responses/function_calling.py` | Responses API with tools |
2727
| `responses/structured_output.py` | Responses API with structured output |
2828
| `responses/structured_output_nested.py` | Nested structured output |
29-
| `responses/structured_output_pydantic_complex.py` | Complex nested Pydantic models |
3029
| `responses/json_schema.py` | JSON schema response format |
3130
| `responses/base64_image.py` | Responses API with base64 image |
3231
| `responses/image_url.py` | Responses API with image URL |
@@ -37,13 +36,13 @@
3736
| `anthropic/system_prompt.py` | System prompt usage |
3837
| `anthropic/function_calling.py` | Tool use / function calling |
3938
| `anthropic/reasoning.py` | Extended thinking (`thinking``reasoning_effort`) |
39+
| `anthropic/count_tokens.py` | Token counting via `/messages/count_tokens` |
4040
| `anthropic/image_url.py` | Image URL input |
4141
| `anthropic/base64_image.py` | Base64 image input |
4242
| **Root examples** | **Standalone** |
4343
| `embeddings.py` | Embeddings endpoint usage |
4444
| `models.py` | Model listing and retrieval |
4545
| `openai_agents.py` | OpenAI Agents SDK integration (multi-agent triage) |
46-
| `weather_agent.py` | Agent with async weather tool |
4746

4847
## Patterns & Conventions
4948

@@ -133,6 +132,6 @@ rg -n "reasoning|thinking" examples/
133132
## Common Gotchas
134133

135134
- Examples are **not** part of test coverage (excluded in `pyproject.toml`).
136-
- The `openai_agents.py` and `weather_agent.py` require the `integrations` dependency group: `uv sync --group integrations`.
137-
- Anthropic examples use the `anthropic` SDK (not `openai`) — requires `uv sync --group integrations`.
135+
- The `openai_agents.py` example requires the `integrations` dependency group: `uv sync --group integrations`.
136+
- Anthropic examples use the `anthropic` SDK (not `openai`) — `anthropic` is a direct project dependency, no extra install needed.
138137
- Each sub-directory has its own `README.md` with usage details.

gpt2giga/AGENTS.md

Lines changed: 67 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
## Package Identity
44

5-
- **What:** FastAPI proxy server that translates OpenAI API → GigaChat API
5+
- **What:** FastAPI proxy server that translates OpenAI API and Anthropic Messages API → GigaChat API
66
- **Framework:** FastAPI + Uvicorn, async-first
77
- **Entry point:** `gpt2giga/__init__.py``run()` in `api_server.py`
88

@@ -23,7 +23,7 @@ uv run ruff check gpt2giga/ && uv run ruff format gpt2giga/
2323

2424
```
2525
Request flow:
26-
Client (OpenAI SDK) → Middlewares → Router → RequestTransformer → GigaChat SDK
26+
Client (OpenAI/Anthropic SDK) → Middlewares → Router → RequestTransformer → GigaChat SDK
2727
GigaChat SDK → ResponseProcessor → Router → Client
2828
```
2929

@@ -33,15 +33,35 @@ Request flow:
3333
|---|---|
3434
| `api_server.py` | App factory (`create_app()`), lifespan, `run()` |
3535
| `cli.py` | CLI argument parsing, config loading |
36-
| `config.py` | Pydantic Settings: `ProxyConfig`, `ProxySettings`, `GigaChatCLI` |
36+
| `models/config.py` | Pydantic Settings: `ProxyConfig`, `ProxySettings`, `GigaChatCLI` |
37+
| `models/security.py` | `SecuritySettings` — consolidated security posture view-model |
3738
| `auth.py` | API key verification (`verify_api_key` dependency) |
3839
| `logger.py` | Loguru setup, `rquid_context` context var |
39-
| `utils.py` | Error handling decorator, stream generators, schema normalization, tool conversion |
40+
| `constants.py` | Size limits, MIME types, sensitive key patterns |
41+
| `openapi_docs.py` | OpenAPI schema extras for custom endpoints |
42+
| `common/` | Shared utilities (re-exported via `common/__init__.py`; see below) |
4043
| `protocol/` | Request/response transformation layer (see below) |
4144
| `routers/` | FastAPI route handlers (see below) |
4245
| `middlewares/` | HTTP middleware chain (see below) |
4346
| `templates/` | HTML log viewer template (`templates/log_viewer.html`) |
4447

48+
### Common Utilities (`common/`)
49+
50+
All utilities are in `common/` submodules, re-exported via `common/__init__.py`:
51+
52+
| File | Key exports |
53+
|---|---|
54+
| `common/exceptions.py` | `exceptions_handler` decorator, `ERROR_MAPPING` |
55+
| `common/streaming.py` | `stream_chat_completion_generator()`, `stream_responses_generator()` |
56+
| `common/json_schema.py` | `resolve_schema_refs()`, `normalize_json_schema()` |
57+
| `common/tools.py` | `convert_tool_to_giga_functions()`, tool name mapping |
58+
| `common/gigachat_auth.py` | `pass_token_to_gigachat()`, `create_gigachat_client_for_request()` |
59+
| `common/message_utils.py` | `map_role()`, `merge_consecutive_messages()`, `collapse_user_messages()` |
60+
| `common/content_utils.py` | `ensure_json_object_str()` |
61+
| `common/app_meta.py` | `warn_sensitive_cli_args()`, `get_app_version()`, `check_port_available()` |
62+
| `common/request_json.py` | `read_request_json()` |
63+
| `common/logs_access.py` | `verify_logs_ip_allowlist()` |
64+
4565
## Patterns & Conventions
4666

4767
### App Factory Pattern
@@ -53,19 +73,19 @@ Request flow:
5373
```
5474
✅ DO: Access shared state via `request.app.state.gigachat_client`
5575
✅ DO: See `api_server.py` create_app() for middleware registration order
56-
❌ DON'T: Copy single-file script patterns from `local/*.py` into `gpt2giga/` (the `local/` folder is a scratchpad)
76+
❌ DON'T: Copy "single-file script" patterns from scratch experiments into `gpt2giga/`
5777
```
5878

5979
### Error Handling
6080

61-
- **`@exceptions_handler` decorator** in `utils.py` wraps all router handlers.
81+
- **`@exceptions_handler` decorator** in `common/exceptions.py` wraps all router handlers.
6282
- Maps `gigachat.exceptions.*` to OpenAI-style HTTP errors via `ERROR_MAPPING` dict.
6383
- Logs errors with `rquid` (request ID) for traceability.
6484

6585
```
6686
✅ DO: Decorate every router handler with `@exceptions_handler`
67-
✅ DO: See `utils.py` ERROR_MAPPING for the exception → status code mapping
68-
❌ DON'T: Add ad-hoc exception mapping in random scripts (see `local/check_ai.py` for experimentation; keep production mapping in `gpt2giga/utils.py`)
87+
✅ DO: See `common/exceptions.py` ERROR_MAPPING for the exception → status code mapping
88+
❌ DON'T: Add ad-hoc exception mapping outside `gpt2giga/common/exceptions.py`
6989
```
7090

7191
### Protocol Layer (`protocol/`)
@@ -74,38 +94,38 @@ This is the core transformation engine:
7494

7595
| File | Class | Purpose |
7696
|---|---|---|
77-
| `request_mapper.py` | `RequestTransformer` | OpenAI request → GigaChat `Chat` object |
78-
| `response_mapper.py` | `ResponseProcessor` | GigaChat response → OpenAI response format |
79-
| `attachments.py` | `AttachmentProcessor` | Image/document upload, LRU cache with TTL |
80-
| `content_utils.py` || Content parsing/extraction utilities |
81-
| `message_utils.py` || Message merging, role mapping, ordering |
97+
| `protocol/request/transformer.py` | `RequestTransformer` | OpenAI request → GigaChat `Chat` object |
98+
| `protocol/response/processor.py` | `ResponseProcessor` | GigaChat response → OpenAI response format |
99+
| `protocol/attachment/attachments.py` | `AttachmentProcessor` | Image/document upload, LRU cache with TTL |
100+
101+
Classes are re-exported via `protocol/__init__.py`.
82102

83103
**Key transformations:**
84104
- Role mapping: `developer``system`/`user`, `tool``function`
85-
- Message merging: consecutive same-role messages are collapsed
86-
- Schema normalization: resolves `$ref`/`$defs`, strips `anyOf`/`oneOf` with null
87-
- Tool conversion: OpenAI `tools` format → GigaChat `functions` format
105+
- Message merging: consecutive same-role messages are collapsed (via `common/message_utils.py`)
106+
- Schema normalization: resolves `$ref`/`$defs`, strips `anyOf`/`oneOf` with null (via `common/json_schema.py`)
107+
- Tool conversion: OpenAI `tools` format → GigaChat `functions` format (via `common/tools.py`)
88108

89109
```
90-
✅ DO: Follow `request_mapper.py` prepare_chat_completion() for new request transformations
91-
✅ DO: Follow `response_mapper.py` process_response() for new response transformations
92-
✅ DO: Use normalize_json_schema() from utils.py for any JSON schema handling
93-
❌ DON'T: Duplicate protocol logic in routers; keep transformations in `gpt2giga/protocol/*` (contrast with experimental code in `local/*.py`)
110+
✅ DO: Follow `protocol/request/transformer.py` prepare_chat_completion() for new request transformations
111+
✅ DO: Follow `protocol/response/processor.py` process_response() for new response transformations
112+
✅ DO: Use normalize_json_schema() from common/json_schema.py for any JSON schema handling
113+
❌ DON'T: Duplicate protocol logic in routers; keep transformations in `gpt2giga/protocol/`
94114
```
95115

96116
### Routers (`routers/`)
97117

98118
| File | Endpoints |
99119
|---|---|
100120
| `api_router.py` | `GET /models`, `GET /models/{model}`, `POST /chat/completions`, `POST /embeddings`, `POST /responses` |
101-
| `anthropic_router.py` | `POST /messages` — Anthropic Messages API compatibility layer |
102-
| `system_router.py` | `GET /health`, `GET/POST /ping`, `GET /logs`, `GET /logs/stream` |
103-
| `system_router.py` (logs_router) | `GET /logs/html`HTML log viewer page |
121+
| `anthropic_router.py` | `POST /messages`, `POST /messages/count_tokens` — Anthropic Messages API compatibility layer |
122+
| `system_router.py` | `GET /health`, `GET/POST /ping` |
123+
| `logs_router.py` | `GET /logs/{last_n_lines}`, `GET /logs/stream`, `GET /logs/html` — log viewing and streaming |
104124

105125
- Routes are registered twice: at root `/` and under `/v1/` prefix.
106-
- System routes (`/health`, `/ping`, `/logs*`) are registered only once at root.
126+
- System routes (`/health`, `/ping`) and log routes (`/logs*`) are registered only once at root.
107127
- All API routes use `@exceptions_handler` decorator.
108-
- Streaming uses `StreamingResponse` with async generators from `utils.py`.
128+
- Streaming uses `StreamingResponse` with async generators from `common/streaming.py`.
109129
- Anthropic router converts Anthropic Messages format → OpenAI → GigaChat → Anthropic response.
110130

111131
```
@@ -119,10 +139,11 @@ This is the core transformation engine:
119139

120140
Applied in order (last added = first executed):
121141

122-
1. **`PassTokenMiddleware`** — passes auth token from request to GigaChat (conditional)
123-
2. **`RquidMiddleware`** — sets unique request ID in `contextvars`
124-
3. **`PathNormalizationMiddleware`** — normalizes `/api/v1/...``/v1/...`
125-
4. **`CORSMiddleware`** — allows all origins
142+
1. **`PassTokenMiddleware`** (`pass_token.py`) — passes auth token from request to GigaChat (conditional, only if `pass_token=True`)
143+
2. **`RequestValidationMiddleware`** (`request_validation.py`) — enforces request body size limits
144+
3. **`RquidMiddleware`** (`rquid_context.py`) — sets unique request ID in `contextvars`
145+
4. **`PathNormalizationMiddleware`** (`path_normalizer.py`) — normalizes `/api/v1/...``/v1/...`
146+
5. **`CORSMiddleware`** — allows configurable origins/methods/headers
126147

127148
```
128149
✅ DO: Inherit from BaseHTTPMiddleware (Starlette) for new middleware
@@ -132,20 +153,21 @@ Applied in order (last added = first executed):
132153

133154
### Streaming
134155

135-
- `stream_chat_completion_generator()` — SSE stream for `/chat/completions`
136-
- `stream_responses_generator()` — SSE stream for `/responses`
156+
- `stream_chat_completion_generator()` in `common/streaming.py` — SSE stream for `/chat/completions`
157+
- `stream_responses_generator()` in `common/streaming.py` — SSE stream for `/responses`
137158
- Both are async generators yielding `data: {json}\n\n` strings.
138159
- Handle client disconnections gracefully.
139160

140161
### Configuration
141162

142-
- `ProxyConfig` (root) nests `ProxySettings` + `GigaChatCLI`.
163+
- `ProxyConfig` (root) nests `ProxySettings` + `GigaChatCLI` — defined in `models/config.py`.
164+
- `SecuritySettings` in `models/security.py` — read-only security posture view.
143165
- Env var prefixes: `GPT2GIGA_` and `GIGACHAT_`.
144166
- CLI args via `pydantic-settings` with `cli_parse_args=True`.
145167
- See `.env.example` at repo root for all available settings.
146168

147169
```
148-
✅ DO: Add new proxy settings to ProxySettings in config.py
170+
✅ DO: Add new proxy settings to ProxySettings in models/config.py
149171
✅ DO: Use Field(default=..., description="...") for every setting
150172
```
151173

@@ -154,11 +176,13 @@ Applied in order (last added = first executed):
154176
- **App wiring + middleware order**: `gpt2giga/api_server.py`
155177
- **OpenAI-compatible endpoints**: `gpt2giga/routers/api_router.py`
156178
- **Anthropic Messages API**: `gpt2giga/routers/anthropic_router.py`
157-
- **System/log endpoints + HTML viewer**: `gpt2giga/routers/system_router.py`, `gpt2giga/templates/log_viewer.html`
158-
- **Protocol mapping**: `gpt2giga/protocol/request_mapper.py`, `gpt2giga/protocol/response_mapper.py`
159-
- **Attachments caching + upload**: `gpt2giga/protocol/attachments.py`
179+
- **System endpoints (health, ping)**: `gpt2giga/routers/system_router.py`
180+
- **Log endpoints + HTML viewer**: `gpt2giga/routers/logs_router.py`, `gpt2giga/templates/log_viewer.html`
181+
- **Protocol mapping**: `gpt2giga/protocol/request/transformer.py`, `gpt2giga/protocol/response/processor.py`
182+
- **Attachments caching + upload**: `gpt2giga/protocol/attachment/attachments.py`
160183
- **Auth + API key dependency**: `gpt2giga/auth.py`
161-
- **Settings/env parsing**: `gpt2giga/config.py`, `.env.example`
184+
- **Settings/env parsing**: `gpt2giga/models/config.py`, `.env.example`
185+
- **Security posture**: `gpt2giga/models/security.py`
162186

163187
## JIT Search Hints
164188

@@ -173,16 +197,19 @@ rg -n "class.*Middleware" gpt2giga/middlewares/
173197
rg -n "def (prepare_|process_|transform_)" gpt2giga/protocol/
174198

175199
# Find streaming generators
176-
rg -n "async def stream_" gpt2giga/utils.py
200+
rg -n "async def stream_" gpt2giga/common/streaming.py
177201

178202
# Find schema normalization logic
179-
rg -n "def (normalize_json_schema|resolve_schema_refs)" gpt2giga/utils.py
203+
rg -n "def (normalize_json_schema|resolve_schema_refs)" gpt2giga/common/json_schema.py
180204

181205
# Find all Pydantic settings
182-
rg -n "class.*Settings|class.*Config" gpt2giga/config.py
206+
rg -n "class.*Settings|class.*Config" gpt2giga/models/config.py
183207

184208
# Find Anthropic-specific logic
185209
rg -n "anthropic|messages" gpt2giga/routers/anthropic_router.py
210+
211+
# Find error mapping / exception handling
212+
rg -n "ERROR_MAPPING|exceptions_handler" gpt2giga/common/exceptions.py
186213
```
187214

188215
## Common Gotchas

gpt2giga/routers/api_router.py

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
1+
import functools
12
import time
23

4+
import anyio
35
import tiktoken
46
from fastapi import APIRouter
57
from fastapi import Request
@@ -90,14 +92,18 @@ async def embeddings(request: Request):
9092
if isinstance(inputs, list):
9193
new_inputs = []
9294
if len(inputs) > 0 and isinstance(inputs[0], int): # List[int]
93-
encoder = tiktoken.encoding_for_model(gpt_model)
95+
encoder = await anyio.to_thread.run_sync(
96+
functools.partial(tiktoken.encoding_for_model, gpt_model)
97+
)
9498
new_inputs = encoder.decode(inputs)
9599
else:
96100
encoder = None
97101
for row in inputs:
98102
if isinstance(row, list): # List[List[int]]
99103
if encoder is None:
100-
encoder = tiktoken.encoding_for_model(gpt_model)
104+
encoder = await anyio.to_thread.run_sync(
105+
functools.partial(tiktoken.encoding_for_model, gpt_model)
106+
)
101107
new_inputs.append(encoder.decode(row))
102108
else:
103109
new_inputs.append(row)

0 commit comments

Comments
 (0)