ai-forever
diff --git a/.github/AGENTS.md b/.github/AGENTS.md
Lines changed: 1 addition & 0 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/AGENTS.md b/AGENTS.md
Lines changed: 10 additions & 1 deletion
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 12 additions & 0 deletions
diff --git a/CHANGELOG_en.md b/CHANGELOG_en.md
Lines changed: 13 additions & 0 deletions
diff --git a/examples/AGENTS.md b/examples/AGENTS.md
Lines changed: 3 additions & 4 deletions
diff --git a/gpt2giga/AGENTS.md b/gpt2giga/AGENTS.md
Lines changed: 67 additions & 40 deletions
diff --git a/gpt2giga/routers/api_router.py b/gpt2giga/routers/api_router.py
Lines changed: 8 additions & 2 deletions
@@ -33,6 +33,7 @@ Examples:
 - **Docker Hub build/push**: `.github/workflows/docker_image.yaml`
 - **GHCR multi-python images**: `.github/workflows/publish-ghcr.yml`
 - **PyPI release publishing**: `.github/workflows/publish-pypi.yml`
+- **Codeflash optimization**: `.github/workflows/codeflash.yaml`
 - **PR checklist**: `.github/PULL_REQUEST_TEMPLATE.md`
 - **Bug report template**: `.github/ISSUE_TEMPLATE/bug_report.md`
 
 
@@ -16,6 +16,7 @@ eggs/
 .eggs/
 .local/
 local/
+examples/
 .ipynb_checkpoints/
 
 lib/
 
@@ -82,7 +82,7 @@ rg -n "ERROR_MAPPING|exceptions_handler" gpt2giga/
 rg --files -g "test_*.py" tests/
 
 # Find env var usage
-rg -n "GPT2GIGA_|GIGACHAT_" .env.example gpt2giga/config.py
+rg -n "GPT2GIGA_|GIGACHAT_" .env.example gpt2giga/models/config.py
 
 # Find OpenAI ↔ GigaChat transformation logic
 rg -n "class (RequestTransformer|ResponseProcessor|AttachmentProcessor)" gpt2giga/
@@ -101,3 +101,12 @@ uv run ruff check . && uv run ruff format --check . && uv run pytest tests/ --co
 - No lint warnings
 - PR template checklist completed
 - `uv.lock` updated if dependencies changed
+
+## Cursor Cloud specific instructions
+
+- **Service:** Single stateless FastAPI proxy (default port `8090`). No databases or auxiliary services required.
+- **uv must be installed first:** The VM does not ship with `uv`. The update script installs it automatically via `curl -LsSf https://astral.sh/uv/install.sh | sh`.
+- **Running the server:** `uv run gpt2giga` starts on `localhost:8090`. Without valid `GIGACHAT_CREDENTIALS` the proxy still boots and accepts requests, but upstream calls return an SSL/auth error — this is expected.
+- **Tests are fully mocked:** `uv run pytest tests/ --cov=. --cov-fail-under=80` runs all 246 tests without any external services or credentials.
+- **Lint/format/test commands:** See "Setup Commands" and "Definition of Done" sections above.
+- **Pre-commit hooks:** `uv run pre-commit install` sets up hooks (ruff check, ruff format, gitleaks). These run automatically on `git commit`.
@@ -5,6 +5,18 @@
 The format is based on [Keep a Changelog](https://keepachangelog.com/ru/1.0.0/),
 and the project adheres to [Semantic Versioning](https://semver.org/lang/ru/).
 
+## [0.1.4.post1] - 2026-02-27
+### Added
+- **Cursor integration**: Added `integrations/cursor/README.md`, a guide to using GigaChat in Cursor via a custom model
+- **Codex integration**: Added `integrations/codex/README.md`, which covers configuring OpenAI Codex via `config.toml` with a custom gpt2giga provider
+- **Claude Code integration**: Added `integrations/claude-code/README.md`, which covers configuring Claude Code via `ANTHROPIC_BASE_URL`
+- **AGENTS.md documentation**: Updated all `AGENTS.md` files to match the current codebase structure
+
+### Changed
+- **Asynchrony**: Blocking I/O operations in route handlers were moved to worker threads via `anyio.to_thread.run_sync`:
+  - `logs_router.py`: reading log files and the HTML template
+  - `api_router.py`: `tiktoken.encoding_for_model()` initialization
+
 ## [0.1.4] - 2026-02-26
 
 ### Added
 
@@ -5,6 +5,19 @@ All notable changes to the gpt2giga project are documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.1.4.post1] - 2026-02-27
+
+### Added
+- **Cursor integration**: Added `integrations/cursor/README.md` — guide for using GigaChat in Cursor as a custom model
+- **Codex integration**: Added `integrations/codex/README.md` — OpenAI Codex setup via `config.toml` with custom gpt2giga provider
+- **Claude Code integration**: Added `integrations/claude-code/README.md` — Claude Code setup via `ANTHROPIC_BASE_URL`
+- **AGENTS.md documentation**: Updated all `AGENTS.md` files to match the current codebase structure
+
+### Changed
+- **Async I/O**: Moved blocking I/O operations in route handlers to worker threads via `anyio.to_thread.run_sync`:
+  - `logs_router.py` — log file reading and HTML template loading
+  - `api_router.py` — `tiktoken.encoding_for_model()` initialization
+
 ## [0.1.4] - 2026-02-26
 
 ### Added
 
@@ -26,7 +26,6 @@
 | `responses/function_calling.py` | Responses API with tools |
 | `responses/structured_output.py` | Responses API with structured output |
 | `responses/structured_output_nested.py` | Nested structured output |
-| `responses/structured_output_pydantic_complex.py` | Complex nested Pydantic models |
 | `responses/json_schema.py` | JSON schema response format |
 | `responses/base64_image.py` | Responses API with base64 image |
 | `responses/image_url.py` | Responses API with image URL |
@@ -37,13 +36,13 @@
 | `anthropic/system_prompt.py` | System prompt usage |
 | `anthropic/function_calling.py` | Tool use / function calling |
 | `anthropic/reasoning.py` | Extended thinking (`thinking` → `reasoning_effort`) |
+| `anthropic/count_tokens.py` | Token counting via `/messages/count_tokens` |
 | `anthropic/image_url.py` | Image URL input |
 | `anthropic/base64_image.py` | Base64 image input |
 | **Root examples** | **Standalone** |
 | `embeddings.py` | Embeddings endpoint usage |
 | `models.py` | Model listing and retrieval |
 | `openai_agents.py` | OpenAI Agents SDK integration (multi-agent triage) |
-| `weather_agent.py` | Agent with async weather tool |
 
 ## Patterns & Conventions
 
@@ -133,6 +132,6 @@ rg -n "reasoning|thinking" examples/
 ## Common Gotchas
 
 - Examples are **not** part of test coverage (excluded in `pyproject.toml`).
-- The `openai_agents.py` and `weather_agent.py` require the `integrations` dependency group: `uv sync --group integrations`.
-- Anthropic examples use the `anthropic` SDK (not `openai`) — requires `uv sync --group integrations`.
+- The `openai_agents.py` example requires the `integrations` dependency group: `uv sync --group integrations`.
+- Anthropic examples use the `anthropic` SDK (not `openai`) — `anthropic` is a direct project dependency, no extra install needed.
 - Each sub-directory has its own `README.md` with usage details.
@@ -2,7 +2,7 @@
 
 ## Package Identity
 
-- **What:** FastAPI proxy server that translates OpenAI API → GigaChat API
+- **What:** FastAPI proxy server that translates OpenAI API and Anthropic Messages API → GigaChat API
 - **Framework:** FastAPI + Uvicorn, async-first
 - **Entry point:** `gpt2giga/__init__.py` → `run()` in `api_server.py`
 
@@ -23,7 +23,7 @@ uv run ruff check gpt2giga/ && uv run ruff format gpt2giga/
 
 ```
 Request flow:
-  Client (OpenAI SDK) → Middlewares → Router → RequestTransformer → GigaChat SDK
+  Client (OpenAI/Anthropic SDK) → Middlewares → Router → RequestTransformer → GigaChat SDK
   GigaChat SDK → ResponseProcessor → Router → Client
 ```
 
@@ -33,15 +33,35 @@ Request flow:
 |---|---|
 | `api_server.py` | App factory (`create_app()`), lifespan, `run()` |
 | `cli.py` | CLI argument parsing, config loading |
-| `config.py` | Pydantic Settings: `ProxyConfig`, `ProxySettings`, `GigaChatCLI` |
+| `models/config.py` | Pydantic Settings: `ProxyConfig`, `ProxySettings`, `GigaChatCLI` |
+| `models/security.py` | `SecuritySettings` — consolidated security posture view-model |
 | `auth.py` | API key verification (`verify_api_key` dependency) |
 | `logger.py` | Loguru setup, `rquid_context` context var |
-| `utils.py` | Error handling decorator, stream generators, schema normalization, tool conversion |
+| `constants.py` | Size limits, MIME types, sensitive key patterns |
+| `openapi_docs.py` | OpenAPI schema extras for custom endpoints |
+| `common/` | Shared utilities (re-exported via `common/__init__.py`; see below) |
 | `protocol/` | Request/response transformation layer (see below) |
 | `routers/` | FastAPI route handlers (see below) |
 | `middlewares/` | HTTP middleware chain (see below) |
 | `templates/` | HTML log viewer template (`templates/log_viewer.html`) |
 
+### Common Utilities (`common/`)
+
+All utilities are in `common/` submodules, re-exported via `common/__init__.py`:
+
+| File | Key exports |
+|---|---|
+| `common/exceptions.py` | `exceptions_handler` decorator, `ERROR_MAPPING` |
+| `common/streaming.py` | `stream_chat_completion_generator()`, `stream_responses_generator()` |
+| `common/json_schema.py` | `resolve_schema_refs()`, `normalize_json_schema()` |
+| `common/tools.py` | `convert_tool_to_giga_functions()`, tool name mapping |
+| `common/gigachat_auth.py` | `pass_token_to_gigachat()`, `create_gigachat_client_for_request()` |
+| `common/message_utils.py` | `map_role()`, `merge_consecutive_messages()`, `collapse_user_messages()` |
+| `common/content_utils.py` | `ensure_json_object_str()` |
+| `common/app_meta.py` | `warn_sensitive_cli_args()`, `get_app_version()`, `check_port_available()` |
+| `common/request_json.py` | `read_request_json()` |
+| `common/logs_access.py` | `verify_logs_ip_allowlist()` |
+
 ## Patterns & Conventions
 
 ### App Factory Pattern
@@ -53,19 +73,19 @@ Request flow:
 ```
 ✅ DO: Access shared state via `request.app.state.gigachat_client`
 ✅ DO: See `api_server.py` create_app() for middleware registration order
-❌ DON'T: Copy “single-file script” patterns from `local/*.py` into `gpt2giga/` (the `local/` folder is a scratchpad)
+❌ DON'T: Copy "single-file script" patterns from scratch experiments into `gpt2giga/`
 ```
 
 ### Error Handling
 
-- **`@exceptions_handler` decorator** in `utils.py` wraps all router handlers.
+- **`@exceptions_handler` decorator** in `common/exceptions.py` wraps all router handlers.
 - Maps `gigachat.exceptions.*` to OpenAI-style HTTP errors via `ERROR_MAPPING` dict.
 - Logs errors with `rquid` (request ID) for traceability.
 
 ```
 ✅ DO: Decorate every router handler with `@exceptions_handler`
-✅ DO: See `utils.py` ERROR_MAPPING for the exception → status code mapping
-❌ DON'T: Add ad-hoc exception mapping in random scripts (see `local/check_ai.py` for experimentation; keep production mapping in `gpt2giga/utils.py`)
+✅ DO: See `common/exceptions.py` ERROR_MAPPING for the exception → status code mapping
+❌ DON'T: Add ad-hoc exception mapping outside `gpt2giga/common/exceptions.py`
 ```
 
 ### Protocol Layer (`protocol/`)
@@ -74,38 +94,38 @@ This is the core transformation engine:
 
 | File | Class | Purpose |
 |---|---|---|
-| `request_mapper.py` | `RequestTransformer` | OpenAI request → GigaChat `Chat` object |
-| `response_mapper.py` | `ResponseProcessor` | GigaChat response → OpenAI response format |
-| `attachments.py` | `AttachmentProcessor` | Image/document upload, LRU cache with TTL |
-| `content_utils.py` | — | Content parsing/extraction utilities |
-| `message_utils.py` | — | Message merging, role mapping, ordering |
+| `protocol/request/transformer.py` | `RequestTransformer` | OpenAI request → GigaChat `Chat` object |
+| `protocol/response/processor.py` | `ResponseProcessor` | GigaChat response → OpenAI response format |
+| `protocol/attachment/attachments.py` | `AttachmentProcessor` | Image/document upload, LRU cache with TTL |
+
+Classes are re-exported via `protocol/__init__.py`.
 
 **Key transformations:**
 - Role mapping: `developer` → `system`/`user`, `tool` → `function`
-- Message merging: consecutive same-role messages are collapsed
-- Schema normalization: resolves `$ref`/`$defs`, strips `anyOf`/`oneOf` with null
-- Tool conversion: OpenAI `tools` format → GigaChat `functions` format
+- Message merging: consecutive same-role messages are collapsed (via `common/message_utils.py`)
+- Schema normalization: resolves `$ref`/`$defs`, strips `anyOf`/`oneOf` with null (via `common/json_schema.py`)
+- Tool conversion: OpenAI `tools` format → GigaChat `functions` format (via `common/tools.py`)
 
 ```
-✅ DO: Follow `request_mapper.py` prepare_chat_completion() for new request transformations
-✅ DO: Follow `response_mapper.py` process_response() for new response transformations
-✅ DO: Use normalize_json_schema() from utils.py for any JSON schema handling
-❌ DON'T: Duplicate protocol logic in routers; keep transformations in `gpt2giga/protocol/*` (contrast with experimental code in `local/*.py`)
+✅ DO: Follow `protocol/request/transformer.py` prepare_chat_completion() for new request transformations
+✅ DO: Follow `protocol/response/processor.py` process_response() for new response transformations
+✅ DO: Use normalize_json_schema() from common/json_schema.py for any JSON schema handling
+❌ DON'T: Duplicate protocol logic in routers; keep transformations in `gpt2giga/protocol/`
 ```
 
 ### Routers (`routers/`)
 
 | File | Endpoints |
 |---|---|
 | `api_router.py` | `GET /models`, `GET /models/{model}`, `POST /chat/completions`, `POST /embeddings`, `POST /responses` |
-| `anthropic_router.py` | `POST /messages` — Anthropic Messages API compatibility layer |
-| `system_router.py` | `GET /health`, `GET/POST /ping`, `GET /logs`, `GET /logs/stream` |
-| `system_router.py` (logs_router) | `GET /logs/html` — HTML log viewer page |
+| `anthropic_router.py` | `POST /messages`, `POST /messages/count_tokens` — Anthropic Messages API compatibility layer |
+| `system_router.py` | `GET /health`, `GET/POST /ping` |
+| `logs_router.py` | `GET /logs/{last_n_lines}`, `GET /logs/stream`, `GET /logs/html` — log viewing and streaming |
 
 - Routes are registered twice: at root `/` and under `/v1/` prefix.
-- System routes (`/health`, `/ping`, `/logs*`) are registered only once at root.
+- System routes (`/health`, `/ping`) and log routes (`/logs*`) are registered only once at root.
 - All API routes use `@exceptions_handler` decorator.
-- Streaming uses `StreamingResponse` with async generators from `utils.py`.
+- Streaming uses `StreamingResponse` with async generators from `common/streaming.py`.
 - Anthropic router converts Anthropic Messages format → OpenAI → GigaChat → Anthropic response.
 
 ```
@@ -119,10 +139,11 @@ This is the core transformation engine:
 
 Applied in order (last added = first executed):
 
-1. **`PassTokenMiddleware`** — passes auth token from request to GigaChat (conditional)
-2. **`RquidMiddleware`** — sets unique request ID in `contextvars`
-3. **`PathNormalizationMiddleware`** — normalizes `/api/v1/...` → `/v1/...`
-4. **`CORSMiddleware`** — allows all origins
+1. **`PassTokenMiddleware`** (`pass_token.py`) — passes auth token from request to GigaChat (conditional, only if `pass_token=True`)
+2. **`RequestValidationMiddleware`** (`request_validation.py`) — enforces request body size limits
+3. **`RquidMiddleware`** (`rquid_context.py`) — sets unique request ID in `contextvars`
+4. **`PathNormalizationMiddleware`** (`path_normalizer.py`) — normalizes `/api/v1/...` → `/v1/...`
+5. **`CORSMiddleware`** — allows configurable origins/methods/headers
 
 ```
 ✅ DO: Inherit from BaseHTTPMiddleware (Starlette) for new middleware
@@ -132,20 +153,21 @@ Applied in order (last added = first executed):
 
 ### Streaming
 
-- `stream_chat_completion_generator()` — SSE stream for `/chat/completions`
-- `stream_responses_generator()` — SSE stream for `/responses`
+- `stream_chat_completion_generator()` in `common/streaming.py` — SSE stream for `/chat/completions`
+- `stream_responses_generator()` in `common/streaming.py` — SSE stream for `/responses`
 - Both are async generators yielding `data: {json}\n\n` strings.
 - Handle client disconnections gracefully.
 
 ### Configuration
 
-- `ProxyConfig` (root) nests `ProxySettings` + `GigaChatCLI`.
+- `ProxyConfig` (root) nests `ProxySettings` + `GigaChatCLI` — defined in `models/config.py`.
+- `SecuritySettings` in `models/security.py` — read-only security posture view.
 - Env var prefixes: `GPT2GIGA_` and `GIGACHAT_`.
 - CLI args via `pydantic-settings` with `cli_parse_args=True`.
 - See `.env.example` at repo root for all available settings.
 
 ```
-✅ DO: Add new proxy settings to ProxySettings in config.py
+✅ DO: Add new proxy settings to ProxySettings in models/config.py
 ✅ DO: Use Field(default=..., description="...") for every setting
 ```
 
@@ -154,11 +176,13 @@ Applied in order (last added = first executed):
 - **App wiring + middleware order**: `gpt2giga/api_server.py`
 - **OpenAI-compatible endpoints**: `gpt2giga/routers/api_router.py`
 - **Anthropic Messages API**: `gpt2giga/routers/anthropic_router.py`
-- **System/log endpoints + HTML viewer**: `gpt2giga/routers/system_router.py`, `gpt2giga/templates/log_viewer.html`
-- **Protocol mapping**: `gpt2giga/protocol/request_mapper.py`, `gpt2giga/protocol/response_mapper.py`
-- **Attachments caching + upload**: `gpt2giga/protocol/attachments.py`
+- **System endpoints (health, ping)**: `gpt2giga/routers/system_router.py`
+- **Log endpoints + HTML viewer**: `gpt2giga/routers/logs_router.py`, `gpt2giga/templates/log_viewer.html`
+- **Protocol mapping**: `gpt2giga/protocol/request/transformer.py`, `gpt2giga/protocol/response/processor.py`
+- **Attachments caching + upload**: `gpt2giga/protocol/attachment/attachments.py`
 - **Auth + API key dependency**: `gpt2giga/auth.py`
-- **Settings/env parsing**: `gpt2giga/config.py`, `.env.example`
+- **Settings/env parsing**: `gpt2giga/models/config.py`, `.env.example`
+- **Security posture**: `gpt2giga/models/security.py`
 
 ## JIT Search Hints
 
@@ -173,16 +197,19 @@ rg -n "class.*Middleware" gpt2giga/middlewares/
 rg -n "def (prepare_|process_|transform_)" gpt2giga/protocol/
 
 # Find streaming generators
-rg -n "async def stream_" gpt2giga/utils.py
+rg -n "async def stream_" gpt2giga/common/streaming.py
 
 # Find schema normalization logic
-rg -n "def (normalize_json_schema|resolve_schema_refs)" gpt2giga/utils.py
+rg -n "def (normalize_json_schema|resolve_schema_refs)" gpt2giga/common/json_schema.py
 
 # Find all Pydantic settings
-rg -n "class.*Settings|class.*Config" gpt2giga/config.py
+rg -n "class.*Settings|class.*Config" gpt2giga/models/config.py
 
 # Find Anthropic-specific logic
 rg -n "anthropic|messages" gpt2giga/routers/anthropic_router.py
+
+# Find error mapping / exception handling
+rg -n "ERROR_MAPPING|exceptions_handler" gpt2giga/common/exceptions.py
 ```
 
 ## Common Gotchas
 
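The Streaming section of `gpt2giga/AGENTS.md` above describes async generators that yield `data: {json}\n\n` strings. A minimal sketch of that framing follows. The names `sse_events` and `collect` are hypothetical and are not taken from `common/streaming.py`, and the closing `data: [DONE]` message is the OpenAI streaming convention, assumed here rather than confirmed for gpt2giga.

```python
import asyncio
import json
from typing import AsyncIterator, Iterable


async def sse_events(chunks: Iterable[dict]) -> AsyncIterator[str]:
    """Frame each chunk as one Server-Sent Events message (hypothetical helper)."""
    for chunk in chunks:
        # One SSE message: a "data:" field terminated by a blank line.
        yield f"data: {json.dumps(chunk, ensure_ascii=False)}\n\n"
        await asyncio.sleep(0)  # yield control so other tasks can run
    # Assumption: OpenAI-style end-of-stream marker, not confirmed for gpt2giga.
    yield "data: [DONE]\n\n"


async def collect(chunks: Iterable[dict]) -> list[str]:
    """Drain the generator into a list, as a test client might."""
    return [event async for event in sse_events(chunks)]


if __name__ == "__main__":
    for event in asyncio.run(collect([{"delta": "Hel"}, {"delta": "lo"}])):
        print(event, end="")
```

In a FastAPI handler, a generator like this would typically be passed to `StreamingResponse` with `media_type="text/event-stream"`.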
@@ -1,5 +1,7 @@
+import functools
 import time
 
+import anyio
 import tiktoken
 from fastapi import APIRouter
 from fastapi import Request
@@ -90,14 +92,18 @@ async def embeddings(request: Request):
     if isinstance(inputs, list):
         new_inputs = []
         if len(inputs) > 0 and isinstance(inputs[0], int):  # List[int]
-            encoder = tiktoken.encoding_for_model(gpt_model)
+            encoder = await anyio.to_thread.run_sync(
+                functools.partial(tiktoken.encoding_for_model, gpt_model)
+            )
             new_inputs = encoder.decode(inputs)
         else:
             encoder = None
             for row in inputs:
                 if isinstance(row, list):  # List[List[int]]
                     if encoder is None:
-                        encoder = tiktoken.encoding_for_model(gpt_model)
+                        encoder = await anyio.to_thread.run_sync(
+                            functools.partial(tiktoken.encoding_for_model, gpt_model)
+                        )
                     new_inputs.append(encoder.decode(row))
                 else:
                     new_inputs.append(row)
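The hunk above moves a blocking call off the event loop by wrapping it in `functools.partial` and awaiting it in a worker thread. The sketch below shows the same pattern using only the standard library: it swaps `anyio.to_thread.run_sync` for `asyncio.to_thread` so that it runs without extra dependencies. `slow_lookup`, `handler`, and `main` are hypothetical names, and `slow_lookup` only simulates a blocking call such as `tiktoken.encoding_for_model`.

```python
import asyncio
import functools
import threading
import time


def slow_lookup(name: str) -> str:
    """Hypothetical stand-in for a blocking call."""
    time.sleep(0.2)  # simulate blocking work
    return f"{name}@{threading.current_thread().name}"


async def handler(name: str) -> str:
    # Bind the argument up front, then run the callable in a worker thread
    # so the event loop keeps serving other tasks in the meantime.
    return await asyncio.to_thread(functools.partial(slow_lookup, name))


async def main() -> tuple[str, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run during the lookup.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    task = asyncio.create_task(heartbeat())
    result = await handler("cl100k_base")
    task.cancel()
    return result, ticks


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If `slow_lookup` were called directly inside `handler`, the heartbeat could not advance until the call returned. With the thread offload it keeps ticking, which is the effect the change aims for in the `embeddings` route.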