Commit b13750f

Merge branch 'dev'
2 parents 5182d7d + 288b1c7 commit b13750f

34 files changed, +15026 -2185 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -124,9 +124,11 @@ start_proxy.bat
 key_usage.json
 staged_changes.txt
 launcher_config.json
+quota_viewer_config.json
 cache/antigravity/thought_signatures.json
 logs/
 cache/
 *.env
 
 oauth_creds/
+

DOCUMENTATION.md

Lines changed: 511 additions & 21 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 52 additions & 1 deletion
@@ -329,10 +329,19 @@ The proxy includes a powerful text-based UI for configuration and management.
 
 **Antigravity:**
 - Gemini 3 Pro with `thinkingLevel` support
+- Gemini 2.5 Flash/Flash Lite with thinking mode
 - Claude Opus 4.5 (thinking mode)
 - Claude Sonnet 4.5 (thinking and non-thinking)
+- GPT-OSS 120B Medium
 - Thought signature caching for multi-turn conversations
 - Tool hallucination prevention
+- Quota baseline tracking with background refresh
+- Parallel tool usage instruction injection
+- **Quota Groups**: Models that share quota are automatically grouped:
+  - Claude/GPT-OSS: `claude-sonnet-4-5`, `claude-opus-4-5`, `gpt-oss-120b-medium`
+  - Gemini 3 Pro: `gemini-3-pro-high`, `gemini-3-pro-low`, `gemini-3-pro-preview`
+  - Gemini 2.5 Flash: `gemini-2.5-flash`, `gemini-2.5-flash-thinking`, `gemini-2.5-flash-lite`
+  - All models in a group deplete the usage of the group equally. So in claude group - it is beneficial to use only Opus, and forget about Sonnet and GPT-OSS.
 
 **Qwen Code:**
 - Dual auth (API key + OAuth Device Flow)
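
The quota-group behaviour described in the README hunk above can be pictured with a small model. This is an illustrative sketch only, not the proxy's implementation: the `QuotaGroup` class, its `record_request` method, and the limit of 100 are hypothetical; only the model names come from the README.

```python
from dataclasses import dataclass


@dataclass
class QuotaGroup:
    """Hypothetical model of a shared quota pool (not the proxy's real class)."""

    name: str
    models: frozenset
    limit: int  # invented number, for illustration only
    used: int = 0

    def record_request(self, model: str) -> int:
        """Charge one request against the shared pool and return what is left."""
        if model not in self.models:
            raise ValueError(f"{model} is not in quota group {self.name}")
        self.used += 1
        return self.limit - self.used


claude_group = QuotaGroup(
    name="claude",
    models=frozenset({"claude-sonnet-4-5", "claude-opus-4-5", "gpt-oss-120b-medium"}),
    limit=100,
)

# Requests to different models in the group draw from the same counter.
claude_group.record_request("claude-sonnet-4-5")
remaining = claude_group.record_request("claude-opus-4-5")
```

Because every member of a group draws on one counter, a request to Sonnet leaves less quota for Opus, which is the reasoning behind the README's advice to spend the Claude group on Opus alone.
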
@@ -394,6 +403,8 @@ The proxy includes a powerful text-based UI for configuration and management.
 | `CONCURRENCY_MULTIPLIER_<PROVIDER>_PRIORITY_<N>` | Concurrency multiplier per priority tier |
 | `QUOTA_GROUPS_<PROVIDER>_<GROUP>` | Models sharing quota limits |
 | `OVERRIDE_TEMPERATURE_ZERO` | `remove` or `set` to prevent tool hallucination |
+| `GEMINI_CLI_QUOTA_REFRESH_INTERVAL` | Quota baseline refresh interval in seconds (default: 300) |
+| `ANTIGRAVITY_QUOTA_REFRESH_INTERVAL` | Quota baseline refresh interval in seconds (default: 300) |
 
 </details>
 
@@ -512,14 +523,48 @@ Uses Google OAuth to access internal Gemini endpoints with higher rate limits.
 - Automatic free-tier project onboarding
 - Paid vs free tier detection
 - Smart fallback on rate limits
+- Quota baseline tracking with background refresh (accurate remaining quota estimates)
+- Sequential rotation mode (uses credentials until quota exhausted)
+
+**Quota Groups:** Models that share quota are automatically grouped:
+- **Pro**: `gemini-2.5-pro`, `gemini-3-pro-preview`
+- **2.5-Flash**: `gemini-2.0-flash`, `gemini-2.5-flash`, `gemini-2.5-flash-lite`
+- **3-Flash**: `gemini-3-flash-preview`
+
+All models in a group deplete the shared quota equally. 24-hour per-model quota windows.
 
 **Environment Variables (for stateless deployment):**
+
+Single credential (legacy):
 ```env
 GEMINI_CLI_ACCESS_TOKEN="ya29.your-access-token"
 GEMINI_CLI_REFRESH_TOKEN="1//your-refresh-token"
 GEMINI_CLI_EXPIRY_DATE="1234567890000"
 GEMINI_CLI_EMAIL="your-email@gmail.com"
 GEMINI_CLI_PROJECT_ID="your-gcp-project-id" # Optional
+GEMINI_CLI_TIER="standard-tier" # Optional: standard-tier or free-tier
+```
+
+Multiple credentials (use `_N_` suffix where N is 1, 2, 3...):
+```env
+GEMINI_CLI_1_ACCESS_TOKEN="ya29.first-token"
+GEMINI_CLI_1_REFRESH_TOKEN="1//first-refresh"
+GEMINI_CLI_1_EXPIRY_DATE="1234567890000"
+GEMINI_CLI_1_EMAIL="first@gmail.com"
+GEMINI_CLI_1_PROJECT_ID="project-1"
+GEMINI_CLI_1_TIER="standard-tier"
+
+GEMINI_CLI_2_ACCESS_TOKEN="ya29.second-token"
+GEMINI_CLI_2_REFRESH_TOKEN="1//second-refresh"
+GEMINI_CLI_2_EXPIRY_DATE="1234567890000"
+GEMINI_CLI_2_EMAIL="second@gmail.com"
+GEMINI_CLI_2_PROJECT_ID="project-2"
+GEMINI_CLI_2_TIER="free-tier"
+```
+
+**Feature Toggles:**
+```env
+GEMINI_CLI_QUOTA_REFRESH_INTERVAL=300 # Quota refresh interval in seconds (default: 300 = 5 min)
 ```
 
 </details>
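
To show how the numbered `GEMINI_CLI_<N>_*` variables in the hunk above might be grouped per credential, here is a minimal sketch. It is not the proxy's loader: `load_gemini_cli_credentials` is a hypothetical helper, and filing the legacy un-numbered form under index 0 is an assumption made only for this example.

```python
import os
import re
from typing import Dict, Mapping, Optional

FIELDS = ("ACCESS_TOKEN", "REFRESH_TOKEN", "EXPIRY_DATE", "EMAIL", "PROJECT_ID", "TIER")
NUMBERED = re.compile(r"^GEMINI_CLI_(\d+)_(" + "|".join(FIELDS) + r")$")
PREFIX = "GEMINI_CLI_"


def load_gemini_cli_credentials(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[int, Dict[str, str]]:
    """Group GEMINI_CLI_* variables by credential index (hypothetical helper).

    Index 0 holds the legacy un-numbered form; that choice is an assumption
    of this sketch, not documented behaviour.
    """
    env = os.environ if env is None else env
    creds: Dict[int, Dict[str, str]] = {}
    for key, value in env.items():
        match = NUMBERED.match(key)
        if match:
            creds.setdefault(int(match.group(1)), {})[match.group(2)] = value
        elif key.startswith(PREFIX) and key[len(PREFIX):] in FIELDS:
            creds.setdefault(0, {})[key[len(PREFIX):]] = value
    return creds


example_env = {
    "GEMINI_CLI_1_ACCESS_TOKEN": "ya29.first-token",
    "GEMINI_CLI_1_TIER": "standard-tier",
    "GEMINI_CLI_2_ACCESS_TOKEN": "ya29.second-token",
    "GEMINI_CLI_2_TIER": "free-tier",
    "GEMINI_CLI_QUOTA_REFRESH_INTERVAL": "300",  # not a credential field, ignored
}
credentials = load_gemini_cli_credentials(example_env)
```

Passing the environment as a mapping keeps the helper testable without touching `os.environ`.
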
@@ -531,9 +576,11 @@ Access Google's internal Antigravity API for cutting-edge models.
 
 **Supported Models:**
 - **Gemini 3 Pro** — with `thinkingLevel` support (low/high)
+- **Gemini 2.5 Flash** — with thinking mode support
+- **Gemini 2.5 Flash Lite** — configurable thinking budget
 - **Claude Opus 4.5** — Anthropic's most powerful model (thinking mode only)
 - **Claude Sonnet 4.5** — supports both thinking and non-thinking modes
-- Gemini 2.5 Pro/Flash
+- **GPT-OSS 120B** — OpenAI-compatible model
 
 **Setup:**
 1. Run `python -m rotator_library.credential_tool`
@@ -545,6 +592,8 @@ Access Google's internal Antigravity API for cutting-edge models.
 - Tool hallucination prevention via parameter signature injection
 - Automatic thinking block sanitization for Claude
 - Credential prioritization (paid resets every 5 hours, free weekly)
+- Quota baseline tracking with background refresh (accurate remaining quota estimates)
+- Parallel tool usage instruction injection for Claude
 
 **Environment Variables:**
 ```env
@@ -556,6 +605,8 @@ ANTIGRAVITY_EMAIL="your-email@gmail.com"
 # Feature toggles
 ANTIGRAVITY_ENABLE_SIGNATURE_CACHE=true
 ANTIGRAVITY_GEMINI3_TOOL_FIX=true
+ANTIGRAVITY_QUOTA_REFRESH_INTERVAL=300 # Quota refresh interval (seconds)
+ANTIGRAVITY_PARALLEL_TOOL_INSTRUCTION_CLAUDE=true # Parallel tool instruction for Claude
 ```
 
 > **Note:** Gemini 3 models require a paid-tier Google Cloud project.
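
The two `*_QUOTA_REFRESH_INTERVAL` variables share a documented default of 300 seconds. A reader wiring this up could read them roughly as follows. This is a sketch under assumptions: `quota_refresh_interval` is a hypothetical helper, and falling back to the default on malformed or non-positive values is a choice made here, not behaviour confirmed by the diff.

```python
import os
from typing import Mapping, Optional

DEFAULT_REFRESH_SECONDS = 300  # default stated in the README table


def quota_refresh_interval(provider: str, env: Optional[Mapping[str, str]] = None) -> int:
    """Return <PROVIDER>_QUOTA_REFRESH_INTERVAL in seconds (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get(f"{provider.upper()}_QUOTA_REFRESH_INTERVAL", "")
    try:
        value = int(raw)
    except ValueError:
        # Missing or malformed value: use the documented default.
        return DEFAULT_REFRESH_SECONDS
    # Treating non-positive values as invalid is an assumption of this sketch.
    return value if value > 0 else DEFAULT_REFRESH_SECONDS


interval = quota_refresh_interval("antigravity", {"ANTIGRAVITY_QUOTA_REFRESH_INTERVAL": "120"})
fallback = quota_refresh_interval("gemini_cli", {})
```
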

requirements.txt

Lines changed: 6 additions & 0 deletions
@@ -19,3 +19,9 @@ aiohttp
 colorlog
 
 rich
+
+# GUI for model filter configuration
+customtkinter
+
+# For building the executable
+pyinstaller

src/proxy_app/build.py

Lines changed: 12 additions & 4 deletions
@@ -3,6 +3,7 @@
 import platform
 import subprocess
 
+
 def get_providers():
     """
     Scans the 'src/rotator_library/providers' directory to find all provider modules.
@@ -24,6 +25,7 @@ def get_providers():
             hidden_imports.append(f"--hidden-import={module_name}")
     return hidden_imports
 
+
 def main():
     """
     Constructs and runs the PyInstaller command to build the executable.
@@ -47,22 +49,27 @@ def main():
         "--collect-data",
         "litellm",
         # Optimization: Exclude unused heavy modules
-        "--exclude-module=tkinter",
         "--exclude-module=matplotlib",
         "--exclude-module=IPython",
         "--exclude-module=jupyter",
         "--exclude-module=notebook",
         "--exclude-module=PIL.ImageTk",
         # Optimization: Enable UPX compression (if available)
-        "--upx-dir=upx" if platform.system() != "Darwin" else "--noupx", # macOS has issues with UPX
+        "--upx-dir=upx"
+        if platform.system() != "Darwin"
+        else "--noupx", # macOS has issues with UPX
         # Optimization: Strip debug symbols (smaller binary)
-        "--strip" if platform.system() != "Windows" else "--console", # Windows gets clean console
+        "--strip"
+        if platform.system() != "Windows"
+        else "--console", # Windows gets clean console
     ]
 
     # Add hidden imports for providers
     provider_imports = get_providers()
     if not provider_imports:
-        print("Warning: No providers found. The build might not include any LLM providers.")
+        print(
+            "Warning: No providers found. The build might not include any LLM providers."
+        )
     command.extend(provider_imports)
 
     # Add the main script
@@ -80,5 +87,6 @@ def main():
     except FileNotFoundError:
         print("Error: PyInstaller is not installed or not in the system's PATH.")
 
+
 if __name__ == "__main__":
     main()

src/proxy_app/detailed_logger.py

Lines changed: 44 additions & 11 deletions
@@ -1,3 +1,26 @@
+# src/proxy_app/detailed_logger.py
+"""
+Raw I/O Logger for the Proxy Layer.
+
+This logger captures the UNMODIFIED HTTP request and response at the proxy boundary.
+It is disabled by default and should only be enabled for debugging the proxy itself.
+
+Use this when you need to:
+- Verify that requests/responses are not being corrupted
+- Debug HTTP-level issues between the client and proxy
+- Capture exact payloads as received/sent by the proxy
+
+For normal request/response logging with provider correlation, use the
+TransactionLogger in the rotator_library instead (enabled via --enable-request-logging).
+
+Directory structure:
+    logs/raw_io/{YYYYMMDD_HHMMSS}_{request_id}/
+        request.json # Unmodified incoming HTTP request
+        streaming_chunks.jsonl # If streaming mode
+        final_response.json # Unmodified outgoing HTTP response
+        metadata.json # Summary metadata
+"""
+
 import json
 import time
 import uuid
@@ -14,30 +37,36 @@
 from rotator_library.utils.paths import get_logs_dir
 
 
-def _get_detailed_logs_dir() -> Path:
-    """Get the detailed logs directory, creating it if needed."""
+def _get_raw_io_logs_dir() -> Path:
+    """Get the raw I/O logs directory, creating it if needed."""
     logs_dir = get_logs_dir()
-    detailed_dir = logs_dir / "detailed_logs"
-    detailed_dir.mkdir(parents=True, exist_ok=True)
-    return detailed_dir
+    raw_io_dir = logs_dir / "raw_io"
+    raw_io_dir.mkdir(parents=True, exist_ok=True)
+    return raw_io_dir
 
 
-class DetailedLogger:
+class RawIOLogger:
     """
-    Logs comprehensive details of each API transaction to a unique, timestamped directory.
+    Logs raw HTTP request/response at the proxy boundary.
+
+    This captures the EXACT data as received from and sent to the client,
+    without any transformations. Useful for debugging the proxy itself.
+
+    DISABLED by default. Enable with --enable-raw-logging flag.
 
     Uses fire-and-forget logging - if disk writes fail, logs are dropped (not buffered)
     to prevent memory issues, especially with streaming responses.
     """
 
     def __init__(self):
         """
-        Initializes the logger for a single request, creating a unique directory to store all related log files.
+        Initializes the logger for a single request, creating a unique directory
+        to store all related log files.
         """
         self.start_time = time.time()
         self.request_id = str(uuid.uuid4())
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-        self.log_dir = _get_detailed_logs_dir() / f"{timestamp}_{self.request_id}"
+        self.log_dir = _get_raw_io_logs_dir() / f"{timestamp}_{self.request_id}"
         self.streaming = False
         self._dir_available = safe_mkdir(self.log_dir, logging)
 
@@ -59,7 +88,7 @@ def _write_json(self, filename: str, data: Dict[str, Any]):
             )
 
     def log_request(self, headers: Dict[str, Any], body: Dict[str, Any]):
-        """Logs the initial request details."""
+        """Logs the raw incoming request details."""
         self.streaming = body.get("stream", False)
         request_data = {
             "request_id": self.request_id,
@@ -81,7 +110,7 @@ def log_stream_chunk(self, chunk: Dict[str, Any]):
     def log_final_response(
         self, status_code: int, headers: Optional[Dict[str, Any]], body: Dict[str, Any]
     ):
-        """Logs the complete final response, either from a non-streaming call or after reassembling a stream."""
+        """Logs the raw outgoing response."""
         end_time = time.time()
         duration_ms = (end_time - self.start_time) * 1000
 
@@ -149,3 +178,7 @@ def _log_metadata(self, response_data: Dict[str, Any]):
             metadata["reasoning_content"] = reasoning
 
         self._write_json("metadata.json", metadata)
+
+
+# Backward compatibility alias
+DetailedLogger = RawIOLogger
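
The alias on the last line keeps existing `DetailedLogger` call sites working after the rename. The effect can be checked with a stripped-down stand-in. The class body below is a placeholder written for this example, not the real logger, which creates a directory under `logs/raw_io/`.

```python
class RawIOLogger:
    """Stand-in for the renamed class; the real one writes files to disk."""

    def __init__(self) -> None:
        self.streaming = False


# Backward compatibility alias, as in the diff
DetailedLogger = RawIOLogger

# Old call sites keep working and produce instances of the new class.
legacy = DetailedLogger()
```

Since the alias is the same class object rather than a subclass, `isinstance` checks written against either name succeed, but `repr` output and tracebacks show `RawIOLogger`.
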
