Mirrowel
diff --git a/.gitignore b/.gitignore
Lines changed: 2 additions & 0 deletions
diff --git a/DOCUMENTATION.md b/DOCUMENTATION.md
Lines changed: 511 additions & 21 deletions
diff --git a/README.md b/README.md
Lines changed: 52 additions & 1 deletion
diff --git a/requirements.txt b/requirements.txt
Lines changed: 6 additions & 0 deletions
diff --git a/src/proxy_app/build.py b/src/proxy_app/build.py
Lines changed: 12 additions & 4 deletions
diff --git a/src/proxy_app/detailed_logger.py b/src/proxy_app/detailed_logger.py
Lines changed: 44 additions & 11 deletions
@@ -124,9 +124,11 @@ start_proxy.bat
 key_usage.json
 staged_changes.txt
 launcher_config.json
+quota_viewer_config.json
 cache/antigravity/thought_signatures.json
 logs/
 cache/
 *.env
 
 oauth_creds/
+
@@ -329,10 +329,19 @@ The proxy includes a powerful text-based UI for configuration and management.
 
 **Antigravity:**
 - Gemini 3 Pro with `thinkingLevel` support
+- Gemini 2.5 Flash/Flash Lite with thinking mode
 - Claude Opus 4.5 (thinking mode)
 - Claude Sonnet 4.5 (thinking and non-thinking)
+- GPT-OSS 120B Medium
 - Thought signature caching for multi-turn conversations
 - Tool hallucination prevention
+- Quota baseline tracking with background refresh
+- Parallel tool usage instruction injection
+- **Quota Groups**: Models that share quota are automatically grouped:
+    - Claude/GPT-OSS: `claude-sonnet-4-5`, `claude-opus-4-5`, `gpt-oss-120b-medium`
+    - Gemini 3 Pro: `gemini-3-pro-high`, `gemini-3-pro-low`, `gemini-3-pro-preview`
+    - Gemini 2.5 Flash: `gemini-2.5-flash`, `gemini-2.5-flash-thinking`, `gemini-2.5-flash-lite`
+    - All models in a group deplete the group's shared quota equally, so within the Claude group it pays to use only Opus and skip Sonnet and GPT-OSS.
 
 **Qwen Code:**
 - Dual auth (API key + OAuth Device Flow)
@@ -394,6 +403,8 @@ The proxy includes a powerful text-based UI for configuration and management.
 | `CONCURRENCY_MULTIPLIER_<PROVIDER>_PRIORITY_<N>` | Concurrency multiplier per priority tier |
 | `QUOTA_GROUPS_<PROVIDER>_<GROUP>` | Models sharing quota limits |
 | `OVERRIDE_TEMPERATURE_ZERO` | `remove` or `set` to prevent tool hallucination |
+| `GEMINI_CLI_QUOTA_REFRESH_INTERVAL` | Quota baseline refresh interval in seconds (default: 300) |
+| `ANTIGRAVITY_QUOTA_REFRESH_INTERVAL` | Quota baseline refresh interval in seconds (default: 300) |
 
 </details>
 
@@ -512,14 +523,48 @@ Uses Google OAuth to access internal Gemini endpoints with higher rate limits.
 - Automatic free-tier project onboarding
 - Paid vs free tier detection
 - Smart fallback on rate limits
+- Quota baseline tracking with background refresh (accurate remaining quota estimates)
+- Sequential rotation mode (uses credentials until quota exhausted)
+
+**Quota Groups:** Models that share quota are automatically grouped:
+- **Pro**: `gemini-2.5-pro`, `gemini-3-pro-preview`
+- **2.5-Flash**: `gemini-2.0-flash`, `gemini-2.5-flash`, `gemini-2.5-flash-lite`
+- **3-Flash**: `gemini-3-flash-preview`
+
+All models in a group deplete the shared quota equally. Quota windows last 24 hours and are tracked per model.
 
 **Environment Variables (for stateless deployment):**
+
+Single credential (legacy):
 ```env
 GEMINI_CLI_ACCESS_TOKEN="ya29.your-access-token"
 GEMINI_CLI_REFRESH_TOKEN="1//your-refresh-token"
 GEMINI_CLI_EXPIRY_DATE="1234567890000"
 GEMINI_CLI_EMAIL="your-email@gmail.com"
 GEMINI_CLI_PROJECT_ID="your-gcp-project-id"  # Optional
+GEMINI_CLI_TIER="standard-tier"  # Optional: standard-tier or free-tier
+```
+
+Multiple credentials (insert `_N_` after the `GEMINI_CLI` prefix, where N is 1, 2, 3...):
+```env
+GEMINI_CLI_1_ACCESS_TOKEN="ya29.first-token"
+GEMINI_CLI_1_REFRESH_TOKEN="1//first-refresh"
+GEMINI_CLI_1_EXPIRY_DATE="1234567890000"
+GEMINI_CLI_1_EMAIL="first@gmail.com"
+GEMINI_CLI_1_PROJECT_ID="project-1"
+GEMINI_CLI_1_TIER="standard-tier"
+
+GEMINI_CLI_2_ACCESS_TOKEN="ya29.second-token"
+GEMINI_CLI_2_REFRESH_TOKEN="1//second-refresh"
+GEMINI_CLI_2_EXPIRY_DATE="1234567890000"
+GEMINI_CLI_2_EMAIL="second@gmail.com"
+GEMINI_CLI_2_PROJECT_ID="project-2"
+GEMINI_CLI_2_TIER="free-tier"
+```
+
+**Feature Toggles:**
+```env
+GEMINI_CLI_QUOTA_REFRESH_INTERVAL=300  # Quota refresh interval in seconds (default: 300 = 5 min)
 ```
 
 </details>
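
As a rough illustration of the numbered naming scheme above, the following sketch groups `GEMINI_CLI_<N>_<FIELD>` variables by credential index. The helper name `collect_gemini_cli_credentials` and the choice to file the legacy un-numbered variables under index 0 are assumptions made for this example; the diff does not show how the proxy itself parses these variables.

```python
import re
from typing import Dict, Mapping

# Matches numbered variables such as GEMINI_CLI_2_REFRESH_TOKEN.
_NUMBERED = re.compile(r"^GEMINI_CLI_(\d+)_([A-Z_]+)$")
# Field names taken from the env examples in the documentation hunk.
_FIELDS = ("ACCESS_TOKEN", "REFRESH_TOKEN", "EXPIRY_DATE", "EMAIL", "PROJECT_ID", "TIER")


def collect_gemini_cli_credentials(env: Mapping[str, str]) -> Dict[int, Dict[str, str]]:
    """Hypothetical helper: group credential variables by their index."""
    creds: Dict[int, Dict[str, str]] = {}
    for name, value in env.items():
        match = _NUMBERED.match(name)
        if match and match.group(2) in _FIELDS:
            creds.setdefault(int(match.group(1)), {})[match.group(2)] = value
    # Legacy single-credential form: GEMINI_CLI_<FIELD> without an index.
    legacy = {f: env[f"GEMINI_CLI_{f}"] for f in _FIELDS if f"GEMINI_CLI_{f}" in env}
    if legacy:
        creds[0] = legacy  # assumption: index 0 is reserved for the legacy form
    return creds
```

Passing `os.environ` to the helper would return one dictionary per credential. Settings such as `GEMINI_CLI_QUOTA_REFRESH_INTERVAL` are ignored because they match neither the numbered pattern nor a credential field name.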
@@ -531,9 +576,11 @@ Access Google's internal Antigravity API for cutting-edge models.
 
 **Supported Models:**
 - **Gemini 3 Pro** — with `thinkingLevel` support (low/high)
+- **Gemini 2.5 Flash** — with thinking mode support
+- **Gemini 2.5 Flash Lite** — configurable thinking budget
 - **Claude Opus 4.5** — Anthropic's most powerful model (thinking mode only)
 - **Claude Sonnet 4.5** — supports both thinking and non-thinking modes
-- Gemini 2.5 Pro/Flash
+- **GPT-OSS 120B** — OpenAI-compatible model
 
 **Setup:**
 1. Run `python -m rotator_library.credential_tool`
@@ -545,6 +592,8 @@ Access Google's internal Antigravity API for cutting-edge models.
 - Tool hallucination prevention via parameter signature injection
 - Automatic thinking block sanitization for Claude
 - Credential prioritization (paid resets every 5 hours, free weekly)
+- Quota baseline tracking with background refresh (accurate remaining quota estimates)
+- Parallel tool usage instruction injection for Claude
 
 **Environment Variables:**
 ```env
@@ -556,6 +605,8 @@ ANTIGRAVITY_EMAIL="your-email@gmail.com"
 # Feature toggles
 ANTIGRAVITY_ENABLE_SIGNATURE_CACHE=true
 ANTIGRAVITY_GEMINI3_TOOL_FIX=true
+ANTIGRAVITY_QUOTA_REFRESH_INTERVAL=300  # Quota refresh interval (seconds)
+ANTIGRAVITY_PARALLEL_TOOL_INSTRUCTION_CLAUDE=true  # Parallel tool instruction for Claude
 ```
 
 > **Note:** Gemini 3 models require a paid-tier Google Cloud project.
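
To make the shared-quota behaviour of the Antigravity quota groups concrete, here is a minimal sketch of group-level usage accounting. The model names come from the documentation hunk above; the group keys (`claude`, `gemini-3-pro`, `gemini-2.5-flash`) and the `GroupUsage` class are hypothetical and only illustrate the idea, not the rotator library's actual implementation.

```python
from collections import defaultdict
from typing import Dict, List

# Model lists as documented; the group keys are illustrative names.
QUOTA_GROUPS: Dict[str, List[str]] = {
    "claude": ["claude-sonnet-4-5", "claude-opus-4-5", "gpt-oss-120b-medium"],
    "gemini-3-pro": ["gemini-3-pro-high", "gemini-3-pro-low", "gemini-3-pro-preview"],
    "gemini-2.5-flash": ["gemini-2.5-flash", "gemini-2.5-flash-thinking", "gemini-2.5-flash-lite"],
}
MODEL_TO_GROUP = {model: group for group, models in QUOTA_GROUPS.items() for model in models}


class GroupUsage:
    """Hypothetical tracker: requests are counted per group, not per model."""

    def __init__(self) -> None:
        self._used: Dict[str, int] = defaultdict(int)

    def _key(self, model: str) -> str:
        # Ungrouped models fall back to their own name as the key.
        return MODEL_TO_GROUP.get(model, model)

    def record(self, model: str, requests: int = 1) -> None:
        self._used[self._key(model)] += requests

    def used(self, model: str) -> int:
        return self._used[self._key(model)]
```

Under this model, a request to `claude-sonnet-4-5` reduces what is left for `claude-opus-4-5` by the same amount, which is the reason the documentation suggests spending the Claude group's quota on Opus.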
 
@@ -19,3 +19,9 @@ aiohttp
 colorlog
 
 rich
+
+# GUI for model filter configuration
+customtkinter
+
+# For building the executable
+pyinstaller
@@ -3,6 +3,7 @@
 import platform
 import subprocess
 
+
 def get_providers():
     """
     Scans the 'src/rotator_library/providers' directory to find all provider modules.
@@ -24,6 +25,7 @@ def get_providers():
             hidden_imports.append(f"--hidden-import={module_name}")
     return hidden_imports
 
+
 def main():
     """
     Constructs and runs the PyInstaller command to build the executable.
@@ -47,22 +49,27 @@ def main():
         "--collect-data",
         "litellm",
         # Optimization: Exclude unused heavy modules
-        "--exclude-module=tkinter",
         "--exclude-module=matplotlib",
         "--exclude-module=IPython",
         "--exclude-module=jupyter",
         "--exclude-module=notebook",
         "--exclude-module=PIL.ImageTk",
         # Optimization: Enable UPX compression (if available)
-        "--upx-dir=upx" if platform.system() != "Darwin" else "--noupx",  # macOS has issues with UPX
+        "--upx-dir=upx"
+        if platform.system() != "Darwin"
+        else "--noupx",  # macOS has issues with UPX
         # Optimization: Strip debug symbols (smaller binary)
-        "--strip" if platform.system() != "Windows" else "--console",  # Windows gets clean console
+        "--strip"
+        if platform.system() != "Windows"
+        else "--console",  # Windows gets clean console
     ]
 
     # Add hidden imports for providers
     provider_imports = get_providers()
     if not provider_imports:
-        print("Warning: No providers found. The build might not include any LLM providers.")
+        print(
+            "Warning: No providers found. The build might not include any LLM providers."
+        )
     command.extend(provider_imports)
 
     # Add the main script
@@ -80,5 +87,6 @@ def main():
     except FileNotFoundError:
         print("Error: PyInstaller is not installed or not in the system's PATH.")
 
+
 if __name__ == "__main__":
     main()
@@ -1,3 +1,26 @@
+# src/proxy_app/detailed_logger.py
+"""
+Raw I/O Logger for the Proxy Layer.
+
+This logger captures the UNMODIFIED HTTP request and response at the proxy boundary.
+It is disabled by default and should only be enabled for debugging the proxy itself.
+
+Use this when you need to:
+- Verify that requests/responses are not being corrupted
+- Debug HTTP-level issues between the client and proxy
+- Capture exact payloads as received/sent by the proxy
+
+For normal request/response logging with provider correlation, use the
+TransactionLogger in the rotator_library instead (enabled via --enable-request-logging).
+
+Directory structure:
+    logs/raw_io/{YYYYMMDD_HHMMSS}_{request_id}/
+        request.json           # Unmodified incoming HTTP request
+        streaming_chunks.jsonl # If streaming mode
+        final_response.json    # Unmodified outgoing HTTP response
+        metadata.json          # Summary metadata
+"""
+
 import json
 import time
 import uuid
@@ -14,30 +37,36 @@
 from rotator_library.utils.paths import get_logs_dir
 
 
-def _get_detailed_logs_dir() -> Path:
-    """Get the detailed logs directory, creating it if needed."""
+def _get_raw_io_logs_dir() -> Path:
+    """Get the raw I/O logs directory, creating it if needed."""
     logs_dir = get_logs_dir()
-    detailed_dir = logs_dir / "detailed_logs"
-    detailed_dir.mkdir(parents=True, exist_ok=True)
-    return detailed_dir
+    raw_io_dir = logs_dir / "raw_io"
+    raw_io_dir.mkdir(parents=True, exist_ok=True)
+    return raw_io_dir
 
 
-class DetailedLogger:
+class RawIOLogger:
     """
-    Logs comprehensive details of each API transaction to a unique, timestamped directory.
+    Logs raw HTTP request/response at the proxy boundary.
+
+    This captures the EXACT data as received from and sent to the client,
+    without any transformations. Useful for debugging the proxy itself.
+
+    DISABLED by default. Enable with --enable-raw-logging flag.
 
     Uses fire-and-forget logging - if disk writes fail, logs are dropped (not buffered)
     to prevent memory issues, especially with streaming responses.
     """
 
     def __init__(self):
         """
-        Initializes the logger for a single request, creating a unique directory to store all related log files.
+        Initializes the logger for a single request, creating a unique directory
+        to store all related log files.
         """
         self.start_time = time.time()
         self.request_id = str(uuid.uuid4())
         timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
-        self.log_dir = _get_detailed_logs_dir() / f"{timestamp}_{self.request_id}"
+        self.log_dir = _get_raw_io_logs_dir() / f"{timestamp}_{self.request_id}"
         self.streaming = False
         self._dir_available = safe_mkdir(self.log_dir, logging)
 
@@ -59,7 +88,7 @@ def _write_json(self, filename: str, data: Dict[str, Any]):
         )
 
     def log_request(self, headers: Dict[str, Any], body: Dict[str, Any]):
-        """Logs the initial request details."""
+        """Logs the raw incoming request details."""
         self.streaming = body.get("stream", False)
         request_data = {
             "request_id": self.request_id,
@@ -81,7 +110,7 @@ def log_stream_chunk(self, chunk: Dict[str, Any]):
     def log_final_response(
         self, status_code: int, headers: Optional[Dict[str, Any]], body: Dict[str, Any]
     ):
-        """Logs the complete final response, either from a non-streaming call or after reassembling a stream."""
+        """Logs the raw outgoing response."""
         end_time = time.time()
         duration_ms = (end_time - self.start_time) * 1000
 
@@ -149,3 +178,7 @@ def _log_metadata(self, response_data: Dict[str, Any]):
             metadata["reasoning_content"] = reasoning
 
         self._write_json("metadata.json", metadata)
+
+
+# Backward compatibility alias
+DetailedLogger = RawIOLogger
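
The effect of the backward compatibility alias can be sketched with a stand-in class. The `RawIOLogger` below is a stub written for this example, not the real logger, which also creates a directory under `logs/raw_io/`.

```python
class RawIOLogger:
    """Stub standing in for the renamed logger class."""

    def __init__(self) -> None:
        self.streaming = False


# Backward compatibility alias, as added at the end of the diff.
DetailedLogger = RawIOLogger

# Code written against the old name keeps working unchanged.
legacy_logger = DetailedLogger()
```

Because the alias binds a second name to the same class object, `isinstance` checks against either name succeed. The class still reports its `__name__` as `RawIOLogger`, so reprs and tracebacks show the new name even for callers that import the old one.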