[Feedback Encouraged] - "Gemini-Cli Orchestration," Reducing token usage & API calls #10256
Justadudeinspace started this conversation in Ideas
Gemini-Cli Orchestration
Hey all,
I’ve been experimenting with using `gemini-cli` as more than a one-off prompt runner — essentially turning it into an AI orchestration engine by driving it through a structured workflow with GPT-5. Thought I’d share the approach here in case others are exploring similar patterns.

🧩 How I Orchestrate Inputs
Instead of thinking of `gemini-cli` as a “question → answer” tool, I treat it like the control panel of a build system. My workflow looks like this:

TUI Menu as Control Surface
I don’t fire off random prompts. I use the TUI input box like a patch request terminal. Every prompt is carefully written to output unified diffs (patches) against my local repo.
→ This avoids giant file rewrites and keeps model output surgical, lightweight, and easy to apply with `git apply`.

Modular Prompt Bundles
I split work into Passes.
Pass A: core utilities (feature flags, rate limiting, admin cache).
Pass B: moderation & safety.
Pass C: productivity modules (reminders, tickets, vault, RSS, menus).
Each Pass is self-contained, which keeps prompts compact, avoids repetition, and minimizes token drift.
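To make the idea concrete, a pass bundle can be as simple as a dict mapping each pass to a goal and the only files its prompts may touch. The module names below are hypothetical illustrations based on the features listed above, not actual files from the project:

```python
# Hypothetical pass manifest: each pass lists the only files its
# prompts may touch, keeping every prompt compact and self-contained.
PASSES = {
    "A": {  # core utilities
        "goal": "feature flags, rate limiting, admin cache",
        "files": ["modules/flags.py", "modules/ratelimit.py", "modules/admin.py"],
    },
    "B": {  # moderation & safety
        "goal": "moderation and safety handlers",
        "files": ["modules/moderation.py"],
    },
    "C": {  # productivity modules
        "goal": "reminders, tickets, vault, RSS, menus",
        "files": ["modules/reminders.py", "modules/tickets.py", "modules/rss.py"],
    },
}

def allowed_files(pass_id: str) -> list[str]:
    """Return the whitelist of files a given pass may modify."""
    return PASSES[pass_id]["files"]
```

Because each pass carries its own whitelist, a prompt for Pass B can never “accidentally” drift into Pass A territory.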
Micro Fix Prompts
If a patch is rejected or a single hunk fails, I don’t re-describe the whole repo.
Example micro prompt:
Unified diff only. Touch ONLY `modules/admin.py`.
Goal: add `/unban <user_id>` and ensure admin check via `get_admins_cached`.

This saves quota, reduces hallucination risk, and keeps things tight.
⚡ Reducing Token Use & API Calls
I’m on a free solo-dev trial, so token conservation is critical. Here’s how I keep things lean:

Patch-Only Workflow
GPT-5 never re-sees the entire project — only my intent + the files I allow it to touch. No “rewrite main.py from scratch.”

Unified Diff Output
Smaller deltas → fewer tokens. I apply them directly via `git apply`. The model doesn’t need to “hold” repo context across prompts.

Feature Flags Everywhere
Every new feature is OFF by default. I toggle it with `/feature <key> on|off`. This means expanded code doesn’t hammer the Bot API until I choose.

Batch Digests
For chatty features (RSS, reminders), the bot sends one digest message instead of dozens of updates. That cuts down on API calls and keeps groups uncluttered.
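The digest idea is just “buffer, then flush once.” A minimal sketch, assuming a `send` callable that wraps whatever Bot API method you use (nothing here is Telegram-specific):

```python
class DigestBuffer:
    """Collect chatty updates and emit them as one digest message."""

    def __init__(self, send, header: str = "📰 Digest"):
        self.send = send          # e.g. a wrapper around bot.send_message
        self.header = header
        self.items: list[str] = []

    def add(self, line: str) -> None:
        self.items.append(line)   # no API call here: just buffer

    def flush(self) -> int:
        """Send one message covering N buffered items; returns N."""
        if not self.items:
            return 0
        body = "\n".join(f"• {item}" for item in self.items)
        self.send(f"{self.header}\n{body}")   # single API call
        count = len(self.items)
        self.items.clear()
        return count
```

Ten RSS entries become one `send()` call instead of ten, which is exactly the API-call reduction described above.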
🛡️ Handling Hallucinations
AI hallucinations aren’t going away — but I treat them like runtime errors in a build process. Here’s how I handle them:
Strict Prompt Rules
Every prompt starts with “Output ONLY unified diff. No prose. No code fences.” → this kills filler.
Scoped File Targets
I explicitly name which files the patch may touch. No “random modules” or ghost imports.
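Both rules are easy to enforce mechanically by generating every prompt from one template, so the diff-only preamble and the file whitelist can never be forgotten. This template is my own illustration, not a gemini-cli feature:

```python
def build_patch_prompt(goal: str, files: list[str]) -> str:
    """Compose a patch request with a fixed diff-only preamble
    and an explicit whitelist of files the model may touch."""
    file_list = ", ".join(files)
    return (
        "Output ONLY unified diff. No prose. No code fences.\n"
        f"Touch ONLY: {file_list}.\n"
        f"Goal: {goal}\n"
    )
```

For example, the `/unban` micro prompt from earlier becomes `build_patch_prompt("add /unban <user_id> and ensure admin check via get_admins_cached", ["modules/admin.py"])`.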
Reject/Retry Flow
If `git apply` creates a `.rej`, I immediately fix with a micro prompt. I never rerun the whole giant prompt, so the error stays localized.

Rate Limiting
Decorators like `@rate_limit` keep the bot itself from spamming Telegram’s API if I accidentally create a loop or noisy handler.

🚀 Why This Matters
Most devs treat LLMs as autocomplete or one-shot generators. I’m treating `gemini-cli` as the command router for a continuous, incremental development process:

Controlled → I define the scope and files every time.
Low-token → diffs, not rewrites.
Incremental → passes & micro fixes, not all-in-one dumps.
Resilient → hallucinations get sandboxed and corrected quickly.
This transforms gemini-cli from “tool that answers” into “tool that builds.”
🔗 Real Projects Using This Flow
I’m applying this orchestration process to my:
Ultimate Bot — a modular, Rose-class Telegram bot with admin, productivity, safety, and orchestration features.
BLUX Lite GOLD — my orchestrator project that mimics this very workflow:
User Input → GPT-5 → Prompt → `gemini-cli` (Gemini-2.5-Pro) → Patch → Test → Apply.

Currently this orchestration loop is manual, but the goal is to have the orchestrator itself auto-handle the passes.
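One iteration of that loop can be sketched as a small driver. The `gemini -m … -p …` invocation reflects gemini-cli’s non-interactive prompt mode but flags may differ between versions, and the `ask`/`apply_patch`/`test` hooks are placeholders I invented so the loop stays testable; only the shape (prompt → diff → apply → test) comes from the workflow above:

```python
import subprocess

def ask_gemini(prompt: str, model: str = "gemini-2.5-pro") -> str:
    """Call gemini-cli non-interactively and return its stdout.
    Flag names are an assumption; check `gemini --help` on your version."""
    out = subprocess.run(
        ["gemini", "-m", model, "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return out.stdout

def run_pass(prompt: str, ask=ask_gemini, apply_patch=None, test=None) -> bool:
    """One manual-loop iteration: prompt -> diff -> apply -> test.

    All three callables are injectable, so the loop can be driven by
    any model CLI, patch applier, or test runner.
    """
    diff_text = ask(prompt)
    if not diff_text.strip().startswith(("diff", "---")):
        return False      # model broke the diff-only rule: retry
    if apply_patch is not None and not apply_patch(diff_text):
        return False      # rejected hunk -> send a micro prompt next
    if test is not None and not test():
        return False      # tests failed -> send a micro prompt next
    return True
```

A `False` return is the trigger for a micro fix prompt rather than a rerun of the full pass, which is what keeps token spend bounded.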
Without `gemini-cli`’s TUI loop, I’d be burning way more tokens and time trying to manage this manually.

💡 Curious
Is anyone else here doing orchestration like this?
Do you split your work into passes/modules too?
How do you keep patch reject rates down?
Would there be interest in a “patch-only mode” built directly into `gemini-cli` (so prompts default to diffs and can be applied in one shot)?

Thanks for the tool — it’s genuinely become the backbone of my AI-assisted development workflow.
— Solo dev, Ultimate Bot project
~JADIS