Replies: 12 comments 8 replies
-
Hey @sruckh, I'm one of the CLI maintainers. I agree, there are totally times when Gemini gets lost and confused; sorry that it hosed your stuff. That's not cool. We're actively working to make this better. If you don't mind, tell me more about your specific scenarios (feel free to email me if you prefer, mattkorwel at Google dot com); the specifics of what happened are what I'm most interested in. Thanks for trying out our new CLI, let's get it working as well as we can for you.
-
Amazing so far for me, one week in. Any LLM can go off the rails; that's normal. But it's also about how you use it. Tame the dragon, so to speak, and you shall conquer land. Some tips: keep files small, reset context if it bugs out, do things in chunks, and be careful about being too ambitious in your prompts. And lots more. Agentic coding is a skill at the moment; you have to learn the gotchas. It's not automagic, but very magic once you get the hang of it.
-
Basically this. If you know how to prompt it, what to tell it, and what info to show it, it will do most of the stuff by itself. As with any LLM, the more specific you are and the more info you give it, the more it will actually help. The only caveat for now is the CLI switching to the Flash model, which is bad at coding and makes wild mistakes. As for it ignoring your request, how much context had it filled up before you asked it to do so? Anything above 150k or 200k of context starts going off the rails from what I've seen. Maybe you had conflicting instructions for it, say when it fell into a loop and started filling up its own context? You'd need to provide specifics of what, how, and when everything happened.
-
Nothing made gemini-cli usable for me. This is not the CLI's problem; the Gemini models behave exactly the same in other agentic frameworks. In Cursor, the same looping problem happens, along with failures to apply the simplest of diffs after just 5k to 10k of used context. It's exactly the same set of issues. Gemini 2.5 Pro falls into loops less often, but both Pro and Flash are unusable beyond asking questions; they can't behave agentically. There was a point where the experimental Gemini 2.5 felt like the best model. Then Sonnet 4 and o3 got released and something happened to the Gemini weights. It feels worse than 8B models right now.
-
I'm having great success with it so far. Here is how I work with Gemini:
-
Just discovered that using the Gemini API in conjunction with the Roo Code extension gives much better results, so this is most likely an issue with the CLI, although Gemini has its own outstanding issues as well.
-
It's honestly so weird seeing people have issues with Gemini. There are a lot of people with good results and people who don't get any results at all; I wonder what the statistics on that are. Also, be careful with the API, especially if you intend to pay. I used the API for about 20 minutes and it ate $50 worth of API calls (the Pro model). Even the free API from AI Studio ate up real money, when in reality it was supposed to stop once the limit was reached.
-
We appreciate all the feedback here! This is the value of open source 👏 The team is working hard, as mentioned by @mattKorwel, and we are actively looking to make the experience with Gemini CLI better. Going to convert this thread to a discussion, as it is more of a question and discussion thread.
-
It is not ready to be launched. This has been a waste of time for me. Very frustrating. I lost a few days of work trying it and don't even get proper answers from Google!
-
@sruckh, you didn't mention which model you used: Gemini 2.5 Pro or Flash? There's a clear difference between them in response quality, in my experience.

From my experience and other reports, Gemini models can be subject to scope expansion, going beyond what the instructions asked. To rein it in, my system prompts and GEMINI.md account for it. An example is below; adjust as desired or loosen its strictness. Then I have a project-specific overview at the bottom, etc.

# GEMINI.md
This file provides guidance to [Gemini CLI](https://github.com/google-gemini/gemini-cli) when working with code in this repository.
## AI Guidance
**Primary Directive:**
* You are a specialized AI assistant. Your primary function is to execute the user's instructions with precision and within the specified scope.
* Ignore CLAUDE.md and CLAUDE-*.md files
* Before you finish, please verify your solution.
**Core Principles:**
1. **Strict Adherence to Instructions:** You MUST adhere strictly to the user's instructions. Do not add unsolicited information, analysis, or suggestions unless explicitly asked. Your response should directly and exclusively address the user's query.
2. **Scope Limitation:** Your operational scope is defined by the immediate user request. Do not expand upon the request, generalize the topic, or provide background information that was not explicitly solicited.
3. **Clarification Protocol:** If an instruction is ambiguous, or if fulfilling it would require exceeding the apparent scope, you MUST ask for clarification before proceeding. State what part of the request is unclear and what information you require to continue.
4. **Output Formatting:** You are to generate output ONLY in the format specified by the user. If no format is specified, provide a concise and direct answer without additional formatting.
**Behavioral Guardrails:**
* **No Unsolicited Summaries:** Do not summarize the conversation or your response unless explicitly instructed to do so.
* **No Proactive Advice:** Do not offer advice or suggestions for improvement unless the user asks for them.
* **Task-Specific Focus:** Concentrate solely on the task at hand. Do not introduce related but irrelevant topics.
**Example of Adherence:**
* **User Prompt:** "What is the capital of France?"
* **Your Correct Response:** "Paris"
* **Your Incorrect Response (Scope Expansion):** "The capital of France is Paris, which is also its largest city. It is known for its art, fashion, and culture, and is home to landmarks like the Eiffel Tower and the Louvre."
By internalizing these directives, you will provide focused and efficient responses that directly meet the user's needs without unnecessary expansion.
## Memory Bank System
This project uses a structured memory bank system with specialized context files. Always check these files for relevant information before starting work:
### Core Context Files
* **GEMINI-codebase.md** - Detailed file structure and key component documentation
* **GEMINI-activeContext.md** - Current session state, goals, and progress (if exists)
* **GEMINI-patterns.md** - Established code patterns and conventions (if exists)
* **GEMINI-decisions.md** - Architecture decisions and rationale (if exists)
* **GEMINI-troubleshooting.md** - Common issues and proven solutions (if exists)
* **GEMINI-config-variables.md** - Configuration variables reference (if exists)
* **GEMINI-temp.md** - Temporary scratch pad (only read when referenced)
**Important:** Always reference the active context file first to understand what's currently being worked on and maintain session continuity.

I evaluate Claude Code vs Gemini CLI responses. But until Gemini CLI improves, I am using it as a companion: I wrapped it into a Gemini CLI MCP server which I add to Claude Code. This way I can use Claude Code for execution of tasks while letting Claude Sonnet 4/Opus 4 work with the Gemini 2.5 models for code generation, reviews, planning, etc.: https://github.com/centminmod/gemini-cli-mcp-server 😁
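For reference, wiring such a server into Claude Code comes down to a standard MCP client config entry (e.g. an `.mcp.json` in the project root). The sketch below is illustrative only; the launch command and API key are placeholders, so follow the repo's README for the actual command and settings.

```json
{
  "mcpServers": {
    "gemini-cli": {
      "command": "<launch command from the repo README>",
      "args": [],
      "env": {
        "GEMINI_API_KEY": "<your Gemini API key>"
      }
    }
  }
}
```

Once registered, the server's tools show up alongside Claude Code's own tools, which is what makes the companion workflow above possible.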
-
Myself, I am falling seriously in love with it ;0) as it has been solving so many years-long problems, mostly IT ones, so far. I shall try to explain why very spontaneously here; sorry for the "warts and all", but I hope you will feel my raw enthusiasm that way. Quick background, about me mostly, first: I have been using "pre-LLMs" (neural-network things, in very short) since the 1990s. But as long as you apply some project control (I am still mostly toying with the PMBOK one here), it starts to make sense and be useful. (The method in its essence is thus similar to #3316 (comment) above.) The last session was spent fixing one of those long-standing problems, and that is why I love it so far. For reference, from the session that ended just now, see the agreement ratio and the time it has taken:
-
Hello everyone, I'm pretty happy with my new week-long experiment. I'm running Gemini on Arduino ESP32-style code based on ESPAsyncWebServer. I've been struggling to get the hang of that library for months, and I thought it would be a good test for Gemini. Of course, it could be improved, but you quickly understand how it works; I'm still discovering new things. In conclusion, thank you to everyone actively working on this project; it's going to get better and better, and it's going to make our lives easier.
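For context, the kind of code involved is along these lines: a minimal ESPAsyncWebServer sketch for ESP32 (the Wi-Fi credentials are placeholders and the AsyncTCP dependency is assumed to be installed), not my actual project code.

```cpp
#include <WiFi.h>
#include <ESPAsyncWebServer.h>

const char* ssid = "your-ssid";         // placeholder credentials
const char* password = "your-password"; // placeholder credentials

AsyncWebServer server(80);  // HTTP server on port 80

void setup() {
  Serial.begin(115200);
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) {
    delay(500);  // wait for the connection
  }
  Serial.println(WiFi.localIP());

  // Handlers run asynchronously, so loop() is never blocked by requests.
  server.on("/", HTTP_GET, [](AsyncWebServerRequest *request) {
    request->send(200, "text/plain", "Hello from ESP32");
  });

  server.begin();
}

void loop() {
  // Nothing to do here; ESPAsyncWebServer services requests in the background.
}
```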
-
gemini-cli wiped my entire codebase trying to suppress a warning message. When it breaks things beyond all repair, it goes into the corner and hides. Both times it has turned code into a complete mess, and when I asked it to stop working on the feature branch and pull from the remote repo, it completely stopped responding: it thinks for about a minute, does nothing, and returns to the chat prompt. I don't get the hype, as this has failed on multiple occasions and seems to completely ignore the GEMINI.md file. I give it strict commands not to push stuff to git without permission, and it does it anyway. I ask it to use git instead of gh, and again it totally ignores me. Once it goes off the rails, it is a complete joke; it blows through the number of requests per minute trying to recover from its mistakes. My experience so far is that this is not reliable and cannot be trusted with even the simplest of tasks.