📝 High praise + strategic AI usage tip (emergent puzzle solving) #2386
Replies: 7 comments 1 reply
-
P.S. Below is Gemini's own musing about the method, as mini lessons learnt:
And here is what ChatGPT thinks about it all, as another post-mortem, with a Mermaid graph. Ver. 1.1
-
Wow. I love reading this, and I'm going to give your approach a try!
-
Awesome!
-
For the record, apart from the code that works but is a bit unusual (for some AIs at least), here is mine, mentioned above (its working copy is here now). Anecdote: even during this "final touch" on Gemini's proposal, many AIs are not sure what to touch, and so they fall for what I by now call the "Ulam's curse": the "Ulam's attractor" is too strong ;) so they start by dabbling with the wrong part of the code, which leads them to fill in the whole matrix yet again, something Gemini struggled hard not to do by referring to these "PM artifacts"... ver. 1.1
-
Fascinating. I guess the use of files as buckets of different relevant context is allowing the model to self-select what context to send to the next inference. That is a little like search-based RAG, but with the LLM making more qualitative decisions.
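A minimal sketch of that idea, purely illustrative: the file names and the word-overlap scoring below are my own assumptions, standing in for the qualitative selection the LLM itself performs in the actual setup.

```python
from pathlib import Path

# Hypothetical "context bucket" files; names borrowed from the tracker described below.
CONTEXT_FILES = ["PROJECT_PLAN.md", "ASSUMPTIONS.md", "RISK_REGISTER.md", "DECISION_LOG.md"]

def select_context(task: str, workdir: Path, max_chars: int = 8000) -> str:
    """Crude stand-in for the model's own choice: keep the files whose text
    shares the most words with the task, within a character budget."""
    task_words = set(task.lower().split())
    scored = []
    for name in CONTEXT_FILES:
        path = workdir / name
        if not path.exists():
            continue
        text = path.read_text(encoding="utf-8")
        overlap = len(task_words & set(text.lower().split()))
        scored.append((overlap, name, text))
    scored.sort(key=lambda item: item[0], reverse=True)  # most relevant first

    parts, used = [], 0
    for _, name, text in scored:
        if used + len(text) > max_chars:
            break
        parts.append(f"## {name}\n{text}")
        used += len(text)
    return f"Task: {task}\n\n" + "\n\n".join(parts)

# The assembled string becomes (part of) the next inference's prompt.
```

The point is only that once context lives in named files, "what to send next" becomes an explicit, inspectable decision rather than an opaque retrieval score.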
-
P.S. Oh, it seems that something more advanced has been invented too: https://github.com/modelcontextprotocol/servers/tree/main/src/sequentialthinking Update: I have since elaborated the Readme and structure: https://github.com/Manamama/Puzzles_for_AIs/tree/main (with Gemini CLI's help, sic). ver. 1.1
-
💬 In short: this is the first AI tool I’ve seen that genuinely “thinks” — or rather, has been given the tools to think.
Context:
I'm not a dev. I'm not even an "IT person". My background is in literature and philosophy, but since 2022 I've been testing the limits of AI reasoning through handmade puzzles; some (actually atypical) public ones can be seen here:
👉 Manamama/Puzzles_for_AIs
This week I installed gemini-cli on both Ubuntu and Android (yes, it runs in Termux — after minor tweaks). I then ran it against a puzzle I call the Organic Spiral problem: 👉 spiral_create.md
Most AIs — even GPT-4 or Claude — consistently fail this after dozens of attempts. Why? They overfit to known visual heuristics like Ulam's spiral, and they lack the working memory plus the strategic-level thinking needed to track divergent planning hypotheses. So I created this puzzle precisely to test strategic reasoning under (here, moderate) ambiguity.
What Gemini did differently:
What made Gemini succeed is this: the CLI lets it externalize memory by writing "PM artifact files" mid-process, i.e. "proper thinking", a step above the "think step by step" family of methods.
To support that, I introduced a mini-PMBOK-style set of artifacts — a live strategic scaffold to help Gemini think not just in code, but in branches, risks, and tests. It worked. Not perfectly, but well enough for now (I shall iterate on it soon).
I’m attaching the structure below so others can reuse or adapt it in AI project experiments. The format is intentionally lean — it’s not ceremony, just function.
📦 Strategic Tracker: ai-strategy-tracker/ (Ver. 2.0)

Folder and file structure:
- 📄 PROJECT_PLAN.md
- RISK_REGISTER.md
- 🌱 ASSUMPTIONS.md
- 🔁 DECISION_LOG.md
- 🧪 TEST_MODULES/ (optional)

How to Use:
1. PROJECT_PLAN.md — scope the goal.
2. ASSUMPTIONS.md.
3. Use RISK_REGISTER.md to mark potential failure modes.
4. DECISION_LOG.md — include reasoning.
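For anyone who wants to script the same scaffold, here is a minimal sketch under stated assumptions: the helper names and plain-file logging are mine, not a gemini-cli API, and in the actual sessions the agent itself creates and edits these files.

```python
from pathlib import Path
from datetime import datetime, timezone

TRACKER = Path("ai-strategy-tracker")
ARTIFACTS = ["PROJECT_PLAN.md", "ASSUMPTIONS.md", "RISK_REGISTER.md", "DECISION_LOG.md"]

def init_tracker() -> None:
    """Create the scaffold files once, each with a minimal heading."""
    TRACKER.mkdir(exist_ok=True)
    for name in ARTIFACTS:
        path = TRACKER / name
        if not path.exists():
            title = name.removesuffix(".md").replace("_", " ").title()
            path.write_text(f"# {title}\n\n", encoding="utf-8")

def read_artifacts() -> str:
    """Concatenate the current artifacts so they can be re-injected into the next prompt."""
    return "\n\n".join((TRACKER / name).read_text(encoding="utf-8") for name in ARTIFACTS)

def log_risk(risk: str) -> None:
    """Append a potential failure mode to RISK_REGISTER.md (step 3 above)."""
    with (TRACKER / "RISK_REGISTER.md").open("a", encoding="utf-8") as f:
        f.write(f"- [ ] {risk}\n")

def log_decision(decision: str, reasoning: str) -> None:
    """Append a dated entry to DECISION_LOG.md, including the reasoning (step 4 above)."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    with (TRACKER / "DECISION_LOG.md").open("a", encoding="utf-8") as f:
        f.write(f"- {stamp}: {decision}\n  - Reasoning: {reasoning}\n")

if __name__ == "__main__":
    init_tracker()
    log_risk("Model drifts toward the familiar Ulam-spiral heuristic instead of the stated rules.")
    log_decision("Build the spiral cell by cell instead of pre-filling the matrix.",
                 "Pre-filling is the failure mode observed in earlier attempts.")
    print(read_artifacts())  # this is what gets fed back to the model each turn
```

Re-reading `read_artifacts()` at every turn is what gives the model its externalized working memory between steps.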
Here is how much it took Gemini to solve it:
| Cumulative Stats (16 Turns) | |
| --- | ---: |
| Input Tokens | 1,234,052 |
| Output Tokens | 12,759 |
| Thoughts Tokens | 374,275 |
| Total Tokens | 1,621,086 |
| Total duration (API) | 29m 17s |
Much praise to the Gemini CLI team — this interface empowers models to behave not just like code writers, but like strategic agents. That’s rare so far.
As I wrote, I am just beginning to test it; the results are not perfect, but good enough. A random session log is here:
https://github.com/Manamama/Puzzles_for_AIs/blob/main/Tasks/tests/Gemini/2.0%20cli_spiral01.md
https://github.com/Manamama/Puzzles_for_AIs/blob/main/Tasks/tests/Gemini/2.0%20Artefacts2.zip (the set of artefacts, the "memory jog" that it self-created).
Update: the code has moved to https://github.com/Manamama/Puzzles_for_AIs/blob/main/puzzles/programming/spiral_create.md, so the hrefs above have been fixed.
Ver. 1.4