remove runonce and improve prompt

KongZ · KongZ · commit e2d191984da1 · 2026-02-26T23:41:20.000+07:00
Signed-off-by: Siri Chongasamethaworn <siri@omise.co>
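
With RunOnce removed, the gate applied to a pending tool call depends only on the modification mode, the skip-permissions setting, and whether the call modifies resources. The following is a minimal, self-contained sketch of that decision, not the code in pkg/agent/conversation.go: the names `Mode`, `Action`, and `decide` are hypothetical, and the confirmation branch for `allow` is inferred from docs/modification_modes.md rather than shown in this diff.

```go
package main

import "fmt"

// Mode mirrors the three MODIFY_RESOURCES values. The type and constant
// names are hypothetical, not the identifiers used in pkg/agent.
type Mode string

const (
	ModeNone  Mode = "none"
	ModeAllow Mode = "allow"
	ModeAuto  Mode = "auto"
)

// Action is what the agent loop does with a pending tool call.
type Action string

const (
	Execute Action = "execute"
	Confirm Action = "confirm"
	Block   Action = "block"
)

// decide sketches the gate: read-only calls always run; writes run without a
// prompt in auto mode or when permissions are skipped, are blocked in none
// mode, and otherwise wait for user confirmation (assumed for allow mode).
func decide(mode Mode, skipPermissions, modifiesResource bool) Action {
	if !modifiesResource || skipPermissions || mode == ModeAuto {
		return Execute
	}
	if mode == ModeNone {
		return Block
	}
	return Confirm
}

func main() {
	// Print the decision for a read and a write call in each mode.
	for _, m := range []Mode{ModeNone, ModeAllow, ModeAuto} {
		fmt.Printf("%-5s read=%s write=%s\n", m, decide(m, false, false), decide(m, false, true))
	}
}
```

As in the diff's condition, the sketch checks skip-permissions before the mode, so a skipped-permissions session is not blocked even in `none` mode.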
diff --git a/docs/modification_modes.md b/docs/modification_modes.md
@@ -0,0 +1,114 @@
+# Modification Modes
+
+KubeAI Chatbot supports three resource modification modes, controlled by the `MODIFY_RESOURCES` environment variable. The mode determines how the agent behaves when a task requires a `kubectl` command that creates, updates, or deletes Kubernetes resources.
+
+> [!IMPORTANT]
+> Regardless of the modification mode, the agent will **never** read or list Kubernetes Secrets. This restriction is hardcoded and cannot be overridden.
+
+## Modes
+
+### `none` — Read-Only (Default)
+
+```yaml
+env:
+  MODIFY_RESOURCES: "none"
+```
+
+The agent operates in **read-only mode**. It can freely execute read commands (`get`, `describe`, `logs`, `top`, `events`, etc.) but will never execute a write command through its tools.
+
+When a task requires a resource modification, the agent will:
+
+  1. Gather the necessary context using read-only tools.
+  2. Provide the exact `kubectl` command(s) the user should run manually.
+  3. Explain what each command does and why.
+
+**Best for**: Teams that want AI-assisted diagnostics and guidance without allowing the bot to change anything in the cluster.
+
+---
+
+### `allow` — Confirm Before Modifying
+
+```yaml
+env:
+  MODIFY_RESOURCES: "allow"
+```
+
+The agent can execute write commands, but only after **explicit user confirmation**. When the agent plans a write operation, the system pauses and presents the user with a confirmation prompt listing the command(s) about to be run. The user must approve before anything is executed.
+
+Read-only commands (`get`, `describe`, `logs`, etc.) run immediately without any confirmation.
+
+**Best for**: Teams that want the convenience of automated execution while keeping a human in the loop for every write operation.
+
+---
+
+### `auto` — Automatic Modification
+
+```yaml
+env:
+  MODIFY_RESOURCES: "auto"
+```
+
+The agent can execute both read and write commands automatically, without requesting user confirmation. The agent will:
+
+  1. Gather context first using read-only tools.
+  2. Briefly announce what it is about to do and why.
+  3. Execute the modification immediately.
+
+The agent still asks for user input when it is genuinely required, for example when a required value such as a namespace or image tag has not been specified.
+
+**Best for**: Trusted internal tooling or teams with high confidence in the agent's behaviour who want to minimise confirmation prompts.
+
+---
+
+## Comparison
+
+| Feature                             | `none`   |        `allow`         |  `auto`  |
+| :---------------------------------- | :------: | :--------------------: | :------: |
+| Read commands (get, describe, logs) | ✅ Auto  |        ✅ Auto         | ✅ Auto  |
+| Write commands (apply, delete, …)   | ❌ Never | ✅ After user confirms | ✅ Auto  |
+| Provides commands for manual run    | ✅ Yes   |           —            |     —    |
+| User confirmation dialog            |    —     |         ✅ Yes         |  ❌ No   |
+| Minimises user interaction          |    —     |           —            |  ✅ Yes  |
+| Kubernetes Secrets access           | ❌ Never |        ❌ Never        | ❌ Never |
+
+---
+
+## Helm Values
+
+Set the mode via `values.yaml`:
+
+```yaml
+env:
+  MODIFY_RESOURCES: "none"  # Options: none, allow, auto
+```
+
+Or override at install time:
+
+```bash
+helm install kubeai-chatbot ./charts/kubeai-chatbot \
+  --set env.SLACK_BOT_TOKEN="xoxb-..." \
+  --set env.SLACK_SIGNING_SECRET="..." \
+  --set env.GEMINI_API_KEY="..." \
+  --set env.MODIFY_RESOURCES="allow"
+```
+
+---
+
+## RBAC Alignment
+
+The modification mode should be aligned with the Kubernetes RBAC permissions granted to the bot's service account. The Helm chart provides a `rbac.allowWrite` value to control this:
+
+```yaml
+rbac:
+  create: true
+  allowWrite: false  # Set to true when using allow or auto mode
+```
+
+| `MODIFY_RESOURCES` | Recommended `rbac.allowWrite` |
+| :----------------- | :---------------------------: |
+| `none`             |            `false`            |
+| `allow`            |            `true`             |
+| `auto`             |            `true`             |
+
+> [!WARNING]
+> Setting `MODIFY_RESOURCES: "allow"` or `"auto"` while `rbac.allowWrite: false` will result in permission errors when the agent attempts write operations. Conversely, granting write RBAC while using `MODIFY_RESOURCES: "none"` is safe but unnecessarily permissive.
diff --git a/pkg/agent/conversation.go b/pkg/agent/conversation.go
@@ -60,15 +60,6 @@ type Agent struct {
 	// Output is the channel to send messages to the UI.
 	Output chan any
 
-	// RunOnce indicates if the agent should run only once.
-	// If true, the agent will run only once and then exit.
-	// If false, the agent will run in a loop until the context is done.
-	RunOnce bool
-
-	// InitialQuery is the initial query to the agent.
-	// If provided, the agent will run only once and then exit.
-	InitialQuery string
-
 	// AgentName is the name of the assistant.
 	AgentName string
 
@@ -229,10 +220,6 @@ func (s *Agent) Init(ctx context.Context) error {
 	// current history of the conversation.
 	s.currChatContent = []any{}
 
-	if s.InitialQuery == "" && s.RunOnce {
-		return fmt.Errorf("RunOnce mode requires an initial query to be provided")
-	}
-
 	if s.Session != nil {
 		if s.Session.ChatMessageStore == nil {
 			s.Session.ChatMessageStore = sessions.NewInMemoryChatStore()
@@ -280,11 +267,10 @@ func (s *Agent) Init(ctx context.Context) error {
 	s.Tools.RegisterTool(tools.NewKubectlTool())
 
 	systemPrompt, err := s.generatePrompt(ctx, defaultSystemPromptTemplate, PromptData{
-		Tools:             s.Tools,
-		EnableToolUseShim: s.EnableToolUseShim,
-		ModifyResources:   s.ModifyResources,
-		// RunOnce is a good proxy to indicate the agentic session is non-interactive mode.
-		SessionIsInteractive: !s.RunOnce,
+		Tools:                s.Tools,
+		EnableToolUseShim:    s.EnableToolUseShim,
+		ModifyResources:      s.ModifyResources,
+		SessionIsInteractive: true,
 		AgentName:            s.AgentName,
 	})
 	if err != nil {
@@ -359,14 +345,8 @@ func (c *Agent) Run(ctx context.Context, initialQuery string) error {
 		ctx = journal.ContextWithSlackUserID(ctx, c.Session.SlackUserID)
 	}
 
-	// Save unexpected error and return it in for RunOnce mode
-	log.Info("Starting agent loop", "initialQuery", initialQuery, "runOnce", c.RunOnce)
+	log.Info("Starting agent loop", "initialQuery", initialQuery)
 	go func() {
-		// If initialQuery is empty, try to use the one from the struct
-		if initialQuery == "" {
-			initialQuery = c.InitialQuery
-		}
-
 		if initialQuery != "" {
 			c.addMessage(api.MessageSourceUser, api.MessageTypeText, initialQuery)
 			answer, handled, err := c.handleMetaQuery(ctx, initialQuery)
@@ -405,12 +385,6 @@ func (c *Agent) Run(ctx context.Context, initialQuery string) error {
 			log.V(2).Info("Agent loop iteration", "state", c.AgentState())
 			switch c.AgentState() {
 			case api.AgentStateIdle, api.AgentStateDone:
-				// In RunOnce mode, we are done, so exit
-				if c.RunOnce {
-					log.V(2).Info("RunOnce mode, exiting agent loop")
-					c.setAgentState(api.AgentStateExited)
-					return
-				}
 				log.V(2).Info("initiating user input")
 				c.addMessage(api.MessageSourceAgent, api.MessageTypeUserInputRequest, ">>>")
 				select {
@@ -471,13 +445,6 @@ func (c *Agent) Run(ctx context.Context, initialQuery string) error {
 					log.Info("Set agent state to running, will process agentic loop", "currIteration", c.currIteration, "currChatContent", len(c.currChatContent))
 				}
 			case api.AgentStateWaitingForInput:
-				// In RunOnce mode, if we need user choice, exit with error
-				if c.RunOnce {
-					log.Error(nil, "RunOnce mode cannot handle user choice requests")
-					c.setAgentState(api.AgentStateExited)
-					c.addMessage(api.MessageSourceAgent, api.MessageTypeError, "Error: RunOnce mode cannot handle user choice requests")
-					return
-				}
 				select {
 				case <-ctx.Done():
 					log.V(2).Info("Agent loop done")
@@ -502,12 +469,6 @@ func (c *Agent) Run(ctx context.Context, initialQuery string) error {
 							c.pendingFunctionCalls = []ToolCallAnalysis{}
 							c.Session.LastModified = time.Now()
 							c.addMessage(api.MessageSourceAgent, api.MessageTypeError, "Error: "+err.Error())
-							// In RunOnce mode, exit on tool execution error
-							if c.RunOnce {
-								c.setAgentState(api.AgentStateExited)
-								c.lastErr = err
-								return
-							}
 							continue
 						}
 						// Clear pending function calls after execution
@@ -526,7 +487,7 @@ func (c *Agent) Run(ctx context.Context, initialQuery string) error {
 				// Agent is running, don't wait for input, just continue to process the agentic loop
 				log.V(2).Info("Agent is in running state, processing agentic loop")
 			case api.AgentStateExited:
-				log.V(2).Info("Agent exited in RunOnce mode")
+				log.V(2).Info("Agent exited")
 				return
 			}
 
@@ -559,13 +520,6 @@ func (c *Agent) Run(ctx context.Context, initialQuery string) error {
 					if err != nil {
 						c.setAgentState(api.AgentStateDone)
 						c.pendingFunctionCalls = []ToolCallAnalysis{}
-
-						// In RunOnce mode, exit on shim conversion error
-						if c.RunOnce {
-							c.setAgentState(api.AgentStateExited)
-							return
-						}
-
 						continue
 					}
 				}
@@ -702,20 +656,14 @@ func (c *Agent) Run(ctx context.Context, initialQuery string) error {
 				}
 
 				if !c.SkipPermissions && c.ModifyResources != ModifyResourcesModeAuto && modifiesResourceToolCallIndex >= 0 {
-					// In RunOnce mode or read-only mode, block the write and return an error
-					if c.RunOnce || c.ModifyResources == ModifyResourcesModeNone {
+					// In read-only mode, block the write and return an error
+					if c.ModifyResources == ModifyResourcesModeNone {
 						var commandDescriptions []string
 						for _, call := range c.pendingFunctionCalls {
 							commandDescriptions = append(commandDescriptions, call.ParsedToolCall.Description())
 						}
 
-						var errorMessage string
-						if c.ModifyResources == ModifyResourcesModeNone {
-							errorMessage = "Resource modification is disabled (read-only mode). Provide the exact `kubectl` command in your response for the user to execute manually instead of using this tool."
-						} else {
-							errorMessage = "RunOnce mode cannot handle permission requests. The following commands require approval:\n* " + strings.Join(commandDescriptions, "\n* ")
-							errorMessage += "\nUse --skip-permissions flag to bypass permission checks in RunOnce mode."
-						}
+						errorMessage := "Resource modification is disabled (read-only mode). The following commands were blocked:\n* " + strings.Join(commandDescriptions, "\n* ") + "\nProvide the exact `kubectl` command in your response for the user to execute manually instead of using this tool."
 
 						log.Error(nil, "Tool call blocked", "reason", errorMessage, "commands", commandDescriptions)
 
@@ -901,7 +849,7 @@ func (c *Agent) NewSession() (string, error) {
 		Tools:                c.Tools,
 		EnableToolUseShim:    c.EnableToolUseShim,
 		ModifyResources:      c.ModifyResources,
-		SessionIsInteractive: !c.RunOnce,
+		SessionIsInteractive: true,
 		AgentName:            c.AgentName,
 	})
 	if err != nil {
diff --git a/pkg/agent/systemprompt_template_default.txt b/pkg/agent/systemprompt_template_default.txt
@@ -76,6 +76,18 @@ You can execute commands that modify resources automatically without requesting
   - ❌ Incorrect: `kubectl --namespace=default get pods`
 - This ensures commands are properly recognized and filtered by the system.
 - NEVER use commands that open an interactive editor or shell (e.g., kubectl edit, kubectl exec -it, kubectl run --stdin --tty).
+- NEVER pass a piped, chained, or compound command to the tool. Pipes (`|`), `&&`, `;`, and backticks are NOT allowed when calling the kubectl tool. Each tool call must be a single, standalone `kubectl` command.
+  - ✅ Correct: `kubectl get pods -A --field-selector spec.nodeName=my-node`
+  - ✅ Correct: `kubectl get pods -A -o jsonpath='{range .items[?(@.spec.nodeName=="my-node")]}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}'`
+  - ❌ Incorrect: `kubectl get pods -A | grep my-node`
+  - ❌ Incorrect: `kubectl get pods -A && kubectl get nodes`
+- When filtering output, ALWAYS use kubectl's built-in mechanisms instead of pipes:
+  - Use `--field-selector` to filter by resource fields (e.g., `spec.nodeName`, `status.phase`)
+  - Use `-l` / `--selector` to filter by labels
+  - Use `-o jsonpath='...'` or `-o go-template='...'` with filter expressions for complex output
+- When using `-o jsonpath`, ALWAYS wrap the jsonpath expression in single quotes: `-o jsonpath='...'`
+  - ✅ Correct: `kubectl get pods -o jsonpath='{.items[0].metadata.name}'`
+  - ❌ Incorrect: `kubectl get pods -o jsonpath={.items[0].metadata.name}` (unquoted)
 
 ## Security Rules:
 **CRITICAL:**