docs/Codex/03_agent_loop.md
8 additions & 8 deletions
@@ -35,9 +35,9 @@ The Agent Loop is responsible for:
     * A request to perform an action (a "function call"): "I need to run this command: `python -c 'print(\"hello world\")'`"
 4. Showing you the text part of the response in the [Terminal UI](01_terminal_ui__ink_components_.md).
 5. Handling the "function call":
-    * Checking if it needs your permission based on the [Approval Policy](04_approval_policy___security_.md).
+    * Checking if it needs your permission based on the [Approval Policy](04_approval_policy___security.md).
     * If needed, asking you "Allow command?" via the UI.
-    * If approved, actually running the command using the [Command Execution & Sandboxing](06_command_execution___sandboxing_.md) system.
+    * If approved, actually running the command using the [Command Execution & Sandboxing](06_command_execution___sandboxing.md) system.
 6. Getting the result of the command (the output "hello world").
 7. Sending that result back to the AI ("I ran the command, and it printed 'hello world'").
 8. Getting the AI's final response (maybe: "Great, the script ran successfully!").
@@ -76,11 +76,11 @@ graph TD

 1. **Input:** Gets input from you (via the [Input Handling](02_input_handling__textbuffer_editor_.md)).
 2. **AI Call:** Sends the current conversation state (including your latest input and any previous steps) to the AI model (OpenAI API).
-3. **Response Processing:** Receives the AI's response. This could be simple text, or it could include a request to use a tool (like running a shell command). This is covered more in [Response & Tool Call Handling](05_response___tool_call_handling_.md).
+3. **Response Processing:** Receives the AI's response. This could be simple text, or it could include a request to use a tool (like running a shell command). This is covered more in [Response & Tool Call Handling](05_response___tool_call_handling.md).
 4. **Tool Handling:** If the AI requested a tool:
-    * Check the [Approval Policy](04_approval_policy___security_.md).
+    * Check the [Approval Policy](04_approval_policy___security.md).
     * Potentially ask you for confirmation via the [Terminal UI](01_terminal_ui__ink_components_.md).
-    * If approved, execute the tool via [Command Execution & Sandboxing](06_command_execution___sandboxing_.md).
+    * If approved, execute the tool via [Command Execution & Sandboxing](06_command_execution___sandboxing.md).
     * Package the tool's result (e.g., command output) to send back to the AI in the next step.
 5. **Update State:** Adds the AI's message and any tool results to the conversation history. Shows updates in the UI.
 6. **Loop:** If the task isn't finished (e.g., because a tool was used and the AI needs to react to the result), it sends the updated conversation back to the AI (Step 2). If the task *is* finished, it waits for your next input.
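For readers of this hunk, the six steps above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code in `AgentLoop`: the `Item` shapes, `runTurn`, and the injected `model` and `runTool` functions are hypothetical names chosen for the example, and a real implementation would presumably call the API asynchronously.

```typescript
// Hypothetical item shapes for illustration; the real Codex types may differ.
type Item =
  | { type: "user"; text: string }
  | { type: "message"; text: string }
  | { type: "function_call"; callId: string; name: string; arguments: string }
  | { type: "function_call_output"; callId: string; output: string };

// Stand-ins for the AI call and for tool handling (approval + execution).
type Model = (history: readonly Item[]) => Item[];
type ToolRunner = (name: string, args: string) => string;

function runTurn(userText: string, model: Model, runTool: ToolRunner): string[] {
  const history: Item[] = [];
  const shown: string[] = [];
  // Step 1: the user's input starts the turn.
  let turnInput: Item[] = [{ type: "user", text: userText }];

  // Step 6: keep looping while there is something new to send.
  while (turnInput.length > 0) {
    history.push(...turnInput);
    // Step 2: send the whole conversation state to the model.
    const response = model(history);
    turnInput = [];

    for (const item of response) {
      // Step 5: record everything the model said or requested.
      history.push(item);
      if (item.type === "message") {
        // Step 3: plain text is collected for display in the UI.
        shown.push(item.text);
      } else if (item.type === "function_call") {
        // Step 4: run the tool and queue its result for the next iteration.
        const output = runTool(item.name, item.arguments);
        turnInput.push({ type: "function_call_output", callId: item.callId, output });
      }
    }
  }
  return shown;
}
```

Injecting `model` and `runTool` keeps the sketch testable without network access; the loop ends as soon as a response contains no tool request, which matches step 6.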
@@ -320,7 +320,7 @@ export class AgentLoop {
     * If a tool call is found, it calls `handleFunctionCall`.
 * **`handleFunctionCall()`:**
     * Parses the details of the tool request (e.g., the command arguments).
-    * Uses `handleExecCommand` (which contains logic related to [Approval Policy](04_approval_policy___security_.md) and [Command Execution](06_command_execution___sandboxing_.md)) to potentially run the command, using the `getCommandConfirmation` callback if needed.
+    * Uses `handleExecCommand` (which contains logic related to [Approval Policy](04_approval_policy___security.md) and [Command Execution](06_command_execution___sandboxing.md)) to potentially run the command, using the `getCommandConfirmation` callback if needed.
     * Formats the result of the tool execution (e.g., command output) into a specific `function_call_output` message.
     * Returns this output message. The `run` method adds this to `turnInput`, so the *next* iteration of the `while` loop will send this result back to the AI, letting it know what happened.
 * **Finally:** Once the `while` loop finishes (meaning the AI didn't request any more tools in its last response), it signals loading is done (`onLoading(false)`).
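The `handleFunctionCall()` bullets above describe a parse, approve, execute, format sequence. A minimal sketch of that sequence, under stated assumptions, might look like this: the `FunctionCall` and `FunctionCallOutput` shapes, the `Confirm` and `Exec` stand-ins (for the `getCommandConfirmation` callback and for command execution), and the error strings are all hypothetical, and the real function is presumably asynchronous.

```typescript
// Hypothetical shapes for illustration; the real Codex types may differ.
interface FunctionCall { callId: string; name: string; arguments: string; }
interface FunctionCallOutput { type: "function_call_output"; callId: string; output: string; }

// Stand-ins for the confirmation callback and for command execution.
type Confirm = (command: string[]) => boolean;
type Exec = (command: string[]) => { stdout: string; exitCode: number };

function handleFunctionCall(call: FunctionCall, confirm: Confirm, exec: Exec): FunctionCallOutput {
  // Every path answers with a `function_call_output` tied to the original call.
  const reply = (output: string): FunctionCallOutput =>
    ({ type: "function_call_output", callId: call.callId, output });

  // Parse the details of the tool request.
  let args: unknown;
  try {
    args = JSON.parse(call.arguments);
  } catch {
    return reply("error: arguments are not valid JSON");
  }
  const command = (args as { command?: unknown } | null)?.command;
  if (!Array.isArray(command) || !command.every((part) => typeof part === "string")) {
    return reply("error: expected a `command` array of strings");
  }

  // Ask for approval before anything runs.
  if (!confirm(command)) {
    return reply("error: command was not approved");
  }

  // Run the command and package its result for the next model call.
  const result = exec(command);
  return reply(JSON.stringify({ output: result.stdout, exit_code: result.exitCode }));
}
```

Returning an output message even on failure means the model always learns what happened to its request, which is what lets the next loop iteration react to a rejection or a parse error.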
@@ -329,11 +329,11 @@ This loop ensures that the conversation flows logically, handling text, tool req

 ## Conclusion

-The Agent Loop is the central orchestrator within Codex. It acts like a diligent assistant, taking your requests, interacting with the powerful AI model, managing tools like shell commands, ensuring safety through approvals, and keeping the conversation state updated. It connects the [Terminal UI](01_terminal_ui__ink_components_.md) where you interact, the [Input Handling](02_input_handling__textbuffer_editor_.md) that captures your text, the AI model that provides intelligence, and the systems that actually execute actions ([Command Execution & Sandboxing](06_command_execution___sandboxing_.md)).
+The Agent Loop is the central orchestrator within Codex. It acts like a diligent assistant, taking your requests, interacting with the powerful AI model, managing tools like shell commands, ensuring safety through approvals, and keeping the conversation state updated. It connects the [Terminal UI](01_terminal_ui__ink_components_.md) where you interact, the [Input Handling](02_input_handling__textbuffer_editor_.md) that captures your text, the AI model that provides intelligence, and the systems that actually execute actions ([Command Execution & Sandboxing](06_command_execution___sandboxing.md)).

 Understanding the Agent Loop helps you see how Codex manages the complex back-and-forth required to turn your natural language requests into concrete actions. But when the Agent Loop wants to run a command suggested by the AI, how does Codex decide whether to ask for your permission first? That crucial safety mechanism is the topic of our next chapter.

-Next up: [Approval Policy & Security](04_approval_policy___security_.md)
+Next up: [Approval Policy & Security](04_approval_policy___security.md)
docs/Codex/05_response___tool_call_handling.md
4 additions & 4 deletions
@@ -7,7 +7,7 @@ nav_order: 5

 # Chapter 5: Response & Tool Call Handling

-In the [previous chapter](04_approval_policy___security_.md), we learned how Codex decides *if* it's allowed to perform an action suggested by the AI, acting like a security guard based on the rules you set. But how does Codex understand the AI's response in the first place, especially when the AI wants to do something specific, like run a command or change a file?
+In the [previous chapter](04_approval_policy___security.md), we learned how Codex decides *if* it's allowed to perform an action suggested by the AI, acting like a security guard based on the rules you set. But how does Codex understand the AI's response in the first place, especially when the AI wants to do something specific, like run a command or change a file?

 That's where **Response & Tool Call Handling** comes in. Think of this part of Codex as its "ears" and "hands." It listens carefully to the instructions coming back from the AI model (the "response") and, if the AI asks to perform an action (a "tool call"), it figures out *exactly* what the AI wants to do (like which command to run or what file change to make) and gets ready to do it.

@@ -97,7 +97,7 @@ sequenceDiagram
     * The **tool name** (e.g., `shell`) is identified.
     * The **arguments** string is extracted.
     * The arguments string (which is often JSON) is parsed to get the actual details (e.g., the `command` array `["git", "status"]`).
-5. **Prepare for Action:** The Agent Loop now knows the specific tool and its arguments. It packages this information (tool name + parsed arguments) and prepares for the next stage: checking the [Approval Policy & Security](04_approval_policy___security_.md) and, if approved, proceeding to [Command Execution & Sandboxing](06_command_execution___sandboxing_.md).
+5. **Prepare for Action:** The Agent Loop now knows the specific tool and its arguments. It packages this information (tool name + parsed arguments) and prepares for the next stage: checking the [Approval Policy & Security](04_approval_policy___security.md) and, if approved, proceeding to [Command Execution & Sandboxing](06_command_execution___sandboxing.md).
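To make the argument-parsing step in this hunk concrete, here is a small sketch of turning the raw `arguments` string into a validated `command` array. It is an illustration only: `ShellArgs` and `parseShellArguments` are hypothetical names, and the validation rules (non-empty array, strings only) are assumptions rather than a description of `parseToolCallArguments`.

```typescript
// Hypothetical result shape for a `shell` tool call.
interface ShellArgs {
  command: string[];
}

// Returns the parsed arguments, or undefined when the string is not usable.
function parseShellArguments(raw: string): ShellArgs | undefined {
  let value: unknown;
  try {
    // The arguments arrive as a JSON string, e.g. '{"command":["git","status"]}'.
    value = JSON.parse(raw);
  } catch {
    return undefined; // not valid JSON
  }
  if (typeof value !== "object" || value === null) {
    return undefined; // JSON, but not an object
  }
  const command = (value as Record<string, unknown>)["command"];
  if (!Array.isArray(command) || command.length === 0) {
    return undefined; // missing or empty command
  }
  if (!command.every((part): part is string => typeof part === "string")) {
    return undefined; // every element must be a string
  }
  return { command };
}

const parsed = parseShellArguments('{"command":["git","status"]}');
console.log(parsed?.command.join(" "));
```

Returning `undefined` instead of throwing lets the caller decide how to report a malformed request back to the model.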
 * If an item is a `function_call`, the `handleFunctionCall` helper is called.
 * `handleFunctionCall` extracts the `name` and `arguments`.
 * It crucially calls `parseToolCallArguments` (from `utils/parsers.ts`) to turn the JSON string `arguments` into a usable object.
-* Based on the `name` (`shell`, `apply_patch`), it calls the appropriate execution handler (like `handleExecCommand`), passing the parsed arguments. This handler coordinates with the [Approval Policy & Security](04_approval_policy___security_.md) and [Command Execution & Sandboxing](06_command_execution___sandboxing_.md) systems.
+* Based on the `name` (`shell`, `apply_patch`), it calls the appropriate execution handler (like `handleExecCommand`), passing the parsed arguments. This handler coordinates with the [Approval Policy & Security](04_approval_policy___security.md) and [Command Execution & Sandboxing](06_command_execution___sandboxing.md) systems.
-You've now seen how Codex acts as an interpreter for the AI. It doesn't just receive text; it receives structured instructions. The **Response & Tool Call Handling** system is responsible for parsing these instructions, figuring out if the AI wants to use a tool (like `shell` or `apply_patch`), and extracting the precise arguments needed for that tool. This crucial step translates the AI's intentions into actionable details that Codex can then use to interact with your system, always respecting the rules set by the [Approval Policy & Security](04_approval_policy___security_.md).
+You've now seen how Codex acts as an interpreter for the AI. It doesn't just receive text; it receives structured instructions. The **Response & Tool Call Handling** system is responsible for parsing these instructions, figuring out if the AI wants to use a tool (like `shell` or `apply_patch`), and extracting the precise arguments needed for that tool. This crucial step translates the AI's intentions into actionable details that Codex can then use to interact with your system, always respecting the rules set by the [Approval Policy & Security](04_approval_policy___security.md).

 Now that Codex understands *what* command the AI wants to run (e.g., `git status`), how does it actually *execute* that command safely, especially if running in `full-auto` mode? That's the topic of our next chapter.
docs/Codex/06_command_execution___sandboxing.md
5 additions & 5 deletions
@@ -7,7 +7,7 @@ nav_order: 6

 # Chapter 6: Command Execution & Sandboxing

-In the [previous chapter](05_response___tool_call_handling.md), we learned how Codex listens to the AI and understands when it wants to use a tool, like running a specific shell command (`git status` or `npm install`). We also know from the [Approval Policy & Security](04_approval_policy___security_.md) chapter that Codex checks if it *should* run the command based on your chosen safety level.
+In the [previous chapter](05_response___tool_call_handling.md), we learned how Codex listens to the AI and understands when it wants to use a tool, like running a specific shell command (`git status` or `npm install`). We also know from the [Approval Policy & Security](04_approval_policy___security.md) chapter that Codex checks if it *should* run the command based on your chosen safety level.

 But once Codex has the command and permission (either from you or automatically), how does it actually *run* that command? And how does it do it safely, especially if you've given it more freedom in `full-auto` mode?

@@ -35,15 +35,15 @@ This system takes a command requested by the AI (like `python script.py` or `git
     * **How (Examples):**
         * **macOS Seatbelt:** Uses a built-in macOS feature (`sandbox-exec`) with a specific policy file to strictly control what the command can access (e.g., only allow writing to the project folder, block network access).
         * **Docker Container:** Runs the command inside a lightweight container (like the one defined in `codex-cli/Dockerfile`). This container has only specific tools installed and can have network rules applied (using `iptables`/`ipset` via `init_firewall.sh`) to limit internet access.
-    * **When:** Typically used automatically in `full-auto` mode (as decided by the [Approval Policy & Security](04_approval_policy___security_.md) check), or potentially if a specific command is flagged as needing extra caution.
+    * **When:** Typically used automatically in `full-auto` mode (as decided by the [Approval Policy & Security](04_approval_policy___security.md) check), or potentially if a specific command is flagged as needing extra caution.
     * **Pros:** Significantly reduces the risk of accidental damage from faulty or malicious commands suggested by the AI.
     * **Cons:** Might prevent a command from working if it legitimately needs access to something the sandbox blocks (like a specific system file or network resource). The setup can be more complex.

 ## How It Works: From Approval to Execution

-The Command Execution system doesn't decide *whether* to run a command – that's the job of the [Approval Policy & Security](04_approval_policy___security_.md). This system comes into play *after* the approval check.
+The Command Execution system doesn't decide *whether* to run a command – that's the job of the [Approval Policy & Security](04_approval_policy___security.md). This system comes into play *after* the approval check.

-Remember the `handleExecCommand` function from the [Agent Loop](03_agent_loop.md) chapter? It first calls `canAutoApprove` ([Approval Policy & Security](04_approval_policy___security_.md)). If the command is approved (either by policy or by you), `canAutoApprove` tells `handleExecCommand` *whether* sandboxing is needed (`runInSandbox: true` or `runInSandbox: false`).
+Remember the `handleExecCommand` function from the [Agent Loop](03_agent_loop.md) chapter? It first calls `canAutoApprove` ([Approval Policy & Security](04_approval_policy___security.md)). If the command is approved (either by policy or by you), `canAutoApprove` tells `handleExecCommand` *whether* sandboxing is needed (`runInSandbox: true` or `runInSandbox: false`).
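The hand-off described in this hunk, where the approval check decides both whether a command runs and whether it runs sandboxed, can be sketched as below. This is an assumption-laden illustration: the `Assessment` union stands in for whatever `canAutoApprove` actually returns, `runWithPolicy`, `Runner`, and `Runners` are hypothetical names, and running user-approved commands without a sandbox is a guess made for the example rather than a documented rule.

```typescript
// Hypothetical stand-in for the result of the approval check.
type Assessment =
  | { type: "auto-approve"; runInSandbox: boolean }
  | { type: "ask-user" }
  | { type: "reject"; reason: string };

// A runner executes a command and returns its output.
type Runner = (command: string[]) => string;

interface Runners {
  raw: Runner;     // direct execution
  sandbox: Runner; // e.g. macOS Seatbelt or a Docker container
}

function runWithPolicy(
  command: string[],
  assessment: Assessment,
  askUser: (command: string[]) => boolean,
  runners: Runners,
): string {
  switch (assessment.type) {
    case "reject":
      // The policy refused outright; nothing is executed.
      return `rejected: ${assessment.reason}`;
    case "ask-user":
      // Assumption: a command the user approves explicitly runs unsandboxed.
      return askUser(command) ? runners.raw(command) : "rejected: denied by user";
    case "auto-approve":
      // The approval check also says whether the sandbox is required.
      return assessment.runInSandbox ? runners.sandbox(command) : runners.raw(command);
  }
}
```

Keeping the decision (the assessment) separate from the mechanism (the runners) mirrors the split the chapter describes: the execution layer never decides whether to run, only how.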
-You've reached the end of the workshop tour! The **Command Execution & Sandboxing** system is Codex's way of actually *doing* things on the command line when instructed by the AI. It carefully considers the safety level decided by the [Approval Policy & Security](04_approval_policy___security_.md) and chooses the right execution method: direct "raw" execution for trusted commands, or running inside a protective "sandbox" (like macOS Seatbelt or a Docker container) for potentially riskier operations, especially in `full-auto` mode. This layered approach allows Codex to be powerful while providing crucial safety mechanisms against unintended consequences.
+You've reached the end of the workshop tour! The **Command Execution & Sandboxing** system is Codex's way of actually *doing* things on the command line when instructed by the AI. It carefully considers the safety level decided by the [Approval Policy & Security](04_approval_policy___security.md) and chooses the right execution method: direct "raw" execution for trusted commands, or running inside a protective "sandbox" (like macOS Seatbelt or a Docker container) for potentially riskier operations, especially in `full-auto` mode. This layered approach allows Codex to be powerful while providing crucial safety mechanisms against unintended consequences.

 We've seen how Codex handles input, talks to the AI, checks policies, and executes commands. But how does Codex know *which* AI model to use, what your API key is, or which approval mode you prefer? All these settings need to be managed.
docs/Codex/07_configuration_management.md
2 additions & 2 deletions
@@ -7,7 +7,7 @@ nav_order: 7

 # Chapter 7: Configuration Management

-In the [previous chapter](06_command_execution___sandboxing_.md), we saw how Codex carefully executes commands, using sandboxing for safety when needed. But how does Codex remember your preferences between sessions? For instance, how does it know which AI model you like to use, or whether you prefer `auto-edit` mode? And how can you give Codex persistent instructions about how you want it to behave?
+In the [previous chapter](06_command_execution___sandboxing.md), we saw how Codex carefully executes commands, using sandboxing for safety when needed. But how does Codex remember your preferences between sessions? For instance, how does it know which AI model you like to use, or whether you prefer `auto-edit` mode? And how can you give Codex persistent instructions about how you want it to behave?

 This is where **Configuration Management** comes in. Think of it like the settings menu or preferences file for Codex.

@@ -17,7 +17,7 @@ Imagine you prefer using the powerful `gpt-4o` model instead of the default `o4-

 Configuration Management solves this by allowing Codex to:

-1. **Load Default Settings:** Read a special file to know your preferred model, default [Approval Policy](04_approval_policy___security_.md) mode, etc.
+1. **Load Default Settings:** Read a special file to know your preferred model, default [Approval Policy](04_approval_policy___security.md) mode, etc.
 2. **Load Custom Instructions:** Read other special files containing your personal guidelines or project-specific rules for the AI.

 This way, Codex behaves consistently according to your setup without needing constant reminders. It's like setting up your favorite text editor with your preferred theme and plugins – you do it once, and it remembers.
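As a rough illustration of the two loading steps in this hunk, the sketch below merges stored settings over built-in defaults and concatenates instruction files. Everything here is hypothetical: `AppConfig`, `resolveConfig`, the placeholder default model name, and the `suggest` fallback mode are assumptions for the example, and the function takes file contents as strings so that no real file path has to be guessed.

```typescript
// Hypothetical config shape; the `suggest` mode is an assumption for this sketch.
type ApprovalMode = "suggest" | "auto-edit" | "full-auto";

interface AppConfig {
  model: string;
  approvalMode: ApprovalMode;
  instructions: string;
}

// Placeholder defaults, not the real Codex defaults.
const DEFAULTS: AppConfig = {
  model: "example-default-model",
  approvalMode: "suggest",
  instructions: "",
};

function isApprovalMode(value: unknown): value is ApprovalMode {
  return value === "suggest" || value === "auto-edit" || value === "full-auto";
}

// settingsJson: contents of the settings file, or undefined if it is missing.
// instructionFiles: contents of the instruction files, in priority order.
function resolveConfig(settingsJson: string | undefined, instructionFiles: string[]): AppConfig {
  let stored: Record<string, unknown> = {};
  if (settingsJson !== undefined) {
    try {
      const parsed: unknown = JSON.parse(settingsJson);
      if (typeof parsed === "object" && parsed !== null) {
        stored = parsed as Record<string, unknown>;
      }
    } catch {
      // Unreadable settings fall back to the defaults.
    }
  }
  const model = stored["model"];
  const approvalMode = stored["approvalMode"];
  return {
    model: typeof model === "string" ? model : DEFAULTS.model,
    approvalMode: isApprovalMode(approvalMode) ? approvalMode : DEFAULTS.approvalMode,
    // Skip blank files and separate the rest with an empty line.
    instructions: instructionFiles.filter((text) => text.trim() !== "").join("\n\n"),
  };
}
```

Validating each field separately means one bad value (say, an unknown approval mode) only resets that field instead of discarding the whole file.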
docs/Codex/08_single_pass_mode.md
1 addition & 1 deletion
@@ -304,7 +304,7 @@ export type EditedFiles = z.infer<typeof EditedFilesSchema>;

 Single-Pass Mode offers a different, potentially faster way to use Codex for well-defined tasks. By providing extensive context upfront and asking the AI for a complete set of structured file operations in one response, it minimizes back-and-forth. You gather context, send one big request, review the complete proposed solution, and either accept or reject it entirely. While still experimental, it's a powerful approach for streamlining larger refactoring or generation tasks where the requirements are clear.

-This concludes our tour through the core concepts of Codex! We've journeyed from the [Terminal UI](01_terminal_ui__ink_components_.md) and [Input Handling](02_input_handling__textbuffer_editor_.md), through the central [Agent Loop](03_agent_loop.md), into the crucial aspects of [Approval Policy & Security](04_approval_policy___security_.md), [Response & Tool Call Handling](05_response___tool_call_handling.md), and safe [Command Execution & Sandboxing](06_command_execution___sandboxing.md), learned about [Configuration Management](07_configuration_management.md), and finally explored the alternative [Single-Pass Mode](08_single_pass_mode.md).
+This concludes our tour through the core concepts of Codex! We've journeyed from the [Terminal UI](01_terminal_ui__ink_components_.md) and [Input Handling](02_input_handling__textbuffer_editor_.md), through the central [Agent Loop](03_agent_loop.md), into the crucial aspects of [Approval Policy & Security](04_approval_policy___security.md), [Response & Tool Call Handling](05_response___tool_call_handling.md), and safe [Command Execution & Sandboxing](06_command_execution___sandboxing.md), learned about [Configuration Management](07_configuration_management.md), and finally explored the alternative [Single-Pass Mode](08_single_pass_mode.md).

 We hope this gives you a solid understanding of how Codex works under the hood. Feel free to dive deeper into the codebase, experiment, and perhaps even contribute!