Commit 890033e

author
codegen-bot
committed
.
1 parent ce2338f commit 890033e

7 files changed

+64
-54
lines changed

docs/blog/act-via-code.mdx

Lines changed: 56 additions & 53 deletions
@@ -5,85 +5,88 @@ iconType: "solid"
55
description: "The path to advanced code manipulation agents"
66
---
77

8-
The future of AI-powered software development isn't just about understanding code—it's about manipulating it effectively. As AI models become increasingly sophisticated in comprehending codebases, we're discovering that the real bottleneck isn't their "intelligence," but rather their ability to make precise, reliable changes to code. This is where the concept of "acting via code" becomes crucial.
8+
<Frame caption="Voyager (Jim Fan)">
9+
<img src="/images/nether-portal.png" />
10+
</Frame>
911

10-
## The Current Landscape
1112

12-
Today's AI coding assistants typically operate through:
13+
# Act via Code
1314

14-
- Generating complete code snippets
15-
- Suggesting text-based changes
16-
- Producing diffs for review
15+
Two and a half years since the launch of the GPT-3 API, code assistants have emerged as the most powerful and practically useful applications of LLMs. The rapid adoption of AI-powered IDEs and prototype builders isn't surprising — code is structured, deterministic, and rich with patterns, making it an ideal domain for machine learning. As model capabilities continue to scale, we're seeing compounding improvements in code understanding and generation.
1716

18-
While these approaches work for simple tasks, they break down when dealing with complex, multi-file changes or large-scale refactors. The fundamental issue? They're trying to manipulate code as text, rather than as the structured data it really is.
17+
Yet there's a striking gap between what AI agents can understand and what they can actually do. While they can reason about complex architectural changes, debug intricate issues, and propose sophisticated refactors, they often can't execute these ideas. The ceiling isn't intelligence or context—it's the ability to manipulate code at scale. Large-scale modifications remain unreliable or impossible, not because agents don't understand what to do, but because they lack the right interfaces to do it.
1918

20-
## Why Acting via Code Matters
19+
The bottleneck isn't intelligence — it's tooling. By giving AI models the ability to write and execute code that modifies code, we're about to unlock an entire class of tasks that agents already understand but can't yet perform. Code execution environments represent the most expressive tool we could offer an agent—enabling composition, abstraction, and systematic manipulation of complex systems. When paired with ever-improving language models, this will unlock another step function improvement in AI capabilities.
2120

22-
Acting via code means providing AI agents with programmatic interfaces to manipulate codebases. Instead of generating text patches, agents can express transformations through code itself. This approach offers several key advantages:
21+
## Beating Minecraft with Code Execution
2322

24-
1. **Precision and Reliability**
23+
In mid-2023, a research project called [Voyager](https://voyager.minedojo.org) made waves: compared with the prior SOTA, it obtained 3.3x more unique items, traveled 2.3x longer distances, and unlocked key tech tree milestones up to 15.3x faster. This was a major breakthrough: previous reinforcement learning systems had struggled for years with even basic Minecraft tasks.
2524

26-
- Changes are expressed through well-defined operations
27-
- Dependencies and references are handled automatically
28-
- Transformations are composable and reusable
25+
While the AI community was focused on scaling intelligence, Voyager demonstrated something more fundamental: the right tools can unlock entirely new tiers of capability. The same GPT-4 model that struggled with Minecraft using traditional frameworks achieved remarkable results when allowed to write and execute code. This wasn't about raw intelligence—it was about giving the agent a more expressive way to act.
2926

30-
2. **Scale and Consistency**
27+
<Frame>
28+
<img src="/images/voyager-performance.png" />
29+
</Frame>
3130

32-
- Changes can be applied across massive codebases
33-
- Transformations remain consistent across files
34-
- Complex refactors become tractable
31+
The breakthrough came from a simple yet powerful insight: let the AI write code. Instead of limiting the agent to primitive "tools," Voyager allowed GPT-4 to write and execute [JS programs](https://github.com/MineDojo/Voyager/tree/main/skill_library/trial2/skill/code) that controlled Minecraft actions through a clean API:
3532

36-
3. **Verifiability and Safety**
37-
- Changes are expressed in a reviewable format
38-
- Transformations can be tested before application
39-
- Operations can be rolled back if needed
33+
```javascript
34+
async function chopSpruceLogs(bot) {
35+
  const spruceLogCount = bot.inventory.count(mcData.itemsByName.spruce_log.id);
36+
  const logsToMine = 3 - spruceLogCount;
37+
  if (logsToMine > 0) {
38+
    bot.chat("Chopping down spruce logs...");
39+
    await mineBlock(bot, "spruce_log", logsToMine);
40+
    bot.chat("Chopped down 3 spruce logs.");
41+
  } else {
42+
    bot.chat("Already have 3 spruce logs in inventory.");
43+
  }
44+
}
45+
```
4046

41-
## Building Blocks for AI Agents
47+
This approach transformed the agent's capabilities. Rather than being constrained to atomic actions like `equipItem(...)`, it could create higher-level operations like [`craftShieldWithFurnace()`](https://github.com/MineDojo/Voyager/blob/main/skill_library/trial2/skill/code/craftShieldWithFurnace.js) through composing JS APIs. The system also implemented a memory mechanism, storing successful programs for reuse in similar situations—effectively building its own library of proven solutions it could later refer to and adapt to similar circumstances.
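
The memory mechanism described above can be sketched in a few lines of Python. This is a minimal illustration, not Voyager's implementation: the `Skill` and `SkillLibrary` names are hypothetical, and word overlap (Jaccard similarity) stands in for whatever retrieval method the real system uses.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str
    source: str  # program text that previously succeeded


def _tokens(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for a learned embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class SkillLibrary:
    """Hypothetical store of verified programs, searchable by description."""

    skills: list[Skill] = field(default_factory=list)

    def add(self, name: str, description: str, source: str) -> None:
        """Store a program; the caller is expected to add only verified ones."""
        self.skills.append(Skill(name, description, source))

    def retrieve(self, task: str, k: int = 1) -> list[Skill]:
        """Return the k skills whose descriptions best overlap the task text."""
        query = _tokens(task)

        def score(skill: Skill) -> float:
            words = _tokens(skill.description)
            union = query | words
            return len(query & words) / len(union) if union else 0.0

        return sorted(self.skills, key=score, reverse=True)[:k]


library = SkillLibrary()
library.add(
    "chopSpruceLogs",
    "chop spruce logs from trees",
    "async function chopSpruceLogs(bot) { /* stored program text */ }",
)
library.add(
    "craftShieldWithFurnace",
    "craft a shield using a furnace",
    "async function craftShieldWithFurnace(bot) { /* stored program text */ }",
)

best = library.retrieve("collect some spruce logs")[0]
print(best.name)  # chopSpruceLogs
```

The retrieved program text can then be placed in the agent's prompt as a starting point, so a new task begins from a solution that already worked rather than from scratch.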
4248

43-
For AI agents to effectively manipulate code, they need:
49+
<Frame>
50+
<img src="/images/voyager-retrieval.png" />
51+
</Frame>
4452

45-
1. **A Natural Mental Model**
53+
As the Voyager authors noted:
4654

47-
- Operations that match how developers think about code changes
48-
- High-level abstractions for common patterns
49-
- Clear semantics for transformations
55+
<Tip>*"We opt to use code as the action space instead of low-level motor commands because programs can naturally represent temporally extended and compositional actions, which are essential for many long-horizon tasks in Minecraft."*</Tip>
5056

51-
2. **Composable Primitives**
57+
## Code is an Ideal Action Space
5258

53-
- Basic operations that can be combined
54-
- Tools for building higher-level abstractions
55-
- Ways to express complex transformations
59+
The implications of code as an action space extend far beyond gaming. Code provides a uniquely powerful interface between AI and real-world systems. When an agent writes code, it gains several critical advantages over traditional atomic tools.
5660

57-
3. **Rich Static Analysis**
58-
- Understanding of dependencies and references
59-
- Analysis of control flow and types
60-
- Knowledge of cross-file relationships
61+
First, code enforces correctness through syntax and type systems. Second, it enables effective retrieval and composition—AI models excel at understanding, adapting, and combining existing code patterns. Third, code execution provides immediate, objective feedback through errors and outputs. Finally, and perhaps most importantly, code is inherently composable—any tool can be wrapped in a function and used as a building block for more complex operations.
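
Two of these properties, execution feedback and composability, are easy to show in miniature. The sketch below assumes a hypothetical `run_action` helper that executes agent-written Python against a dictionary of tools and returns the output or the error. It uses `exec` for brevity; a real system would need to sandbox untrusted code.

```python
import contextlib
import io


def run_action(source: str, tools: dict) -> dict:
    """Execute agent-written code and report what happened (hypothetical helper)."""
    stdout = io.StringIO()
    namespace = dict(tools)  # copy so every run starts from the same set of tools
    try:
        with contextlib.redirect_stdout(stdout):
            exec(compile(source, "<agent>", "exec"), namespace)
    except Exception as exc:
        # The error type and message are the feedback the agent can act on.
        error = f"{type(exc).__name__}: {exc}"
        return {"ok": False, "output": stdout.getvalue(), "error": error}
    return {"ok": True, "output": stdout.getvalue(), "error": None}


def word_count(text: str) -> int:
    """An ordinary function exposed to the agent as a tool."""
    return len(text.split())


tools = {"word_count": word_count}

# A correct program produces output the agent can inspect.
good = run_action("print(word_count('act via code'))", tools)

# A misspelled tool name fails with a precise, machine-readable error.
bad = run_action("print(wordcount('act via code'))", tools)

# The agent can also define new functions that build on existing tools.
composed = run_action(
    "def doubled(text):\n"
    "    return word_count(text) * 2\n"
    "print(doubled('a b'))",
    tools,
)

print(good["output"].strip())      # 3
print(bad["error"])                # NameError: name 'wordcount' is not defined
print(composed["output"].strip())  # 4
```

Because the tools are plain functions in a namespace, adding a capability means adding one more entry to `tools`, and anything the agent defines can itself be stored and reused.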
6162

62-
## The Path Forward
63+
Programs are also a natural medium of interaction between humans and agents. Code explicitly encodes reasoning in a human-readable format, making the agent's actions transparent and reviewable. There's no magic—just deterministic program execution that can be debugged, modified, and improved. In this paradigm, the agent becomes a sophisticated program search mechanism, exploring the space of possible solutions while maintaining the reliability of traditional software.
6364

64-
As we move toward more advanced AI coding agents, the ability to act via code becomes increasingly critical. Future AI systems will need to:
65+
## For Software Engineering
6566

66-
1. **Build Their Own Tools**
67+
This brings us to software engineering, where we see a massive gap between AI's theoretical capabilities and practical achievements. Many code modification tasks are fundamentally programmatic—dependency analysis, refactors, control flow analysis—yet we lack the tools to express them properly.
6768

68-
- Create custom abstractions for common patterns
69-
- Develop specialized transformation utilities
70-
- Maintain their own libraries of operations
69+
Consider how a developer thinks about refactoring: it's rarely about direct text manipulation. Instead, we think in terms of high-level operations: "move this function," "rename this variable everywhere," "split this module." These operations can be encoded into a powerful Python API:
7170

72-
2. **Reason About Changes**
71+
```python
72+
# simple access to high-level code constructs
73+
for component in codebase.jsx_components:
74+
    # access detailed code structure and relations
75+
    if len(component.usages) == 0:
76+
        # powerful edit APIs that handle edge cases
77+
        component.rename(component.name + 'Page')
78+
```
7379

74-
- Understand the impact of transformations
75-
- Plan complex refactoring operations
76-
- Verify the correctness of changes
80+
This isn't just another code manipulation library—it's a scriptable language server that builds on proven foundations like LSP and codemods, but designed specifically for programmatic analysis and refactoring.
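
To make "programmatic analysis" concrete, here is a much smaller stand-in built only on Python's standard `ast` module. It is not the API shown above: the `unused_functions` helper is hypothetical, works on a single Python file rather than JSX components, and only sees bare-name references.

```python
import ast


def unused_functions(source: str) -> list[str]:
    """Names of top-level functions that are never referenced in `source`."""
    tree = ast.parse(source)
    defined = [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]
    # Every bare-name read (a call, an argument, an alias) counts as a usage.
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    return [name for name in defined if name not in used]


example = '''
def render_header():
    return "header"

def render_footer():
    return "footer"

print(render_header())
'''

print(unused_functions(example))  # ['render_footer']
```

The limitations are the point: this sketch misses attribute access, re-exports, and usages in other files, which are exactly the cross-file relationships a codebase-wide tool has to track.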
7781

78-
3. **Learn From Experience**
79-
- Improve transformation strategies over time
80-
- Develop better abstractions through use
81-
- Share knowledge across different contexts
82+
## What does this look like?
8283

83-
## Conclusion
84+
At Codegen, we've built exactly this system. Our approach centers on four key principles:
8485

85-
The path to advanced AI coding agents isn't just about better language models—it's about giving those models the right tools to manipulate code effectively. By enabling AI to "act via code," we create a foundation for more sophisticated, reliable, and scalable code transformation capabilities.
86+
The foundation must be Python, enabling easy composition with existing tools and workflows. Operations must be in-memory for performance, handling large-scale changes efficiently. The system must be open source, allowing developers and AI researchers to extend and enhance it. And perhaps most importantly, it must be thoroughly documented—not just for humans, but for the next generation of AI agents that will build upon it.
8687

87-
Just as self-driving cars need sophisticated controls to navigate the physical world, AI coding agents need powerful, precise interfaces to manipulate codebases. This programmatic approach creates a shared language that both humans and AI can use to express, verify, and apply code changes reliably at scale.
88+
## What does this enable?
8889

89-
The future of AI-powered development lies not in generating better patches or diffs, but in enabling AI to work with code the way developers do: through code itself.
90+
We've already used this approach to merge hundreds of thousands of lines of code in enterprise codebases. Our tools have automated complex tasks like feature flag deletion, test suite reorganization, import cycle elimination, and dead code removal. But more importantly, we've proven that code-as-action-space isn't just theoretical—it's a practical approach to scaling software engineering.
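
As one illustration of why such tasks are programmatic, import cycle detection reduces to finding a cycle in a directed graph. The sketch below is a generic depth-first search over a hand-written module graph, not Codegen's implementation; the `find_import_cycle` name and the example module names are assumptions.

```python
from typing import Optional


def find_import_cycle(imports: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a module path, or None if the graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {module: WHITE for module in imports}
    path: list[str] = []

    def visit(module: str) -> Optional[list[str]]:
        color[module] = GRAY
        path.append(module)
        for dep in imports.get(module, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so this edge closes a loop
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[module] = BLACK
        return None

    for module in list(imports):
        if color[module] == WHITE:
            cycle = visit(module)
            if cycle:
                return cycle
    return None


graph = {
    "app": ["models"],
    "models": ["utils"],
    "utils": ["models"],  # utils imports models back, which forms the cycle
}
print(find_import_cycle(graph))  # ['models', 'utils', 'models']
```

The recursion keeps the sketch short; an iterative traversal would be the safer choice for very deep import graphs.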
91+
92+
This is just the beginning. With Codegen, we're providing the foundation for the next generation of code manipulation tools—built for both human developers and AI agents. We believe this approach will fundamentally change how we think about and implement large-scale code changes, making previously impossible tasks not just possible, but routine.

docs/blog/posts.mdx

Lines changed: 8 additions & 1 deletion
@@ -19,6 +19,7 @@ Why traditional language servers aren't enough for the future of AI-powered code
1919
label="2024-01-24"
2020
description="A Deep Dive into Codemod Frameworks"
2121
title="Codemod Frameworks"
22+
href="/blog/codemod-frameworks"
2223
>
2324
## Codemod Frameworks
2425

@@ -30,6 +31,12 @@ Comparing popular tools for programmatic code transformation
3031

3132
## Act via Code
3233

33-
The path to advanced code manipulation agents
34+
Programs are the natural convergence of LLMs and traditional computation.
35+
36+
<Card
37+
img="/images/voyager.png"
38+
title="Act via Code"
39+
href="https://codegen.com"
40+
/>
3441

3542
</Update>

docs/images/nether-portal.png

3.86 MB

docs/images/voyager-full.png

1.46 MB
759 KB

docs/images/voyager-retrieval.png

726 KB

docs/images/voyager.png

664 KB
