docs/blog/act-via-code.mdx: 3 additions & 3 deletions
@@ -12,11 +12,11 @@ description: "The path to advanced code manipulation agents"
Two and a half years since the launch of the GPT-3 API, code assistants have emerged as potentially the premier use case of LLMs. The rapid adoption of AI-powered IDEs and prototype builders isn't surprising: code is structured, deterministic, and rich with patterns, making it an ideal domain for machine learning. Developers actively working with tools like Cursor (myself included) have an exhilarating yet uncertain sense that the field of software engineering is approaching an inflection point.
-
Yet there's a striking gap between understanding and action for today's code assistants. When provided proper context, frontier LLMs can analyze massive enterprise codebases and propose practical paths towards sophisticated, large-scale improvements. But implementing changes that impact more than a small set of files with modern AI assistants is fundamentally infeasible. As a result, our collaboration with AI assistants, as of early 2025, is dominated by tasks where humans are directly in the iteration loop and the modifications themselves span a dozen files. This has proven to be a workable paradigm for two sets of tasks: collaboration in IDEs (Windsurf, Cursor) and 0-1 chat assistants for 0-1 app creation (v0, lovable.dev, bolt.new).
+
Yet there's a striking gap between understanding and action for today's code assistants. When given the proper context, frontier LLMs can analyze massive enterprise codebases and propose practical paths towards sophisticated, large-scale improvements. But using modern AI assistants to implement changes that touch more than a small set of files is fundamentally infeasible. The good news is that for focused, file-level changes, we've found real success: AI-powered IDEs are transforming how developers write and review code, while chat-based assistants are revolutionizing how we bootstrap new applications (via tools like v0, lovable.dev, and bolt.new).
-
There are certain things, however, which deal with codebase structure, that are fundamentally programmatic, and these are out of reach for today's assistants despite the fact that they understand what's going on. Eliminating tech debt, largr-scale migrations, managing code modularity and dependency analysis, enforcing type coverage, etc. These tasks that well below the high watermark of AI understanding, yet they remain out of reach for today's AI systems because the mechanism necessary to perform them is not baked into your IDE.
+
However, there's a whole class of critical engineering tasks that remain just out of reach: tasks that are fundamentally programmatic and deal with codebase structure at scale. Think about the teams dedicated to eliminating tech debt, managing large-scale migrations, analyzing dependency graphs, and enforcing type coverage across the codebase. Today's AI assistants can fully understand these challenges and even propose solutions, but they lack the mechanisms to actually implement them. The intelligence is there, but it's trapped in your IDE's text-completion window.
-
The bottleneck isn't intelligence — it's tooling. The solution requires letting AI systems programmatically interact with codebases and software systems through code execution environments. Code execution environments represent the most expressive tool we could offer an agent—enabling composition, abstraction, and systematic manipulation of complex systems. When paired with ever-improving language models, this will unlock another step function improvement for code assistants, enabling their application in an entirely new set of valuable tasks.
+
The bottleneck isn't intelligence; it's tooling. The solution requires letting AI systems programmatically interact with codebases and software systems through code execution environments. Code execution environments represent the most expressive tool we could offer an agent, enabling composition, abstraction, and systematic manipulation of complex systems. By combining code execution environments with custom APIs that correspond to powerful large-scale operations, we can unlock a new set of valuable tasks in which agents can be significant contributors. When paired with ever-improving foundation models, this will lead to a step-function improvement for code assistants.