Skip to content

Commit 79dfafd

Browse files
committed
ace-fc
1 parent 9983f96 commit 79dfafd

File tree

1 file changed

+20
-13
lines changed

1 file changed

+20
-13
lines changed

ace-fca.md

Lines changed: 20 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,34 +1,35 @@
11
**PRE RELEASE** - if you're reading this, this post has not been published fully yet - feedback welcome but please don't pass this around just yet!
22

3-
# Advanced Context Engineering for Coding Agents
3+
# Putting AI to Work in Complex Codebases
44

5-
*Lessons Learned trying to get AI to solve complex problems in brownfield codebases*
5+
Everyone knows AI coding tools struggle with real production codebases. The [the Stanford study on AI's impact on developer productivity](https://www.youtube.com/watch?v=tbDDYKRFjhk) proved it: A lot of the "extra code" shipped by AI tools ends up just reworking the slop that was shipped last week. Coding agents are great for new projects or small changes, but in large established codebases, AI can often make developers *less* productive.
66

7-
Hey folks, dex here.
7+
The common response is somewhere between the pessimist "this will never work" and the more measured "maybe someday when there are smarter models."
88

9-
You may remember me from April's [12-factor agents](https://hlyr.dev/12fa) guide, as the coiner of the term "context engineering", or from the [AI Engineer talk on the topic](https://www.youtube.com/watch?v=8kMaTybvDUw).
9+
But here's what we discovered: **you can get really far with today's models if you embrace core context engineering principles**.
1010

11-
I'm stoked to share with what we've been up to since then and why it matters to every team that's serious about software, and get feedback from the community here.
11+
We have been iterating a lot on techniques and have found some workflows that let current models handle 300k LOC Rust codebases, ship a week's worth of work in a day, and maintain code quality that passes expert review. The key is something we call "frequent intentional compaction" - deliberately structuring how you feed context to the AI throughout the development process.
1212

13-
**Note** - if you prefer video - this post is based on [a talk that was recorded at YC](https://hlyr.dev/ace) on Aug 20th
13+
This isn't another "10x your productivity" pitch. What follows is some of the learnings that convinced me that AI for coding is not just for toys and prototypes, but rather a deeply technical engineering craft.
1414

15-
### From 12-factor agents to context engineering for coding agents
15+
### The Stanford study that changed everything
1616

17-
I have 2 favorite talks from AI Engineer 2025. (incidentally, the only two AIE talks with [more views than 12-factor agents](https://www.youtube.com/@aiDotEngineer/videos))
17+
Two talks from AI Engineer 2025 fundamentally shaped my thinking about this problem.
1818

1919
The first is [Sean Grove's talk on "Specs are the new code"](https://www.youtube.com/watch?v=8rABwKRsec4) and the second is [the Stanford study on AI's impact on developer productivity](https://www.youtube.com/watch?v=tbDDYKRFjhk).
2020

2121
Sean argued that we’re all *vibe coding wrong*. The idea of chatting with an AI agent for two hours, specifying what you want, and then throwing away all the prompts while committing only the final code… is like a Java developer compiling a JAR and checking in the compiled binary while throwing away the source.
2222

2323
Sean proposes that in the AI future, the specs will become the real code. That in two years, you'll be opening python files in your IDE with about the same frequency that, today, you might open up a hex editor to read assembly (which, for most of us, is never).
2424

25-
Yegor's talk on developer productivity tackled an orthogonal problem. They analyzed commits from 100k developers and found, among other things,
25+
[Yegor's talk on developer productivity](https://www.youtube.com/watch?v=tbDDYKRFjhk) tackled an orthogonal problem. They analyzed commits from 100k developers and found, among other things,
2626

2727
1. That AI tools often lead to a lot of rework, diminishing the perceived productivity gains
28-
2. That AI tools work well for greenfield projects, but are often counter-productive for brownfield codebases and complex tasks
2928

3029
<img width="2008" height="1088" alt="image" src="https://github.com/user-attachments/assets/f7cec497-3ee2-47d1-8f91-a18210625e19" />
3130

31+
2. That AI tools work well for greenfield projects, but are often counter-productive for brownfield codebases and complex tasks
32+
3233
<img width="1326" height="751" alt="Screenshot 2025-08-29 at 10 55 32 AM" src="https://github.com/user-attachments/assets/06f03232-f9d9-4a92-a182-37056bf877a4" />
3334

3435
This matched what I heard talking with founders:
@@ -45,9 +46,15 @@ The general vibe on AI-coding for hard stuff tends to be
4546
Heck even [Amjad](https://x.com/amasad) was on a [lenny's podcast 9 months ago](https://www.lennysnewsletter.com/p/behind-the-product-replit-amjad-masad) talking about how PMs use Replit agent to prototype new stuff and then they hand it off to engineers to implement for production.
4647
(Disclaimer: i haven't caught up with him recently (ok, ever), this stance may have changed)
4748

48-
Whenever I hear "Maybe someday when the models are smart" I generally leap to exclaim **that’s what context engineering is all about**: getting the most out of *today’s* models.
49+
Whenever I hear "Maybe someday when the models are smart" I generally leap to exclaim **that's what context engineering is all about**: getting the most out of *today's* models.
50+
51+
### What's actually possible today
52+
53+
I'll deep dive on this a bit futher down, but to prove this isn't just theory, let me outline a concrete example. A few weeks ago, I decided to test our techniques on [BAML](https://github.com/BoundaryML/baml), a 300k LOC Rust codebase for a programming language that works with LLMs. I'm at best an amateur Rust dev and had never touched the BAML codebase before.
54+
55+
Within an hour or so, I had a [PR fixing a bug](https://github.com/BoundaryML/baml/pull/2259#issuecomment-3155883849) which was approved by the maintainer the next morning. A few weeks later, [@hellovai](https://x.com/hellovai) and I paired on shipping 35k LOC to BAML, adding [cancellation support](https://github.com/BoundaryML/baml/pull/2357) and [WASM compilation](https://github.com/BoundaryML/baml/pull/2330) - features the team estimated would take a senior engineer 3-5 days each. We got both draft prs ready in about 7 hours.
4956

50-
While obsessing over these talks and llms and context for the last few months, I think we found something really cool.
57+
Again, this is all built around a workflow we call [frequent intentional compaction](#what-works-even-better-frequent-intentional-compaction) - essentially designing your entire development process around context management, keeping utilization in the 40-60% range, and building in high-leverage human review at exactly the right points. We use a "research, plan, implement" workflow, but the core capabilities/learnings here are FAR more general than any specific workflow or set of prompts.
5158

5259
### Our weird journey to get here
5360

@@ -87,7 +94,7 @@ I'll dive into:
8794

8895
### But first: The Naive Way to manage agent context
8996

90-
Most of us start by using a coding agent like a chatbot. You talk (or shout) back and forth with it, vibing your way through a problem until you either run out of context, give up, or the agent starts apologizing.
97+
Most of us start by using a coding agent like a chatbot. You talk (or [drunkenly shout](https://ghuntley.com/six-month-recap/#:~:text=Last%20week%2C%20over%20Zoom%20margaritas%2C%20a%20friend%20and%20I%20reminisced%20about%20COBOL.)) back and forth with it, vibing your way through a problem until you either run out of context, give up, or the agent starts apologizing.
9198

9299
<img width="1328" height="741" alt="Screenshot 2025-08-29 at 11 08 34 AM" src="https://github.com/user-attachments/assets/51a46854-c542-4515-afbb-a2fe26970809" />
93100

0 commit comments

Comments
 (0)