Commit 99ae673

feat: Edit with self correcting patches (#441)
The problem: line editing is unreliable; full writes work but can have side effects and devour tokens. The main challenge is that LLMs suck at counting reliably. This solution removes the need for counting and builds on the assumption that LLMs were trained on plenty of patch data. The line numbers in a generated patch are likely still close, though, which helps us deal with any ambiguity in the patch.

A bonus is that with this we can support multiple edits in a file with a single LLM call and minimal tokens.

How it works:

* First, parse the patch into a lossy format we can work with
* Iterate over the lines in the target file; if a line matches a hunk, create a candidate
* If the following file line matches the next line of any current candidate, keep that candidate; otherwise ditch it
* Finally, recalculate the source and dest lines and ranges, and re-render the patch

The algorithm supports ambiguity (multiple hunks matching). Still needs to be dealt with.

TODO:

- [x] Deal with ambiguity
- [x] More tests for multiple hunks
- [x] Test with diffy if the regenerated patches are correct
- [x] Add it all to the tool
- [x] Run some evals

If the evals pass, we'll keep it in. After we run a set of SWE, if we see no issues, as far as I'm concerned we can make it the default (and write a cool blog article).

As an aside, it should also be possible to Cow/borrow most of the inner values and have this be near zero copy. The way it will be used, however, it's nowhere near a bottleneck. Cool exercise though.
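The matching steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `HunkLine`, `Hunk`, `find_candidates`, and `render_header` are hypothetical names, lines are compared by exact equality, and the header rendering ignores the special cases of empty ranges. It only shows the idea of locating a hunk by content instead of trusting the line numbers the LLM wrote.

```rust
/// One line of a hunk (hypothetical simplified model, not the commit's types).
#[derive(Debug, Clone, PartialEq)]
enum HunkLine {
    Context(String),
    Removed(String),
    Added(String),
}

/// A hunk without line numbers; those are recomputed after matching.
#[derive(Debug, Clone)]
struct Hunk {
    lines: Vec<HunkLine>,
}

impl Hunk {
    /// Lines the hunk expects to find in the current file (context + removed).
    fn source_lines(&self) -> Vec<&str> {
        self.lines
            .iter()
            .filter_map(|l| match l {
                HunkLine::Context(s) | HunkLine::Removed(s) => Some(s.as_str()),
                HunkLine::Added(_) => None,
            })
            .collect()
    }

    /// Number of lines the hunk produces in the patched file (context + added).
    fn dest_len(&self) -> usize {
        self.lines
            .iter()
            .filter(|l| !matches!(l, HunkLine::Removed(_)))
            .count()
    }
}

/// Returns every 0-based file index at which the hunk's source lines match in full.
/// More than one result means the hunk is ambiguous.
fn find_candidates(file: &[&str], hunk: &Hunk) -> Vec<usize> {
    let expected = hunk.source_lines();
    if expected.is_empty() {
        return Vec::new();
    }
    let mut open: Vec<(usize, usize)> = Vec::new(); // (start index, lines matched so far)
    let mut complete = Vec::new();
    for (idx, line) in file.iter().enumerate() {
        // Keep a candidate only if this file line equals its next expected line.
        open.retain_mut(|(_, matched)| {
            if expected[*matched] == *line {
                *matched += 1;
                true
            } else {
                false
            }
        });
        // A line equal to the hunk's first line opens a new candidate.
        if expected[0] == *line {
            open.push((idx, 1));
        }
        // Move fully matched candidates to the result list.
        open.retain(|&(start, matched)| {
            if matched == expected.len() {
                complete.push(start);
                false
            } else {
                true
            }
        });
    }
    complete
}

/// Re-renders the hunk header from the matched position.
/// `offset` is the net number of lines added by earlier hunks in the same file.
fn render_header(start: usize, hunk: &Hunk, offset: isize) -> String {
    let src_start = start + 1; // unified diff line numbers are 1-based
    let dst_start = (src_start as isize + offset) as usize;
    format!(
        "@@ -{},{} +{},{} @@",
        src_start,
        hunk.source_lines().len(),
        dst_start,
        hunk.dest_len()
    )
}
```

For a file with the lines `a, b, c, a, b, d` and a hunk with context `a`, removed `b`, added `x`, context `d`, the first `a` opens a candidate that is dropped at `c`, and the second `a` yields the only complete match. When several candidates survive, a caller could pick the one closest to the line number in the original patch, in line with the commit message's remark that the numbers are likely still close.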
1 parent e70c871 commit 99ae673

22 files changed: +1132 -42 lines changed

Cargo.lock

Lines changed: 20 additions & 1 deletion

Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -96,6 +96,7 @@ update-informer = { version = "1.2.0", features = [
   "ureq",
   "rustls-tls",
 ], default-features = false }
+diffy = "0.4.2"

 # Something is still pulling in libssl, this is a quickfix and should be investigated
 [target.'cfg(linux)'.dependencies]
@@ -146,6 +147,7 @@ evaluations = []


 [patch.crates-io]
+# diffy = { git = "https://github.com/timonv/diffy", branch = "fix/debug-wrong-line" }
 # arrow = { version = "=53.2.0", optional = false }
 # arrow-arith = { version = "=53.2.0", optional = false }
 # swiftide = { path = "../swiftide/swiftide" }

README.md

Lines changed: 6 additions & 4 deletions
@@ -230,13 +230,15 @@ Additionally, kwaak provides a number of slash commands, `/help` will show all a

 ### How does it work?

-On initial boot up, Kwaak will index your codebase. This can take a while, depending on the size. Once indexing has been completed once, subsequent startups will be faster. Indexes are stored with [duckdb](https://duckdb.org), and indexing is cached with [redb](https://github.com/cberner/redb).
+On initial boot up, Kwaak will index your codebase. This can take a while, depending on the size. Once indexing has been completed once, subsequent startups will be faster. Indexes are stored with [duckdb](https://duckdb.org). Kwaak uses the index to provide context to the agents.

 Kwaak provides a chat interface similar to other LLM chat applications. You can type messages to the agent, and the agent will try to accomplish the task and respond.

 When starting a chat, the code of the current branch is copied into an on-the-fly created docker container. This container is then used to run the code and execute the commands.

-After each chat completion, kwaak will lint, commit, and push the code to the remote repository if any code changes have been made. Kwaak can also create a pull request. Pull requests include an issue link to #48. This helps us identify the success rate of the agents, and also enforces transparency for code reviewers.
+After each chat completion, kwaak can lint, commit, and push the code to the remote repository if any code changes have been made. Kwaak can also create a pull request. Pull requests include an issue link to #48. This helps us identify the success rate of the agents, and also enforces transparency for code reviewers. This behaviour is fully configurable.
+
+Kwaak uses patch based editing by default. This means that only the changed lines are sent to the agent. This is more efficient. If you experience issues, try changing the edit mode to `whole` or `line`.

 <p align="right">(<a href="#readme-top">back to top</a>)</p>

@@ -376,7 +378,7 @@ max_elapsed_time_sec = 120
 #### Other configuration

 - **`agent_custom_constraints`**: Additional constraints / instructions for the agent.
-  These are passes to the agent in the system prompt and are rendered in a list. If you
+  These are passes to the agent in the system prompt. If you
   intend to use more complicated instructions, consider adding a file to read in the
   repository instead.
 - **`cache_dir`, `log_dir`**: Directories for cache and logs. Defaults are within your system's cache directory.
@@ -386,7 +388,7 @@ max_elapsed_time_sec = 120
 - **`otel_enabled`**: Enables OpenTelemetry tracing if set and respects all the standard OpenTelemetry environment variables.
 - **`tool_executor`**: Defaults to `docker`. Can also be `local`. We **HIGHLY** recommend using `docker` for security reasons unless you are running in a secure environment.
 - **`tavily_api_key`**: Enables the agent to use [tavily](https://tavily.com) for web search. Their entry-level plan is free. (we are not affiliated)
-- **`agent_edit_mode`**: Defaults to `whole` (write full files at the time). If you experience issues with (very) large files, you can experiment with `line` edits.
+- **`agent_edit_mode`**: Defaults to `patch`. Other options are `whole` and `line`. If you experience issues, try changing the edit mode. `whole` will always write the full file. This consumes more tokens and can have side effects.
 - **`git.auto_push_remote`**: Enabled by default if a github key is present. Automatically pushes to the remote repository after each chat completion. You can disable this by setting it to `false`.
 - **`git.auto_commit_disabled`**: Opt-out of automatic commits after each chat completion.
 - **`tools`**: A list of tool names to enable or disable.

src/agent/agents/coding.rs

Lines changed: 7 additions & 0 deletions
@@ -213,6 +213,13 @@ pub fn build_system_prompt(repository: &Repository) -> Result<Prompt> {
         ].into_iter().map(Into::into));
     }

+    if repository.config().agent_edit_mode.is_patch() {
+        constraints.extend([
+            "Prefer editing files with `patch_file` over `write_file`".into(),
+            "If `patch_file` continues to be troublesome, defer to `write_file` instead".into(),
+        ]);
+    }
+
     if repository.config().endless_mode {
         constraints
             .push("You cannot ask for feedback and have to try to complete the given task".into());
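For readers without the rest of the repository at hand, the hunk above could be read against a sketch like the one below. It is an assumption-laden illustration: the diff only shows `AgentEditMode::Patch` and a call to `is_patch()`, so the `Whole` and `Line` variant names are inferred from the README option names, the hand-written `is_patch` stands in for however the real predicate is generated, and `edit_constraints` is a hypothetical helper that does not exist in the commit.

```rust
/// Hypothetical model of the edit mode setting; only `Patch` is confirmed by the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum AgentEditMode {
    /// Always write the full file (assumed variant name).
    Whole,
    /// Line-based edits (assumed variant name).
    Line,
    /// Patch-based edits; the README in this commit states this is the default.
    #[default]
    Patch,
}

impl AgentEditMode {
    /// Hand-written stand-in for the `is_patch()` predicate used in the diff.
    fn is_patch(self) -> bool {
        matches!(self, Self::Patch)
    }
}

/// Hypothetical helper: the extra system prompt constraints for a given mode.
fn edit_constraints(mode: AgentEditMode) -> Vec<String> {
    let mut constraints = Vec::new();
    if mode.is_patch() {
        constraints.extend([
            "Prefer editing files with `patch_file` over `write_file`".to_string(),
            "If `patch_file` continues to be troublesome, defer to `write_file` instead"
                .to_string(),
        ]);
    }
    constraints
}
```

The point of the second constraint is that `write_file` stays available as a fallback, so an agent that keeps producing patches that cannot be matched still has a way to complete the edit.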

src/agent/conversation_summarizer.rs

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ impl ConversationSummarizer {
         }
     }

+    #[must_use]
     pub fn summarize_hook(self) -> impl AfterEachFn {
         move |agent| {
             let llm = self.llm.clone();

src/agent/mod.rs

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 pub mod agents;
 mod commit_and_push;
-mod conversation_summarizer;
+pub mod conversation_summarizer;
 pub mod env_setup;
 pub mod running_agent;
 pub mod session;

src/agent/session.rs

Lines changed: 4 additions & 0 deletions
@@ -402,6 +402,10 @@ pub fn available_tools(
             tools.push(tools::replace_lines());
             tools.push(tools::add_lines());
         }
+        AgentEditMode::Patch => {
+            tools.push(tools::read_file_with_line_numbers());
+            tools.push(tools::patch_file());
+        }
     }

     // gitHub-related tools

src/agent/tools/mod.rs

Lines changed: 2 additions & 0 deletions
@@ -1,7 +1,9 @@
 mod delegate_agent;
+mod patch_file;
 mod replace_lines;

 pub use delegate_agent::DelegateAgent;
+pub use patch_file::patch_file;
 pub use replace_lines::replace_lines;

 use std::sync::Arc;
