# GitHub Coding Agent

This is a coding agent that reads a GitHub issue, writes code, and submits a PR with a fix. It currently scores 5% on [SWE-Bench Lite](https://www.swebench.com/). Demo:

https://github.com/user-attachments/assets/3f214430-aeb8-412e-ad5e-0c173e0cfbc7

## What You Need
- A GitHub account
- git installed on your computer
- Python 3.10

## Setup Steps

1. Start Llama Stack:

This uses the Fireworks distribution of Llama Stack, but any other distribution that supports the Llama 3.3 70B model will also work (405B support coming soon).
```bash
export LLAMA_STACK_PORT=5000
export FIREWORKS_API_KEY=your_key_here
docker run -it \
  -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
  -v ~/.llama:/root/.llama \
  llamastack/distribution-fireworks \
  --port $LLAMA_STACK_PORT \
  --env FIREWORKS_API_KEY=$FIREWORKS_API_KEY
```

2. Get a GitHub token:
   - Go to https://github.com/settings/personal-access-tokens/new
   - Enter a name
   - Pick which repositories it has access to
   - Give it these permissions (the agent needs at least read and write access to the repository's Contents, Issues, and Pull requests)
   - Create the token and copy it
3. Set up your `.env` file:
```bash
cp .env.example .env
```
Then open `.env` and add your GitHub token:
```
GITHUB_API_KEY=github_pat_11SDF...
```
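Before running the agent, you can sanity-check that the token variable actually loads from `.env`. This is an optional snippet, not part of the repo; it uses a throwaway directory and a placeholder value so it is safe to run anywhere:

```shell
# Optional sanity check (not part of the repo): verify the token variable
# is present after sourcing .env. Demonstrated on a throwaway .env with a
# placeholder value.
demo="$(mktemp -d)"
cd "$demo"
printf 'GITHUB_API_KEY=github_pat_PLACEHOLDER\n' > .env

set -a          # export everything the sourced file defines
. ./.env
set +a

if [ -z "${GITHUB_API_KEY:-}" ]; then
  echo "GITHUB_API_KEY is missing from .env" >&2
  exit 1
fi
echo "token loaded"
```

For your real setup, run the same `set -a; . ./.env; set +a` check in the repo root instead of the throwaway directory.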

4. Create a virtual environment:
```bash
# python -m venv .venv should also work, but this has only been tested on Python 3.10
conda create -n llama-stack-coding-agent python=3.10
conda activate llama-stack-coding-agent
```

5. Install the dependencies:

```bash
pip install -r requirements.txt
```

6. Start the agent:
```bash
python -m llama_agent.main --issue-url your_github_issue_url

# For example:
# python -m llama_agent.main --issue-url https://github.com/example-user/example-repo/issues/34
```
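The `--issue-url` argument carries everything needed to locate the issue. As a rough illustration (this is not the agent's actual code), the owner, repository, and issue number can be pulled out of the URL with plain shell parameter expansion:

```shell
# Hypothetical sketch of what an issue URL encodes; the agent's real
# parsing may differ.
issue_url="https://github.com/example-user/example-repo/issues/34"

path="${issue_url#https://github.com/}"   # example-user/example-repo/issues/34
owner="${path%%/*}"                       # text before the first slash
rest="${path#*/}"                         # example-repo/issues/34
repo="${rest%%/*}"                        # text before the next slash
issue_number="${path##*/}"                # text after the last slash

echo "$owner $repo $issue_number"
```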

## What It Does
- Reads GitHub issues
- Clones the repository under `sandbox/`
- Creates a fix locally
- Makes a new branch
- Submits a Pull Request with the fixes
- If it can't fix something, it leaves a comment explaining why
- Only supports Llama 3.3 70B at the moment
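The git side of that workflow can be sketched with ordinary git commands. The snippet below uses a local throwaway repository standing in for GitHub, and the branch and commit names are hypothetical; the real agent drives the final PR step through the GitHub API rather than plain git:

```shell
# Sketch of the agent's git workflow against a local stand-in repository.
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for the upstream GitHub repository
git init -q upstream
git -C upstream -c user.email=a@b -c user.name=demo \
  commit -q --allow-empty -m "initial"

# 1. Clone the repository under sandbox/
mkdir -p sandbox
git clone -q upstream sandbox/repo
cd sandbox/repo

# 2. Work on a new branch, never on main (hypothetical branch name)
git checkout -q -b fix/issue-34

# 3. Apply the fix locally and commit it
echo "patched" > fix.txt
git add fix.txt
git -c user.email=a@b -c user.name=demo commit -q -m "Fix issue #34"

# 4. Push only the new branch; opening the PR happens via the GitHub API
git push -q origin fix/issue-34
```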

## Is It Safe?

Yes - the LLM:

- Doesn't have access to git tools or the GitHub API (deterministic application code runs the git commands on its behalf)
- Doesn't execute any commands or arbitrary code
- Only works in a sandbox folder
- Won't push to your main branch
- Only creates new branches
- Won't close or merge Pull Requests
- Can't close/edit issues
## Evaluation Results
This currently scores 5% on SWE-Bench Lite using Llama 3.3 70B; there is plenty of room for improvement. See the evaluation results here: https://huggingface.co/datasets/aidando73/llama-codes-swe-bench-evals/tree/main.