2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -1,7 +1,7 @@
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
fail_fast: true
default_stages: [commit]
default_stages: [pre-commit]

repos:
- repo: https://github.com/psf/black
131 changes: 89 additions & 42 deletions README.md
@@ -12,78 +12,125 @@ The OG code genereation experimentation platform!
If you are looking for the evolution that is an opinionated, managed service – check out gptengineer.app.

If you are looking for a well maintained hackable CLI for – check out aider.
# gpt-engineer

GitHub Repo stars · Discord Follow · License · GitHub Issues or Pull Requests · GitHub Release · Twitter Follow

The OG code generation experimentation platform!

If you are looking for the evolution that is an opinionated, managed service – check out gptengineer.app.

If you are looking for a well maintained hackable CLI – check out aider.


gpt-engineer lets you:

- Specify software in natural language
- Sit back and watch as an AI writes and executes the code
- Ask the AI to implement improvements


## Getting Started

### Install gpt-engineer

For **stable** release:
For stable release:

```bash
python -m pip install gpt-engineer
```

- `python -m pip install gpt-engineer`
For development:

For **development**:
- `git clone https://github.com/gpt-engineer-org/gpt-engineer.git`
- `cd gpt-engineer`
- `poetry install`
- `poetry shell` to activate the virtual environment
```bash
git clone https://github.com/gpt-engineer-org/gpt-engineer.git
cd gpt-engineer
poetry install
poetry shell # activate the virtual environment
```

We actively support Python 3.10 - 3.12. The last version to support Python 3.8 - 3.9 was [0.2.6](https://pypi.org/project/gpt-engineer/0.2.6/).
We actively support Python 3.10 - 3.12. The last version to support Python 3.8 - 3.9 was 0.2.6.

### Setup API key

Choose **one** of:
- Export env variable (you can add this to .bashrc so that you don't have to do it each time you start the terminal)
- `export OPENAI_API_KEY=[your api key]`
- .env file:
- Create a copy of `.env.template` named `.env`
- Add your OPENAI_API_KEY in .env
- Custom model:
- See [docs](https://gpt-engineer.readthedocs.io/en/latest/open_models.html), supports local model, azure, etc.
Choose one of:

Check the [Windows README](./WINDOWS_README.md) for Windows usage.
- Export an environment variable (add it to your shell profile so you don't need to set it every time):

**Other ways to run:**
- Use Docker ([instructions](docker/README.md))
- Do everything in your browser:
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://github.com/gpt-engineer-org/gpt-engineer/codespaces)
```bash
export OPENAI_API_KEY=[your api key]
```

- Use a `.env` file:
- Create a copy of `.env.template` named `.env`
- Add your `OPENAI_API_KEY` in `.env`

- Custom model: See the docs for instructions (supports local models, Azure, etc.).

Check the `WINDOWS_README.md` file for Windows-specific instructions.

Other ways to run:

- Use Docker (see `docker/README.md`)
- Open in GitHub Codespaces


## Usage

### Create new code (default usage)
- Create an empty folder for your project anywhere on your computer
- Create a file called `prompt` (no extension) inside your new folder and fill it with instructions
- Run `gpte <project_dir>` with a relative path to your folder
- For example: `gpte projects/my-new-project` from the gpt-engineer directory root with your new folder in `projects/`

1. Create an empty folder for your project.
2. Inside that folder create a file named `prompt` (no extension) and fill it with your instructions.
3. From the gpt-engineer repo root run:

```bash
gpte <project_dir>
# example: gpte projects/my-new-project
```


### Improve existing code
- Locate a folder with code which you want to improve anywhere on your computer
- Create a file called `prompt` (no extension) inside your new folder and fill it with instructions for how you want to improve the code
- Run `gpte <project_dir> -i` with a relative path to your folder
- For example: `gpte projects/my-old-project -i` from the gpt-engineer directory root with your folder in `projects/`

### Benchmark custom agents
- gpt-engineer installs the binary 'bench', which gives you a simple interface for benchmarking your own agent implementations against popular public datasets.
- The easiest way to get started with benchmarking is by checking out the [template](https://github.com/gpt-engineer-org/gpte-bench-template) repo, which contains detailed instructions and an agent template.
- Currently supported benchmark:
- [APPS](https://github.com/hendrycks/apps)
- [MBPP](https://github.com/google-research/google-research/tree/master/mbpp)
1. Locate the folder containing the code you want to improve.
2. Create a `prompt` file inside it with instructions for the improvement.
3. Run:

```bash
gpte <project_dir> -i
# example: gpte projects/my-old-project -i
```


### Benchmarking

The `gpt-engineer` package installs a `bench` binary for benchmarking agent implementations. See the `gpte-bench-template` repo for a starter template.

Supported datasets include APPS and MBPP.

The community has started work on various benchmarking initiatives, as described in [this Loom](https://www.loom.com/share/206805143fbb4302b5455a5329eaab17?sid=f689608f-8e49-44f7-b55f-4c81e9dc93e6) video.

### Research
Some of our community members have worked on different research briefs that could be taken further. See [this document](https://docs.google.com/document/d/1qmOj2DvdPc6syIAm8iISZFpfik26BYw7ZziD5c-9G0E/edit?usp=sharing) if you are interested.

## Terms
By running gpt-engineer, you agree to our [terms](https://github.com/gpt-engineer-org/gpt-engineer/blob/main/TERMS_OF_USE.md).
See the `docs` and community resources for research notes and briefs.


## Notes

- Limiting context window: see `docs/context_window.md` for strategies to control token usage and avoid truncation.
- By running gpt-engineer you agree to the terms in `TERMS_OF_USE.md`.


## Links & Community

- Roadmap: `ROADMAP.md`
- Governance: `GOVERNANCE.md`
- Contributing: `.github/CONTRIBUTING.md`
- Discord: https://discord.gg/8tcDQ89Ej2



## Relation to gptengineer.app (GPT Engineer)
[gptengineer.app](https://gptengineer.app/) is a commercial project for the automatic generation of web apps.
It features a UI for non-technical users connected to a git-controlled codebase.
The gptengineer.app team is actively supporting the open source community.


150 changes: 150 additions & 0 deletions docs/context_window.md
@@ -0,0 +1,150 @@
# Context window (token limit)

This note explains what a context window (token limit) is, why it matters when using LLMs, and practical strategies to work within it.

## What is the context window?

A model's context window (also called token limit) is the maximum number of tokens the model can accept as input (and sometimes include in output). Tokens roughly correspond to pieces of words; common English text averages ~0.7–1.3 tokens per word depending on vocabulary and punctuation.
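
One way to inspect real token counts (assuming the `tiktoken` package, which ships tokenizers for OpenAI models):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Context windows limit how much a model can read at once."
tokens = enc.encode(text)
print(len(tokens))  # the sentence's token count, typically close to its word count
```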

If your prompt + conversation + document history exceed the context window, older content will be truncated (dropped) or the model will return an error depending on the client.

## Why it matters

- Cost: Many API providers bill per token. Sending more tokens increases costs.
- Performance: Larger inputs increase latency and can require more memory on the client/server side.
- Truncation / information loss: When the context exceeds the limit, parts of history or documents are omitted, which can break coherence, reasoning, or cause the model to lose earlier instructions or facts.
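
As a back-of-the-envelope example of the cost point (the per-token price is an illustrative assumption, not a real quote):

```python
PRICE_PER_1M_INPUT_TOKENS = 3.00   # USD, hypothetical rate
tokens_per_request = 50_000
requests_per_day = 200

daily_cost = tokens_per_request * requests_per_day * PRICE_PER_1M_INPUT_TOKENS / 1_000_000
print(f"${daily_cost:.2f} per day")  # $30.00 per day under these assumptions
```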

## Practical strategies

Below are three pragmatic strategies to manage content so it fits the context window while preserving useful information.

### 1) Truncation (simple, predictable)

When total tokens are too large, drop old or less-important content. This is easy, predictable, and safe for streaming/long chats. Use heuristics to drop older messages or bulky items (images, raw code dumps) first.

Pros: simple, low compute overhead.
Cons: may drop crucial earlier context.

A minimal Python sketch (`token_count` stands in for whatever tokenizer-based counter you use):

```python
def build_payload(system_prompt, history, new_message, max_tokens, token_count):
    # Always keep the system prompt and the newest message.
    payload = [system_prompt, new_message]
    # Walk the history from most recent to oldest, stopping at the budget.
    for msg in reversed(history):
        if token_count(payload) + token_count(msg) > max_tokens:
            break
        payload.insert(1, msg)  # keeps retained turns in chronological order
    return payload
```

Tips:
- Keep a sliding window of the most recent N messages.
- Prefer to keep the system instructions and the most recent user/assistant turn.
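
For example, with a naive whitespace token estimate (purely illustrative; use a real tokenizer in practice):

```python
def token_count(item):
    # Crude stand-in: count whitespace-separated words.
    text = item if isinstance(item, str) else " ".join(item)
    return len(text.split())

history = [f"turn {i}: " + "word " * 50 for i in range(20)]
payload = build_payload("You are a helpful assistant.", history,
                        "Summarize the project so far.",
                        max_tokens=200, token_count=token_count)
print(len(payload))  # system prompt + newest message + the recent turns that fit
```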

### 2) Summarization / compaction (preserve meaning)

Compress older content into a shorter summary that preserves important facts. Periodically summarize the conversation or documents and store the summary in place of raw items. This preserves context at a lower token cost.

Pros: maintains semantic information; better for long-running sessions.
Cons: requires extra API calls or compute for summarization and careful prompt engineering to avoid losing critical specifics.

A minimal Python sketch (`total_tokens`, `select_oldest_chunk`, and `summarize_with_model` are placeholders for your own helpers):

```python
def compact_history(history, summary_threshold):
    # Fold the oldest chunk into a summary until the history fits the threshold.
    while total_tokens(history) > summary_threshold:
        chunk = select_oldest_chunk(history)
        summary = summarize_with_model(chunk)  # costs an extra model call
        history = [msg for msg in history if msg not in chunk]
        history.insert(0, {"role": "summary", "content": summary})
    return history

# Then build the payload as in truncation, prioritizing summaries + recent messages.
```

Implementation notes:
- Use structured summaries when possible: facts, entities, decisions, open tasks.
- Keep both a human-readable summary and a small machine-friendly key-value store for retrieval.
- Re-summarize incrementally: each time you summarize, append to the summary rather than re-summarize everything from scratch.
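
One possible shape for such a structured summary record (the fields are an assumption, not a fixed schema):

```python
structured_summary = {
    "facts": ["deploys run from the main branch"],
    "entities": {"service": "billing-api", "owner": "platform team"},
    "decisions": ["use Postgres rather than DynamoDB"],
    "open_tasks": ["migrate the nightly cron jobs"],
    "source_range": "messages 1-40",  # pointer back to the raw content
}
```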

### 3) Configuration option (developer-facing control)

Expose a configuration option to tune how the system behaves when approaching the token limit. Example knobs:

- max_context_tokens: hard limit used when composing payloads.
- strategy: one of ["truncate", "summarize", "hybrid"].
- preserve_system_prompts: boolean; always keep system prompts.
- preserve_recent_turns: N recent user/assistant turns to always keep.

This lets users choose tradeoffs appropriate to their use case (cost vs. fidelity).

Example configuration object (shown as a Python dict):

```python
config = {
    "max_context_tokens": 32000,
    "strategy": "hybrid",             # one of: "truncate", "summarize", "hybrid"
    "preserve_system_prompts": True,
    "preserve_recent_turns": 6,
    "summary_chunk_size": 4000,       # tokens per summarization chunk
}
```

Hybrid strategy: try to include as much recent raw context as possible, then include summaries of older content, and finally truncate if still necessary.

## Hybrid end-to-end sketch

A Python sketch of the hybrid flow (`chunked`, `get_or_create_summary`, and `truncate_least_important` are placeholders for your own helpers):

```python
def prepare_context(history, new_message, config, token_count):
    budget = config["max_context_tokens"]
    payload = [history.system_prompt, new_message]

    # Step 1: keep as many recent turns as the budget allows.
    for msg in reversed(history.recent(config["preserve_recent_turns"])):
        if token_count(payload) + token_count(msg) <= budget:
            payload.insert(1, msg)

    # Step 2: add summaries of older content while space remains.
    older = history.older_than_recent()
    for chunk in chunked(older, config["summary_chunk_size"]):
        summary = get_or_create_summary(chunk)
        if token_count(payload) + token_count(summary) <= budget:
            payload.insert(1, summary)  # summaries sit before the recent turns
        else:
            break

    # Step 3: if still too large, drop the least-important remaining items.
    if token_count(payload) > budget:
        payload = truncate_least_important(payload, budget)

    return payload
```

## Troubleshooting notes & edge cases

- "Off-by-one" token errors: different tokenizers or APIs may count tokens differently. Always leave a safety buffer (e.g., 32–256 tokens) when computing allowed tokens for model input + expected output.

- Unexpected truncation of system messages: ensure system prompts are treated as highest priority and pinned into the payload.

- Cost spikes when summarizing: summarization itself consumes tokens (both input and output), so amortize summarization by doing it infrequently or offline when possible.

- Losing exact data (e.g., code or long tables): summaries can lose exact formatting or specifics. For cases where exactness matters, keep the original as a downloadable artifact and include a short index or pointer in the summary.

- Very long single documents: chunk documents into logical sections and summarize each section, or use retrieval (vector DB) + short relevant context injection instead of sending the whole document.

- Multi-user/parallel sessions: keep per-session histories and shared summaries carefully namespaced to avoid mixing users' contexts.
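
The safety-buffer arithmetic mentioned in the first item might look like this (all sizes are illustrative assumptions):

```python
MODEL_LIMIT = 128_000      # advertised context window, hypothetical
EXPECTED_OUTPUT = 1_000    # tokens reserved for the model's reply
SAFETY_BUFFER = 256        # cushion for tokenizer count mismatches

input_budget = MODEL_LIMIT - EXPECTED_OUTPUT - SAFETY_BUFFER  # 126,744 tokens
```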

## Additional suggestions

- Instrument token usage and provide metrics to users (tokens per request, cost per request, average history length). This helps tune thresholds.
- Provide a debugging mode that prints the token counts and what was dropped or summarized before each request.
- When integrating with retrieval (vector DBs), index long documents and retrieve only the most relevant chunks to inject into prompts rather than pushing entire documents.
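
A minimal debugging hook along those lines (all names are illustrative):

```python
def log_request(payload, dropped, summarized, token_count):
    # Report what a request costs and what was left out before sending it.
    print(f"tokens sent: {token_count(payload)}")
    print(f"messages dropped: {len(dropped)}, chunks summarized: {len(summarized)}")
```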

## References and further reading

- Tokenization (how tokens map to words) depends on the model's tokenizer (BPE, byte-level BPE, etc.).
- For long-running agents, consider combining summarization with retrieval-augmented generation (RAG) patterns.


2 changes: 1 addition & 1 deletion docs/examples/open_llms/README.md
@@ -53,4 +53,4 @@ export MODEL_NAME="CodeLlama"
python examples/open_llms/langchain_interface.py
```

That's it 🤓 time to go back [to](/docs/open_models.md#running-the-example) and give `gpte` a try.
That's it 🤓 time to go back [to](../../open_models.md) and give `gpte` a try.
1 change: 1 addition & 0 deletions docs/index.rst
@@ -16,6 +16,7 @@ Welcome to GPT-ENGINEER's Documentation
windows_readme_link
open_models.md
tracing_debugging.md
context_window.md

.. toctree::
:maxdepth: 2
4 changes: 2 additions & 2 deletions docs/introduction.md
@@ -4,9 +4,9 @@
<br>

## Get started
[Here’s](/en/latest/installation.html) how to install ``gpt-engineer``, set up your environment, and start building.
[Here’s](installation.rst) how to install ``gpt-engineer``, set up your environment, and start building.

We recommend following our [Quickstart](/en/latest/quickstart.html) guide to familiarize yourself with the framework by building your first application with ``gpt-engineer``.
We recommend following our [Quickstart](quickstart.rst) guide to familiarize yourself with the framework by building your first application with ``gpt-engineer``.

<br>

2 changes: 1 addition & 1 deletion docs/open_models.md
@@ -72,7 +72,7 @@ Feel free to try out larger models on your hardware and see what happens.
Running the Example
==================

To see that your setup works check [test open LLM setup](examples/test_open_llm/README.md).
To see that your setup works check [test open LLM setup](examples/open_llms/README.md).

If above tests work proceed 😉

10 changes: 5 additions & 5 deletions docs/quickstart.rst
@@ -5,13 +5,13 @@ Quickstart
Installation
============

To install LangChain run:
To install gpt-engineer run:

.. code-block:: console

$ python -m pip install gpt-engineer

For more details, see our [Installation guide](/instllation.html).
For more details, see our `Installation guide <installation.rst>`_.

Setup API Key
=============
@@ -29,9 +29,9 @@ Choose one of the following:
- Create a copy of ``.env.template`` named ``.env``
- Add your ``OPENAI_API_KEY`` in .env

- If you want to use a custom model, visit our docs on `using open models and azure models <./open_models.html>`_.
- If you want to use a custom model, visit our docs on `using open models and azure models <open_models.md>`_.

- To set API key on windows check the `Windows README <./windows_readme_link.html>`_.
- To set API key on Windows check the `Windows README <windows_readme_link.rst>`_.

Building with ``gpt-engineer``
==============================
@@ -60,7 +60,7 @@ Improve Existing Code

$ gpte projects/my-old-project -i

By running ``gpt-engineer`` you agree to our `terms <./terms_link.html>`_.
By running ``gpt-engineer`` you agree to our `terms <terms_link.rst>`_.

To **run in the browser** you can simply:
