diff --git a/docs/blog/act-via-code.mdx b/docs/blog/act-via-code.mdx
index ff00405d8..06e5772a9 100644
--- a/docs/blog/act-via-code.mdx
+++ b/docs/blog/act-via-code.mdx
@@ -2,19 +2,22 @@
title: "Act via Code"
icon: "code"
iconType: "solid"
-description: "The path to advanced code manipulation agents"
+description: "The path to fully-automated software engineering"
---
-
+
-Two and a half years since the launch of the GPT-3 API, code assistants have emerged as potentially the premier use case of LLMs. The rapid adoption of AI-powered IDEs and prototype builders isn't surprising — code is structured, deterministic, and rich with patterns, making it an ideal domain for machine learning. Experienced developers working with tools like Cursor (myself included) can tell that the field of software engineering is about to go through rapid change.
+Two and a half years since the launch of the GPT-3 API, code assistants have emerged as potentially the premier use case of LLMs. The rapid adoption of AI-powered IDEs and prototype builders isn't surprising: code is structured, deterministic, and rich with patterns, making it an ideal domain for machine learning. Developers actively working with tools like Cursor (myself included) have an exhilarating yet uncertain sense that the field of software engineering is approaching an inflection point.
-Yet there's a striking gap between understanding and action. Today's AI agents can analyze enterprise codebases and propose sophisticated improvements—eliminating tech debt, untangling dependencies, improving modularity. But ask them to actually implement these changes across millions of lines of code, and they hit a wall. Their ceiling isn't intelligence—it's the ability to safely and reliably execute large-scale modifications on real, enterprise codebases.
+Yet there's a striking gap between understanding and action for today's code assistants. When provided proper context, frontier LLMs can analyze massive enterprise codebases and propose practical paths towards sophisticated, large-scale improvements. But implementing changes that impact more than a small set of files with modern AI assistants is fundamentally infeasible. The good news is that for focused, file-level changes, we've found real success: AI-powered IDEs ([Windsurf](https://codeium.com/windsurf), [Cursor](https://www.cursor.com/)) are transforming how developers write and review code, while chat-based assistants are revolutionizing how we bootstrap and prototype new applications (via tools like [v0](https://v0.dev/), [lovable.dev](https://lovable.dev/), and [bolt.new](https://bolt.new/)).
-The bottleneck isn't intelligence — it's tooling. By giving AI models the ability to write and execute code that modifies code, we're about to unlock an entire class of tasks that agents already understand but can't yet perform. Code execution environments represent the most expressive tool we could offer an agent—enabling composition, abstraction, and systematic manipulation of complex systems. When paired with ever-improving language models, this will unlock another step function improvement in AI capabilities.
+However, there's a whole class of critical engineering tasks that remain out of reach - tasks that are fundamentally programmatic and deal with codebase structure at scale. A significant amount of effort on modern engineering teams is directed towards eliminating tech debt, managing large-scale migrations, analyzing dependency graphs, enforcing type coverage across the codebase, and similar tasks that require a global view of a codebase. Today's AI assistants can fully understand these challenges and even propose solutions, but they lack the mechanisms to actually implement them. The intelligence is there, but it's trapped in your IDE's text completion window.
+
+
+The bottleneck isn't intelligence — it's tooling. The solution requires letting AI systems programmatically interact with codebases and software systems through code execution environments. Code execution environments represent the most expressive tool we could offer an agent—enabling composition, abstraction, and systematic manipulation of complex systems. By combining code execution environments with custom APIs that correspond to powerful large-scale operations, we can unlock a new set of tasks in which agents can be significant contributors. When paired with ever-improving foundation models, this will lead to a step function improvement for code assistants, enabling their application in an entirely new set of valuable tasks.
## Beating Minecraft with Code Execution
@@ -44,7 +47,7 @@ async function chopSpruceLogs(bot) {
}
```
-This approach transformed the agent's capabilities. Rather than being constrained to atomic actions like `equipItem(...)`, it could create higher-level operations like [`craftShieldWithFurnace()`](https://github.com/MineDojo/Voyager/blob/main/skill_library/trial2/skill/code/craftShieldWithFurnace.js) through composing JS APIs. Furthermore, Wang et al. implemented a memory mechanism, in which successful "action programs" could later be recalled, copied, and built upon, effectively enabling the agent to accumulate experience.
+This approach transformed the agent's capabilities. Rather than being constrained to atomic actions like `equipItem(...)` (this would be typical of "traditional" agent algorithms, such as ReAct), it could create higher-level operations like [craftShieldWithFurnace()](https://github.com/MineDojo/Voyager/blob/main/skill_library/trial2/skill/code/craftShieldWithFurnace.js) through composing the atomic APIs. Furthermore, Wang et al. implemented a memory mechanism, in which these successful "action programs" could later be recalled, copied, and built upon, effectively enabling the agent to accumulate experience.
@@ -56,9 +59,26 @@ As the Voyager authors noted:
## Code is an Ideal Action Space
-The implications of code as an action space extend far beyond gaming. This architectural insight — letting AI act through code rather than atomic commands — will lead to a step change in the capabilities of AI systems. Nowhere is this more apparent than in software engineering, where agents already understand complex transformations but lack the tools to execute them effectively.
+What these authors demonstrated is a fundamental insight that extends far beyond gaming. Letting AI act through code rather than atomic commands will lead to a step change in the capabilities of AI systems. Nowhere is this more apparent than in software engineering, where agents already understand complex transformations but lack the tools to execute them effectively.
+
+Today's productionized code assistants operate through an interface where they can directly read/write to text files and perform other bespoke activities, like searching through file embeddings or running terminal commands.
+
+In the act-via-code paradigm, all of these actions are expressed by writing and executing code, as in the following:
+
+```python
+# Implement `grep` via for loops and if statements
+for function in codebase.functions:
+    if 'Page' in function.name:
+
+        # Implement systematic actions, like moving things around, through an API
+        function.move_to_file('/pages/' + function.name + '.tsx')
+```
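To make the contrast with atomic tool calls concrete, here is a minimal, self-contained sketch of the same loop running against a toy in-memory codebase. `ToyFunction` and `ToyCodebase` are hypothetical stand-ins invented for illustration; they are not the Codegen API, and a real `move_to_file` would also have to update imports and references.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyFunction:
    """Hypothetical stand-in for a function symbol (not the Codegen API)."""
    name: str
    filepath: str

    def move_to_file(self, new_path: str) -> None:
        # Toy behavior only: record the new location.
        # A real implementation would also rewrite imports and usages.
        self.filepath = new_path


@dataclass
class ToyCodebase:
    """Hypothetical stand-in for a parsed codebase."""
    functions: List[ToyFunction] = field(default_factory=list)


codebase = ToyCodebase(
    functions=[
        ToyFunction("HomePage", "/src/app.tsx"),
        ToyFunction("formatDate", "/src/utils.ts"),
        ToyFunction("SettingsPage", "/src/app.tsx"),
    ]
)

# The agent's "action" is an ordinary program: filter, then act in bulk.
for function in codebase.functions:
    if "Page" in function.name:
        function.move_to_file("/pages/" + function.name + ".tsx")

moved = {f.name: f.filepath for f in codebase.functions}
print(moved)
```

The point of the sketch is the shape of the action, not the toy classes: one short program replaces what would otherwise be a separate search call and a separate edit call per matching function.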
+
+Provided a sufficiently comprehensive set of APIs, this paradigm has many clear advantages:
-When an agent writes code, it gains several critical advantages over traditional atomic tools:
+- **API-Driven Extensibility**: Any operation that can be expressed through an API becomes accessible to the agent. This means the scope of tasks an agent can handle grows with our ability to create clean APIs for complex operations.
+
+- **Programmatic Efficiency**: Many agent tasks involve systematic operations across large codebases. Expressing these as programs rather than individual commands dramatically reduces computational overhead and allows for batch operations.
- **Composability**: Agents can build their own tools by combining simpler operations. This aligns perfectly with LLMs' demonstrated ability to compose and interpolate between examples to create novel solutions.
@@ -68,17 +88,29 @@ When an agent writes code, it gains several critical advantages over traditional
- **Natural Collaboration**: Programs are a shared language between humans and agents. Code explicitly encodes reasoning in a reviewable format, making actions transparent, debuggable, and easily re-runnable.
-## For Software Engineering
+## Code Manipulation Programs
-Software engineering tasks are inherently programmatic and graph-based — dependency analysis, refactors, control flow analysis, etc. Yet today's AI agents interface with code primarily through string manipulation, missing the rich structure that developers and their tools rely on. By giving agents APIs that operate on the codebase's underlying graph structure rather than raw text, we can unlock a new tier of capabilities. Imagine agents that can rapidly traverse dependency trees, analyze control flow, and perform complex refactors while maintaining perfect awareness of the codebase's structure.
+For software engineering, we believe the path forward is clear: agents need a framework that matches how developers think about and manipulate code. While decades of static analysis work gives us a strong foundation, traditional code modification frameworks weren't designed with AI-human collaboration in mind - they expose low-level APIs that don't match how developers (or AI systems) think about code changes.
-Consider how a developer thinks about refactoring: it's rarely about direct text manipulation. Instead, we think in terms of high-level operations: "move this function," "rename this variable everywhere," "split this module." These operations can be encoded into a powerful Python API:
+We're building a framework with high-level APIs that correspond to how engineers actually think about code modifications. The APIs are clean and intuitive, following clear [principles](/docs/principles) that eliminate sharp edges and handle edge cases automatically. Most importantly, the framework encodes rich structural understanding of code. Consider this example:
```python
-# simple access to high-level code constructs
+# Access to high-level semantic operations
for component in codebase.jsx_components:
- # access detailed code structure and relations
+ # Rich structural analysis built-in
if len(component.usages) == 0:
- # powerful edit APIs that handle edge cases
+ # Systematic operations across the codebase
component.rename(component.name + 'Page')
```
+
+This isn't just string manipulation - the framework understands React component relationships, tracks usage patterns, and can perform complex refactors while maintaining correctness. By keeping the codebase representation in memory, we can provide lightning-fast operations for both analysis and systematic edits.
+
+The documentation for such a framework isn't just API reference - it's education for advanced intelligence about how to successfully manipulate code at scale. We're building for a future where AI systems are significant contributors to codebases, and they need to understand not just the "how" but the "why" behind code manipulation patterns.
+
+Crucially, we believe these APIs will extend beyond the codebase itself into the broader software engineering ecosystem. When agents can seamlessly interact with tools like Datadog, AWS, and other development platforms through the same clean interfaces, we'll take a major step toward [autonomous software engineering](/about#our-mission). The highest leverage move isn't just giving agents the ability to modify code - it's giving them programmatic access to the entire software development lifecycle.
+
+## Codegen is now OSS
+
+We're excited to release [Codegen](https://github.com/codegen-sh/codegen-sdk) as open source under the [Apache 2.0](https://github.com/codegen-sh/codegen-sdk?tab=Apache-2.0-1-ov-file) license and to build out this vision with the broader developer community. [Get started with Codegen](/introduction/getting-started) today, or join us in our [Slack community](https://community.codegen.com) if you have feedback or questions about a use case!
+
+Jay Hack, Founder
\ No newline at end of file
diff --git a/docs/blog/posts.mdx b/docs/blog/posts.mdx
index ed1c72110..2d1b27ed1 100644
--- a/docs/blog/posts.mdx
+++ b/docs/blog/posts.mdx
@@ -11,8 +11,8 @@ iconType: "solid"
Why code as an action space will lead to a step function improvement in agent capabilities.
diff --git a/docs/building-with-codegen/dot-codegen.mdx b/docs/building-with-codegen/dot-codegen.mdx
new file mode 100644
index 000000000..c4a4ca7a5
--- /dev/null
+++ b/docs/building-with-codegen/dot-codegen.mdx
@@ -0,0 +1,114 @@
+---
+title: "The .codegen Directory"
+sidebarTitle: ".codegen Directory"
+icon: "folder"
+iconType: "solid"
+---
+
+The `.codegen` directory contains your project's Codegen configuration, codemods, and supporting files. It's automatically created when you run `codegen init`.
+
+## Directory Structure
+
+```bash
+.codegen/
+├── config.toml # Project configuration
+├── codemods/ # Your codemod implementations
+├── jupyter/ # Jupyter notebooks for exploration
+├── docs/ # API documentation
+├── examples/ # Example code
+└── prompts/ # AI system prompts
+```
+
+## Initialization
+
+The directory is created and managed using the `codegen init` command:
+
+```bash
+codegen init [--fetch-docs] [--repo-name NAME] [--organization-name ORG]
+```
+
+
+The `--fetch-docs` flag downloads API documentation and examples specific to your project's programming language.
+
+
+### Configuration
+
+The `config.toml` file stores your project settings:
+
+```toml
+organization_name = "your-org"
+repo_name = "your-repo"
+programming_language = "python" # or other supported language
+```
+
+This configuration is used by Codegen to provide language-specific features and proper repository context.
+
+## Git Integration
+
+Codegen automatically adds appropriate entries to your `.gitignore`:
+
+```gitignore
+# Codegen
+.codegen/prompts/
+.codegen/docs/
+.codegen/examples/
+```
+
+
+While prompts, docs, and examples are ignored, your codemods in `.codegen/codemods/` are tracked in Git.
+
+
+## Working with Codemods
+
+The `codemods/` directory is where your transformation functions live. You can create new codemods using:
+
+```bash
+codegen create my-codemod [--description "what it does"]
+```
+
+This will:
+1. Create a new file in `.codegen/codemods/`
+2. Generate a system prompt in `.codegen/prompts/` (if using `--description`)
+3. Set up the necessary imports and decorators
+
+
+Use `codegen list` to see all codemods in your project.
+
+
+## Jupyter Integration
+
+The `jupyter/` directory contains notebooks for interactive development:
+
+```python
+from codegen import Codebase
+
+# Initialize codebase
+codebase = Codebase('../../')
+
+# Print stats
+print(f"📚 Total Files: {len(codebase.files)}")
+print(f"⚡ Total Functions: {len(codebase.functions)}")
+```
+
+
+A default notebook is created during initialization to help you explore your codebase.
+
+
+## Next Steps
+
+After initializing your `.codegen` directory:
+
+1. Create your first codemod:
+```bash
+codegen create my-codemod -d "describe what you want to do"
+```
+
+2. Run it:
+```bash
+codegen run my-codemod
+```
+
+3. Deploy it for team use:
+```bash
+codegen deploy my-codemod
+```
diff --git a/docs/building-with-codegen/files-and-directories.mdx b/docs/building-with-codegen/files-and-directories.mdx
index 914be1038..632b5138b 100644
--- a/docs/building-with-codegen/files-and-directories.mdx
+++ b/docs/building-with-codegen/files-and-directories.mdx
@@ -7,8 +7,8 @@ iconType: "solid"
Codegen provides two primary abstractions for working with your codebase's file structure:
-- [`File`](../api-reference/core/File)
-- [`Directory`](../api-reference/core/Directory)
+- [File](/api-reference/core/File)
+- [Directory](/api-reference/core/Directory)
Both of these expose a rich API for accessing and manipulating their contents.
@@ -16,10 +16,10 @@ This guide explains how to effectively use these classes to manage your codebase
## Accessing Files and Directories
-You typically access files from the [`Codebase`](/api-reference/core/Codebase) object with two APIs:
+You typically access files from the [codebase](/api-reference/core/Codebase) object with two APIs:
-- [`Codebase.get_file(...)`](../api-reference/core/Codebase#get_file) - Get a file by its path
-- [`Codebase.files`](../api-reference/core/Codebase#files) - Enables iteration over all files in the codebase
+- [codebase.get_file(...)](/api-reference/core/Codebase#get_file) - Get a file by its path
+- [codebase.files](/api-reference/core/Codebase#files) - Enables iteration over all files in the codebase
```python
# Get a file from the codebase
@@ -33,7 +33,7 @@ for file in codebase.files:
exists = codebase.has_file("path/to/file.py")
```
-These APIs are similar for [`Directory`](../api-reference/core/Directory), which provides similar methods for accessing files and subdirectories.
+These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories.
```python
# Get a directory
@@ -74,7 +74,7 @@ docs = codebase.files(extensions=[".md", ".mdx"])
config_files = codebase.files(extensions=[".json", ".yaml", ".toml"])
```
-These APIs are similar for [`Directory`](../api-reference/core/Directory), which provides similar methods for accessing files and subdirectories.
+These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories.
## Raw Content and Metadata
@@ -102,10 +102,10 @@ Files and Directories provide several APIs for accessing and iterating over thei
See, for example:
-- `.functions` ([`File`](../api-reference/core/File#functions) / [`Directory`](../api-reference/core/Directory#functions)) - All [`Functions`](../api-reference/core/Function) in the file/directory
-- `.classes` ([`File`](../api-reference/core/File#classes) / [`Directory`](../api-reference/core/Directory#classes)) - All [`Classes`](../api-reference/core/Class) in the file/directory
-- `.imports` ([`File`](../api-reference/core/File#imports) / [`Directory`](../api-reference/core/Directory#imports)) - All [`Imports`](../api-reference/core/Import) in the file/directory
-- [`File.code_block`](../api-reference/core/File#code-block) - The top-level [`CodeBlock`](../api-reference/core/CodeBlock) containing the file's statements
+- `.functions` ([File](/api-reference/core/File#functions) / [Directory](/api-reference/core/Directory#functions)) - All [Functions](../api-reference/core/Function) in the file/directory
+- `.classes` ([File](/api-reference/core/File#classes) / [Directory](/api-reference/core/Directory#classes)) - All [Classes](../api-reference/core/Class) in the file/directory
+- `.imports` ([File](/api-reference/core/File#imports) / [Directory](/api-reference/core/Directory#imports)) - All [Imports](../api-reference/core/Import) in the file/directory
+
```python
# Get all functions in a file
@@ -120,10 +120,10 @@ for cls in file.classes:
print(f"Methods: {[m.name for m in cls.methods]}")
print(f"Attributes: {[a.name for a in cls.attributes]}")
-# Get imports
-for import_stmt in file.import_statements:
- print(f"Import from: {import_stmt.module}")
- print(f"Imported symbols: {[s.name for s in import_stmt.symbols]}")
+# Get imports (can also do `file.import_statements`)
+for imp in file.imports:
+    print(f"Import from: {imp.module}")
+    print(f"Imported symbol: {imp.imported_symbol.name}")
# Get specific symbols
main_function = file.get_function("main")
diff --git a/docs/building-with-codegen/parsing-codebases.mdx b/docs/building-with-codegen/parsing-codebases.mdx
index 72d19dca8..f1112ca0f 100644
--- a/docs/building-with-codegen/parsing-codebases.mdx
+++ b/docs/building-with-codegen/parsing-codebases.mdx
@@ -1,6 +1,6 @@
---
title: "Parsing Codebases"
-sidebarTitle: "Initialization"
+sidebarTitle: "Parsing Codebases"
icon: "power-off"
iconType: "solid"
---
diff --git a/docs/building-with-codegen/reusable-codemods.mdx b/docs/building-with-codegen/reusable-codemods.mdx
new file mode 100644
index 000000000..549d58df9
--- /dev/null
+++ b/docs/building-with-codegen/reusable-codemods.mdx
@@ -0,0 +1,120 @@
+---
+title: "Reusable Codemods"
+sidebarTitle: "Reusable Codemods"
+icon: "arrows-rotate"
+iconType: "solid"
+---
+
+Codegen enables you to create reusable code transformations using Python functions decorated with `@codegen.function`. These codemods can be shared, versioned, and run by your team.
+
+## Creating Codemods
+
+The easiest way to create a new codemod is using the CLI [create](/cli/create) command:
+
+```bash
+codegen create rename-function
+```
+
+This creates a new codemod in your `.codegen/codemods` directory:
+
+```python
+import codegen
+from codegen import Codebase
+
+@codegen.function("rename-function")
+def run(codebase: Codebase):
+    """Add a description of what this codemod does."""
+    # Add your code here
+    pass
+```
+
+
+ Codemods are stored in `.codegen/codemods/name/name.py` and are tracked in Git for easy sharing.
+
+
+### AI-Powered Generation with `-d`
+
+You can use AI to generate an initial implementation by providing a description:
+
+```bash
+codegen create rename-function -d "Rename the getUserData function to fetchUserProfile"
+```
+
+This will:
+1. Generate an implementation based on your description
+2. Create a custom system prompt that you can provide to an IDE chat assistant (learn more about [working with AI](/introduction/work-with-ai))
+3. Place both files in the codemod directory
+
+## Running Codemods
+
+Once created, run your codemod using:
+
+```bash
+codegen run rename-function
+```
+
+The execution flow:
+1. Codegen parses your codebase into a graph representation
+2. Your codemod function is executed against this graph
+3. Changes are tracked and applied to your filesystem
+4. A diff preview shows what changed
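The flow above can be sketched with a toy in-memory "codebase". Everything here is an illustrative assumption: a plain `dict` of file contents stands in for the parsed graph (the parsing step itself is not modeled), `rename_codemod` and `run_codemod` are hypothetical names, and the diff is produced with Python's standard `difflib` rather than Codegen's own machinery.

```python
import difflib
from typing import Callable, Dict, List, Optional


def rename_codemod(files: Dict[str, str]) -> None:
    """Hypothetical codemod: rename getUserData to fetchUserProfile everywhere."""
    for path in list(files):
        files[path] = files[path].replace("getUserData", "fetchUserProfile")


def run_codemod(
    files: Dict[str, str],
    codemod: Callable[[Dict[str, str]], None],
    diff_preview: Optional[int] = None,
) -> List[str]:
    """Run `codemod` on `files` in place and return unified-diff lines."""
    # Snapshot the state before the codemod runs so changes can be tracked.
    before = dict(files)
    # Execute the codemod; it mutates `files` directly.
    codemod(files)
    # Compare each file against its snapshot and collect the diff lines.
    diff: List[str] = []
    for path in sorted(files):
        diff.extend(
            difflib.unified_diff(
                before.get(path, "").splitlines(),
                files[path].splitlines(),
                fromfile=f"a/{path}",
                tofile=f"b/{path}",
                lineterm="",
            )
        )
    # Mimic a preview limit by truncating to the first N lines, if requested.
    return diff if diff_preview is None else diff[:diff_preview]


files = {
    "api.py": "def getUserData():\n    return {}\n",
    "main.py": "from api import getUserData\nprint(getUserData())\n",
}
preview = run_codemod(files, rename_codemod, diff_preview=50)
print("\n".join(preview))
```

Keeping a snapshot and diffing afterwards is only one way to track changes; it is used here because it keeps the codemod itself free of any bookkeeping.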
+
+
+## Codemod Structure
+
+A codemod consists of three main parts:
+
+1. The `@codegen.function` decorator that names your codemod
+2. A `run` function that takes a `Codebase` parameter
+3. Your transformation logic using the Codebase API
+
+```python
+import codegen
+from codegen import Codebase
+
+@codegen.function("update-imports")
+def run(codebase: Codebase):
+    """Update import statements to use new package names."""
+    for file in codebase.files:
+        for imp in file.imports:
+            if imp.module == "old_package":
+                imp.rename("new_package")
+    codebase.commit()
+```
+
+## Arguments
+
+Codemods can accept arguments using Pydantic models:
+
+```python
+import codegen
+from codegen import Codebase
+from pydantic import BaseModel
+
+class RenameArgs(BaseModel):
+    old_name: str
+    new_name: str
+
+@codegen.function("rename-function")
+def run(codebase: Codebase, arguments: RenameArgs):
+    """Rename a function across the codebase."""
+    old_func = codebase.get_function(arguments.old_name)
+    if old_func:
+        old_func.rename(arguments.new_name)
+    codebase.commit()
+```
+
+Run it with:
+```bash
+codegen run rename-function --arguments '{"old_name": "getUserData", "new_name": "fetchUserProfile"}'
+```
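As a rough illustration of what happens to that JSON string, the sketch below decodes it and validates it against the same `RenameArgs` model. How the CLI performs this step internally is not shown in this guide, so treat the decoding code as an assumption; only `RenameArgs` comes from the example above.

```python
import json

from pydantic import BaseModel, ValidationError


class RenameArgs(BaseModel):
    old_name: str
    new_name: str


# The same payload passed on the command line via --arguments.
raw = '{"old_name": "getUserData", "new_name": "fetchUserProfile"}'
args = RenameArgs(**json.loads(raw))
print(args.old_name, "->", args.new_name)

# A payload missing a required field is rejected by the model itself,
# so the codemod body never sees incomplete arguments.
try:
    RenameArgs(**json.loads('{"old_name": "getUserData"}'))
    rejected = False
except ValidationError:
    rejected = True
print("rejected:", rejected)
```

Declaring arguments as a model means a typo in a key name fails fast with a validation error instead of surfacing later as a missing attribute inside the codemod.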
+
+## Directory Structure
+
+Your codemods live in a dedicated directory structure:
+
+```
+.codegen/
+└── codemods/
+    └── rename_function/
+        ├── rename_function.py         # The codemod implementation
+        └── rename_function_prompt.md  # System prompt (if using AI)
+```
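The directory and file names are derived from the codemod name. The sketch below follows the same snake-case rule that the `create` command's `get_target_path` helper applies (lowercase, with hyphens and spaces replaced by underscores); `codemod_path` itself is a hypothetical name used only for illustration.

```python
from pathlib import Path


def codemod_path(name: str, base_dir: Path = Path(".")) -> Path:
    """Return where a codemod called `name` would live under `base_dir`.

    Hypothetical helper for illustration only.
    """
    # Lowercase the name and replace hyphens and spaces with underscores.
    name_snake = name.lower().replace("-", "_").replace(" ", "_")
    # Each codemod gets its own directory containing a same-named .py file.
    return base_dir / ".codegen" / "codemods" / name_snake / f"{name_snake}.py"


print(codemod_path("rename-function").as_posix())
# .codegen/codemods/rename_function/rename_function.py
```

This is why `codegen create rename-function` produces a `rename_function/` directory even though the codemod is registered and run under its hyphenated name.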
\ No newline at end of file
diff --git a/docs/cli/create.mdx b/docs/cli/create.mdx
index 58b4eb508..5b7b5dd55 100644
--- a/docs/cli/create.mdx
+++ b/docs/cli/create.mdx
@@ -5,16 +5,12 @@ icon: "plus"
iconType: "solid"
---
-The `create` command generates new codemods.
+The `create` command generates a new codemod function with the necessary boilerplate.
```bash
-codegen create organize-types --description "put all types in a single file"
+codegen create rename-function
```
-
-If you provide a `--description`, The Codegen CLI will use AI to generate a first implementation.
-
-
## Usage
```bash
@@ -23,73 +19,55 @@ codegen create NAME [OPTIONS]
## Arguments
-- `NAME`: The name/label for your codemod
+- `NAME`: The name of the codemod to create (e.g., "rename-function")
## Options
-- `--description`, `-d`: Description of what the codemod should do. When provided, uses AI to generate an implementation.
-- `--overwrite`: Overwrites the codemod if it already exists at the target path.
+- `--description`, `-d`: A description of what the codemod should do. This will be used to generate an AI-powered implementation.
-## Examples
+## Generated Files
-Create a basic codemod in the current directory:
-```bash
-codegen create update-imports
-```
+When you run `codegen create rename-function`, it creates:
-Create a codemod with AI assistance:
-```bash
-codegen create rename-function --description "Rename the getUserData function to fetchUserProfile across the codebase"
```
-
-Create a codemod in a specific directory, overwriting if it exists:
-```bash
-codegen create api-migration --overwrite
+.codegen/
+└── codemods/
+    └── rename_function/
+        ├── rename_function.py         # The codemod implementation
+        └── rename_function_prompt.md  # System prompt (if --description used)
```
-## Generated Files
+The generated codemod will have this structure:
-When you run `create`, new files are added to your `.codegen` directory structure:
+```python
+import codegen
+from codegen import Codebase
-```bash
-.codegen/
-├── config.toml
-├── codemods/
-│ └── rename_function.py # ← Generated by create
-├── docs/
-│ ├── api/
-│ ├── examples/
-│ └── tutorials/
-└── prompts/
- └── rename-function.md # ← Generated by create with --description
+@codegen.function("rename-function")
+def run(codebase: Codebase):
+    """Add a description of what this codemod does."""
+    # Add your code here
+    pass
```
-The command creates:
-1. A codemod implementation file in `.codegen/codemods/` (automatically converted to snake_case)
-2. A system prompt file in `.codegen/prompts/` (when using `--description`)
-
-
-The `prompts/` directory is automatically added to `.gitignore`, while your codemod in `codemods/` is tracked in Git.
-
-
-## AI-Assisted Generation
+## Examples
-When you provide a `--description`, Codegen:
-1. Analyzes your request using AI (~30 seconds)
-2. Generates a codemod implementation
-3. Creates a system prompt file for future AI interactions
-4. Wraps the code with necessary CLI decorators
+Create a basic codemod:
+```bash
+codegen create rename-function
+```
-The generated codemod is ready to run but can be customized to fit your specific needs.
+Create with an AI-powered implementation:
+```bash
+codegen create rename-function -d "Rename the getUserData function to fetchUserProfile"
+```
## Next Steps
After creating a codemod:
-1. Review and edit the implementation to customize its behavior
-2. Run it with [`codegen run`](/cli/run):
-```bash
-codegen run NAME
-```
+1. Edit the implementation in the generated `.py` file
+2. Test it with `codegen run rename-function`
+3. Deploy it for team use with `codegen deploy rename-function`
## Common Issues
diff --git a/docs/cli/run.mdx b/docs/cli/run.mdx
index ed83a4943..3adce12a4 100644
--- a/docs/cli/run.mdx
+++ b/docs/cli/run.mdx
@@ -5,7 +5,7 @@ icon: "play"
iconType: "solid"
---
-The `run` command executes a codemod and manages its output, whether viewing changes in the web UI, applying them locally, or creating pull requests.
+The `run` command executes a codemod against your local codebase, showing you the changes and applying them to your filesystem.
```bash
codegen run rename-function
@@ -23,23 +23,16 @@ codegen run LABEL [OPTIONS]
## Options
-- `--web`: Automatically open the diff in the web app
-- `--apply-local`: Apply the generated changes to your local filesystem
- `--diff-preview N`: Show a preview of the first N lines of the diff
- `--arguments JSON`: Pass arguments to the codemod as a JSON string (required if the codemod expects arguments)
## Examples
-Run a codemod and view results in terminal:
+Run a codemod:
```bash
codegen run rename-function
```
-Run and automatically apply changes locally:
-```bash
-codegen run rename-function --apply-local
-```
-
Run with a diff preview limited to 50 lines:
```bash
codegen run rename-function --diff-preview 50
@@ -52,53 +45,21 @@ codegen run rename-function --arguments '{"old_name": "getUserData", "new_name":
## Output
-The command provides:
-1. A web link to view changes in the Codegen UI
-2. Run details and logs
-3. A diff preview (if requested)
-4. Instructions for applying changes locally
+The command will:
+1. Parse your codebase
+2. Run the codemod
+3. Show a diff preview (if requested)
+4. Apply changes to your filesystem
-## Applying Changes
+## Execution Flow
-When using `--apply-local`, Codegen will:
-1. Generate a patch from the codemod's changes
-2. Apply it to your local filesystem
-3. Provide git commands to commit the changes
+When you run a codemod:
+1. Codegen parses your entire codebase into a graph representation
+2. The codemod function is executed against this graph
+3. Any changes made by the codemod are tracked
+4. Changes are automatically applied to your local files
+5. A summary of changes is displayed
-Your working directory must be clean (no uncommitted changes) when using `--apply-local`. If you have uncommitted changes, the command will provide instructions for resolving the situation.
+The codebase parsing step may take a few moments for larger codebases. Learn more in [How it Works](/introduction/how-it-works.mdx).
-
-## Common Issues
-
-### Uncommitted Changes
-If `--apply-local` fails due to uncommitted changes, you have two options:
-
-1. Save your changes:
-```bash
-git status # Check working directory
-git add . # Stage changes
-git commit -m 'msg' # Commit changes
-codegen run ... # Run command again
-```
-
-2. Discard changes (⚠️ destructive):
-```bash
-git reset --hard HEAD # Discard uncommitted changes
-git clean -fd # Remove untracked files
-codegen run ... # Run command again
-```
-
-### Multiple Codemods
-If multiple codemods share the same name, specify the exact file path:
-```bash
-codegen run ./path/to/specific/codemod.py
-```
-
-## Next Steps
-
-After running a codemod:
-1. Review the changes in the web UI or diff preview
-2. Apply changes locally with `--apply-local`
-3. Commit the changes to your repository
-4. Create a pull request if needed
diff --git a/docs/images/mine-amethyst.png b/docs/images/mine-amethyst.png
new file mode 100644
index 000000000..1211f96f7
Binary files /dev/null and b/docs/images/mine-amethyst.png differ
diff --git a/docs/mint.json b/docs/mint.json
index f4f2784ff..df6519df0 100644
--- a/docs/mint.json
+++ b/docs/mint.json
@@ -91,6 +91,8 @@
"pages": [
"building-with-codegen/at-a-glance",
"building-with-codegen/parsing-codebases",
+ "building-with-codegen/reusable-codemods",
+ "building-with-codegen/dot-codegen",
"building-with-codegen/language-support",
"building-with-codegen/commit-and-reset",
"building-with-codegen/git-operations",
diff --git a/src/codegen/cli/commands/create/main.py b/src/codegen/cli/commands/create/main.py
index d6e1e5416..3fafe819e 100644
--- a/src/codegen/cli/commands/create/main.py
+++ b/src/codegen/cli/commands/create/main.py
@@ -5,7 +5,6 @@
from codegen.cli.api.client import RestAPI
from codegen.cli.auth.constants import PROMPTS_DIR
-from codegen.cli.auth.decorators import requires_auth
from codegen.cli.auth.session import CodegenSession
from codegen.cli.codemod.convert import convert_to_cli
from codegen.cli.errors import ServerError
@@ -30,38 +29,53 @@ def get_prompts_dir() -> Path:
 def get_target_path(name: str, path: Path) -> Path:
-    """Get the target path for the new function file."""
+    """Get the target path for the new function file.
+
+    Creates a directory structure like:
+        .codegen/codemods/function_name/function_name.py
+    """
     # Convert name to snake case for filename
     name_snake = name.lower().replace("-", "_").replace(" ", "_")
+    # If path points to a specific file, use its parent directory
     if path.suffix == ".py":
-        # If path is a file, use it directly
-        return path
+        base_dir = path.parent
     else:
-        # If path is a directory, create name_snake.py in it
-        return path / f"{name_snake}.py"
+        base_dir = path
+
+    # Create path within .codegen/codemods
+    codemods_dir = base_dir / ".codegen" / "codemods"
+    function_dir = codemods_dir / name_snake
+    return function_dir / f"{name_snake}.py"
def make_relative(path: Path) -> str:
"""Convert a path to a relative path from cwd, handling non-existent paths."""
- # If it's just a filename in the current directory, return it directly
- if str(path.parent) == ".":
- return f"./{path.name}"
-
try:
return f"./{path.relative_to(Path.cwd())}"
except ValueError:
- # For paths in subdirectories, try to make the parent relative
- try:
- parent_rel = path.parent.relative_to(Path.cwd())
- return f"./{parent_rel}/{path.name}"
- except ValueError:
- # If all else fails, just return the filename
- return f"./{path.name}"
+ # Path is outside cwd: fall back to the portion starting at .codegen, else the bare filename
+ parts = path.parts
+ if ".codegen" in parts:
+ idx = parts.index(".codegen")
+ return "./" + str(Path(*parts[idx:]))
+ return f"./{path.name}"
+
+
+def get_default_code(name: str) -> str:
+ """Get the default function code without using the API."""
+ return f'''import codegen
+from codegen import Codebase
+
+@codegen.function("{name}")
+def run(codebase: Codebase):
+ """Add a description of what this codemod does."""
+ # Add your code here
+ pass
+'''
@click.command(name="create")
-@requires_auth
@requires_init
@click.argument("name", type=str)
@click.argument("path", type=click.Path(path_type=Path), default=Path.cwd())
@@ -82,44 +96,38 @@ def create_command(session: CodegenSession, name: str, path: Path, description:
pretty_print_error(f"File already exists at {format_path(rel_path)}\n\nTo overwrite the file:\n{format_command(f'codegen create {name} {rel_path} --overwrite')}")
return
- if description:
- status_message = "Generating function (using LLM, this will take ~30s)"
- else:
- status_message = "Setting up function"
-
- rich.print("") # Add a newline before the spinner
- with create_spinner(status_message) as status:
- try:
- # Get code from API
- response = RestAPI(session.token).create(name=name, query=description if description else None)
+ rich.print("") # Add a newline before output
- # Convert the code to include the decorator
- code = convert_to_cli(response.code, session.config.programming_language or ProgrammingLanguage.PYTHON, name)
+ try:
+ if description:
+ # Use API to generate implementation
+ with create_spinner("Generating function (using LLM, this will take ~30s)") as status:
+ response = RestAPI(session.token).create(name=name, query=description)
+ code = convert_to_cli(response.code, session.config.programming_language or ProgrammingLanguage.PYTHON, name)
- # Create the target directory if needed
- target_path.parent.mkdir(parents=True, exist_ok=True)
+ # Write the system prompt if provided
+ if response.context:
+ prompt_path = get_prompts_dir() / f"{name.lower().replace(' ', '-')}-system-prompt.md"
+ prompt_path.write_text(response.context)
+ else:
+ # Use default implementation
+ code = get_default_code(name)
- # Write the function code
- target_path.write_text(code)
+ # Create the target directory if needed
+ target_path.parent.mkdir(parents=True, exist_ok=True)
- # Write the system prompt to the prompts directory
- if response.context:
- prompt_path = get_prompts_dir() / f"{name.lower().replace(' ', '-')}-system-prompt.md"
- prompt_path.write_text(response.context)
+ # Write the function code
+ target_path.write_text(code)
- except ServerError as e:
- status.stop()
- raise click.ClickException(str(e))
- except ValueError as e:
- status.stop()
- raise click.ClickException(str(e))
+ except (ServerError, ValueError) as e:
+ raise click.ClickException(str(e))
# Success message
rich.print(f"\n✅ {'Overwrote' if overwrite and target_path.exists() else 'Created'} function '{name}'")
rich.print("")
rich.print("📁 Files Created:")
rich.print(f" [dim]Function:[/dim] {make_relative(target_path)}")
- if response.context:
+ if description and response.context:
rich.print(f" [dim]Prompt:[/dim] {make_relative(get_prompts_dir() / f'{name.lower().replace(" ", "-")}-system-prompt.md')}")
# Next steps
diff --git a/src/codegen/cli/commands/init/main.py b/src/codegen/cli/commands/init/main.py
index 50f2309b0..06cedf762 100644
--- a/src/codegen/cli/commands/init/main.py
+++ b/src/codegen/cli/commands/init/main.py
@@ -4,7 +4,6 @@
import rich
import rich_click as click
-import toml
from codegen.cli.auth.constants import CODEGEN_DIR
from codegen.cli.auth.session import CodegenSession
@@ -54,15 +53,7 @@ def init_command(repo_name: str | None = None, organization_name: str | None = N
codegen_dir, docs_dir, examples_dir = initialize_codegen(action, session=session, fetch_docs=fetch_docs)
# Print success message
- rich.print(f"✅ {action} complete")
-
- # Show repo info from config.toml
- config_path = codegen_dir / "config.toml"
- if config_path.exists():
- config = toml.load(config_path)
- rich.print(f" [dim]Organization:[/dim] {config.get('organization_name', 'unknown')}")
- rich.print(f" [dim]Repository:[/dim] {config.get('repo_name', 'unknown')}")
- rich.print("")
+ rich.print(f"✅ {action} complete\n")
rich.print(get_success_message(codegen_dir, docs_dir, examples_dir))
# Print next steps
diff --git a/src/codegen/cli/commands/init/render.py b/src/codegen/cli/commands/init/render.py
index e6a152bd8..27b02749a 100644
--- a/src/codegen/cli/commands/init/render.py
+++ b/src/codegen/cli/commands/init/render.py
@@ -3,7 +3,8 @@
def get_success_message(codegen_dir: Path, docs_dir: Path, examples_dir: Path) -> str:
"""Get the success message to display after initialization."""
- return """📁 Folders Created:
- [dim] Location:[/dim] .codegen
- [dim] Docs:[/dim] .codegen/docs
- [dim] Examples:[/dim] .codegen/examples"""
+ return """📁 .codegen configuration folder created:
+ [dim]config.toml[/dim] Project configuration
+ [dim]codemods/[/dim] Your codemod implementations
+ [dim]jupyter/[/dim] Notebooks for codebase exploration
+ [dim]prompts/[/dim] AI system prompts (gitignored)"""
diff --git a/src/codegen/cli/commands/run/main.py b/src/codegen/cli/commands/run/main.py
index 1b4f52e72..357f1f380 100644
--- a/src/codegen/cli/commands/run/main.py
+++ b/src/codegen/cli/commands/run/main.py
@@ -1,151 +1,46 @@
import json
-import webbrowser
-import rich
import rich_click as click
-from rich.panel import Panel
-from codegen.cli.api.client import RestAPI
-from codegen.cli.auth.decorators import requires_auth
from codegen.cli.auth.session import CodegenSession
-from codegen.cli.errors import ServerError
-from codegen.cli.git.patch import apply_patch
-from codegen.cli.rich.codeblocks import format_command
-from codegen.cli.rich.spinners import create_spinner
from codegen.cli.utils.codemod_manager import CodemodManager
from codegen.cli.utils.json_schema import validate_json
-from codegen.cli.utils.url import generate_webapp_url
from codegen.cli.workspace.decorators import requires_init
-def run_function(session: CodegenSession, function, web: bool = False, apply_local: bool = False, diff_preview: int | None = None):
- """Run a function and handle its output."""
- with create_spinner(f"Running {function.name}...") as status:
- try:
- run_output = RestAPI(session.token).run(
- function=function,
- )
-
- status.stop()
- rich.print(f"✅ Ran {function.name} successfully")
- if run_output.web_link:
- # Extract the run ID from the web link
- run_id = run_output.web_link.split("/run/")[1].split("/")[0]
- function_id = run_output.web_link.split("/codemod/")[1].split("/")[0]
-
- rich.print(" [dim]Web viewer:[/dim] [blue underline]" + run_output.web_link + "[/blue underline]")
- run_details_url = generate_webapp_url(f"functions/{function_id}/run/{run_id}")
- rich.print(f" [dim]Run details:[/dim] [blue underline]{run_details_url}[/blue underline]")
-
- if run_output.logs:
- rich.print("")
- panel = Panel(run_output.logs, title="[bold]Logs[/bold]", border_style="blue", padding=(1, 2), expand=False)
- rich.print(panel)
-
- if run_output.error:
- rich.print("")
- panel = Panel(run_output.error, title="[bold]Error[/bold]", border_style="red", padding=(1, 2), expand=False)
- rich.print(panel)
-
- if run_output.observation:
- # Only show diff preview if requested
- if diff_preview:
- rich.print("") # Add some spacing
-
- # Split and limit diff to requested number of lines
- diff_lines = run_output.observation.splitlines()
- truncated = len(diff_lines) > diff_preview
- limited_diff = "\n".join(diff_lines[:diff_preview])
-
- if truncated:
- if apply_local:
- limited_diff += "\n\n...\n\n[yellow]diff truncated to {diff_preview} lines, view the full change set in your local file system[/yellow]"
- else:
- limited_diff += (
- "\n\n...\n\n[yellow]diff truncated to {diff_preview} lines, view the full change set on your local file system after using run with `--apply-local`[/yellow]"
- )
-
- panel = Panel(limited_diff, title="[bold]Diff Preview[/bold]", border_style="blue", padding=(1, 2), expand=False)
- rich.print(panel)
-
- if not apply_local:
- rich.print("")
- rich.print("Apply changes locally:")
- rich.print(format_command(f"codegen run {function.name} --apply-local"))
- rich.print("Create a PR:")
- rich.print(format_command(f"codegen run {function.name} --create-pr"))
- else:
- rich.print("")
- rich.print("[yellow] No changes were produced by this codemod[/yellow]")
-
- if web and run_output.web_link:
- webbrowser.open_new(run_output.web_link)
-
- if apply_local and run_output.observation:
- try:
- apply_patch(session.git_repo, f"\n{run_output.observation}\n")
- rich.print("")
- rich.print("[green]✓ Changes have been applied to your local filesystem[/green]")
- rich.print("[yellow]→ Don't forget to commit your changes:[/yellow]")
- rich.print(format_command("git add ."))
- rich.print(format_command("git commit -m 'Applied codemod changes'"))
- except Exception as e:
- rich.print("")
- rich.print("[red]✗ Failed to apply changes locally[/red]")
- rich.print("\n[yellow]This usually happens when you have uncommitted changes.[/yellow]")
- rich.print("\nOption 1 - Save your changes:")
- rich.print(" 1. [blue]git status[/blue] (check your working directory)")
- rich.print(" 2. [blue]git add .[/blue] (stage your changes)")
- rich.print(" 3. [blue]git commit -m 'msg'[/blue] (commit your changes)")
- rich.print(" 4. Run this command again")
- rich.print("\nOption 2 - Discard your changes:")
- rich.print(" 1. [red]git reset --hard HEAD[/red] (⚠️ discards all uncommitted changes)")
- rich.print(" 2. [red]git clean -fd[/red] (⚠️ removes all untracked files)")
- rich.print(" 3. Run this command again\n")
- raise click.ClickException("Failed to apply patch to local filesystem")
-
- except ServerError as e:
- status.stop()
- raise click.ClickException(str(e))
-
-
@click.command(name="run")
-@requires_auth
@requires_init
@click.argument("label", required=True)
-@click.option("--web", is_flag=True, help="Automatically open the diff in the web app")
-@click.option("--apply-local", is_flag=True, help="Applies the generated diff to the repository")
+@click.option("--web", is_flag=True, help="Run the function on the web service instead of locally")
@click.option("--diff-preview", type=int, help="Show a preview of the first N lines of the diff")
@click.option("--arguments", type=str, help="Arguments as a json string to pass as the function's 'arguments' parameter")
-def run_command(session: CodegenSession, label: str, web: bool = False, apply_local: bool = False, diff_preview: int | None = None, arguments: str | None = None):
+def run_command(
+ session: CodegenSession,
+ label: str,
+ web: bool = False,
+ diff_preview: int | None = None,
+ arguments: str | None = None,
+):
"""Run a codegen function by its label."""
- # First try to find it as a stored codemod
- codemod = CodemodManager.get(label)
- if codemod:
- if codemod.arguments_type_schema and not arguments:
- raise click.ClickException(f"This function requires the --arguments parameter. Expected schema: {codemod.arguments_type_schema}")
-
- if codemod.arguments_type_schema and arguments:
- arguments_json = json.loads(arguments)
- is_valid = validate_json(codemod.arguments_type_schema, arguments_json)
- print(f"is_valid: {is_valid}")
+ # Get and validate the codemod
+ codemod = CodemodManager.get_codemod(label)
- run_function(session, codemod, web, apply_local, diff_preview)
- return
+ # Handle arguments if needed
+ if codemod.arguments_type_schema and not arguments:
+ raise click.ClickException(f"This function requires the --arguments parameter. Expected schema: {codemod.arguments_type_schema}")
- # If not found as a stored codemod, look for decorated functions
- functions = CodemodManager.get_decorated()
- print("found some functions", functions)
- matching = [f for f in functions if f.name == label]
+ if codemod.arguments_type_schema and arguments:
+ arguments_json = json.loads(arguments)
+ is_valid = validate_json(codemod.arguments_type_schema, arguments_json)
+ if not is_valid:
+ raise click.ClickException(f"Invalid arguments format. Expected schema: {codemod.arguments_type_schema}")
- if not matching:
- raise click.ClickException(f"No function found with label '{label}'")
+ # Run the codemod
+ if web:
+ from codegen.cli.commands.run.run_cloud import run_cloud
- if len(matching) > 1:
- # If multiple matches, show their locations
- rich.print(f"[yellow]Multiple functions found with label '{label}':[/yellow]")
- for func in matching:
- rich.print(f" • {func.filepath}")
- raise click.ClickException("Please specify the exact file with codegen run ")
+ run_cloud(session, codemod, diff_preview=diff_preview)
+ else:
+ from codegen.cli.commands.run.run_local import run_local
- run_function(session, matching[0], web, apply_local, diff_preview)
+ run_local(session, codemod, diff_preview=diff_preview)
diff --git a/src/codegen/cli/commands/run/run_cloud.py b/src/codegen/cli/commands/run/run_cloud.py
new file mode 100644
index 000000000..f2c38a971
--- /dev/null
+++ b/src/codegen/cli/commands/run/run_cloud.py
@@ -0,0 +1,112 @@
+import webbrowser
+
+import rich
+import rich_click as click
+from rich.panel import Panel
+
+from codegen.cli.api.client import RestAPI
+from codegen.cli.auth.session import CodegenSession
+from codegen.cli.errors import ServerError
+from codegen.cli.git.patch import apply_patch
+from codegen.cli.rich.codeblocks import format_command
+from codegen.cli.rich.spinners import create_spinner
+from codegen.cli.utils.url import generate_webapp_url
+
+
+def run_cloud(session: CodegenSession, function, apply_local: bool = False, diff_preview: int | None = None):
+ """Run a function on the cloud service.
+
+ Args:
+ session: The current codegen session
+ function: The function to run
+ apply_local: Whether to apply changes to the local filesystem
+ diff_preview: Number of diff lines to preview (None or 0 shows no preview)
+ """
+ with create_spinner(f"Running {function.name}...") as status:
+ try:
+ run_output = RestAPI(session.token).run(
+ function=function,
+ )
+
+ status.stop()
+ rich.print(f"✅ Ran {function.name} successfully")
+ if run_output.web_link:
+ # Extract the run ID and function ID from the web link
+ run_id = run_output.web_link.split("/run/")[1].split("/")[0]
+ function_id = run_output.web_link.split("/codemod/")[1].split("/")[0]
+
+ rich.print(" [dim]Web viewer:[/dim] [blue underline]" + run_output.web_link + "[/blue underline]")
+ run_details_url = generate_webapp_url(f"functions/{function_id}/run/{run_id}")
+ rich.print(f" [dim]Run details:[/dim] [blue underline]{run_details_url}[/blue underline]")
+
+ if run_output.logs:
+ rich.print("")
+ panel = Panel(run_output.logs, title="[bold]Logs[/bold]", border_style="blue", padding=(1, 2), expand=False)
+ rich.print(panel)
+
+ if run_output.error:
+ rich.print("")
+ panel = Panel(run_output.error, title="[bold]Error[/bold]", border_style="red", padding=(1, 2), expand=False)
+ rich.print(panel)
+
+ if run_output.observation:
+ # Only show diff preview if requested
+ if diff_preview:
+ rich.print("") # Add some spacing
+
+ # Split and limit diff to requested number of lines
+ diff_lines = run_output.observation.splitlines()
+ truncated = len(diff_lines) > diff_preview
+ limited_diff = "\n".join(diff_lines[:diff_preview])
+
+ if truncated:
+ if apply_local:
+ limited_diff += f"\n\n...\n\n[yellow]diff truncated to {diff_preview} lines, view the full change set in your local file system[/yellow]"
+ else:
+ limited_diff += (
+ f"\n\n...\n\n[yellow]diff truncated to {diff_preview} lines, view the full change set on your local file system after using run with `--apply-local`[/yellow]"
+ )
+
+ panel = Panel(limited_diff, title="[bold]Diff Preview[/bold]", border_style="blue", padding=(1, 2), expand=False)
+ rich.print(panel)
+
+ if not apply_local:
+ rich.print("")
+ rich.print("Apply changes locally:")
+ rich.print(format_command(f"codegen run {function.name} --apply-local"))
+ rich.print("Create a PR:")
+ rich.print(format_command(f"codegen run {function.name} --create-pr"))
+ else:
+ rich.print("")
+ rich.print("[yellow] No changes were produced by this codemod[/yellow]")
+
+ # Open web link in browser
+ if run_output.web_link:
+ webbrowser.open_new(run_output.web_link)
+
+ if apply_local and run_output.observation:
+ try:
+ apply_patch(session.git_repo, f"\n{run_output.observation}\n")
+ rich.print("")
+ rich.print("[green]✓ Changes have been applied to your local filesystem[/green]")
+ rich.print("[yellow]→ Don't forget to commit your changes:[/yellow]")
+ rich.print(format_command("git add ."))
+ rich.print(format_command("git commit -m 'Applied codemod changes'"))
+ except Exception as e:
+ rich.print("")
+ rich.print("[red]✗ Failed to apply changes locally[/red]")
+ rich.print("\n[yellow]This usually happens when you have uncommitted changes.[/yellow]")
+ rich.print("\nOption 1 - Save your changes:")
+ rich.print(" 1. [blue]git status[/blue] (check your working directory)")
+ rich.print(" 2. [blue]git add .[/blue] (stage your changes)")
+ rich.print(" 3. [blue]git commit -m 'msg'[/blue] (commit your changes)")
+ rich.print(" 4. Run this command again")
+ rich.print("\nOption 2 - Discard your changes:")
+ rich.print(" 1. [red]git reset --hard HEAD[/red] (⚠️ discards all uncommitted changes)")
+ rich.print(" 2. [red]git clean -fd[/red] (⚠️ removes all untracked files)")
+ rich.print(" 3. Run this command again\n")
+ raise click.ClickException("Failed to apply patch to local filesystem")
+
+ except ServerError as e:
+ status.stop()
+ raise click.ClickException(str(e))
diff --git a/src/codegen/cli/commands/run/run_local.py b/src/codegen/cli/commands/run/run_local.py
new file mode 100644
index 000000000..b87e15e74
--- /dev/null
+++ b/src/codegen/cli/commands/run/run_local.py
@@ -0,0 +1,71 @@
+from pathlib import Path
+
+import rich
+from rich.panel import Panel
+from rich.status import Status
+
+from codegen import Codebase
+from codegen.cli.auth.session import CodegenSession
+from codegen.cli.utils.function_finder import DecoratedFunction
+
+
+def parse_codebase(repo_root: Path) -> Codebase:
+ """Parse the codebase at the given root.
+
+ Args:
+ repo_root: Path to the repository root
+
+ Returns:
+ Parsed Codebase object
+ """
+ codebase = Codebase(repo_root)
+ return codebase
+
+
+def run_local(
+ session: CodegenSession,
+ function: DecoratedFunction,
+ diff_preview: int | None = None,
+) -> None:
+ """Run a function locally against the codebase.
+
+ Args:
+ session: The current codegen session
+ function: The function to run
+ diff_preview: Number of diff lines to preview (None or 0 shows no preview)
+ """
+ # Parse codebase and run
+ repo_root = Path(session.git_repo.workdir)
+
+ with Status("[bold]Parsing codebase...", spinner="dots") as status:
+ codebase = parse_codebase(repo_root)
+ status.update("[bold green]✓ Parsed codebase")
+
+ status.update("[bold]Running codemod...")
+ function.run(codebase) # Run the function
+ status.update("[bold green]✓ Completed codemod")
+
+ # Get the diff from the codebase
+ result = codebase.get_diff()
+
+ # Handle no changes case
+ if not result:
+ rich.print("\n[yellow]No changes were produced by this codemod[/yellow]")
+ return
+
+ # Show diff preview if requested
+ if diff_preview:
+ rich.print("") # Add spacing
+ diff_lines = result.splitlines()
+ truncated = len(diff_lines) > diff_preview
+ limited_diff = "\n".join(diff_lines[:diff_preview])
+
+ if truncated:
+ limited_diff += f"\n\n...\n\n[yellow]diff truncated to {diff_preview} lines[/yellow]"
+
+ panel = Panel(limited_diff, title="[bold]Diff Preview[/bold]", border_style="blue", padding=(1, 2), expand=False)
+ rich.print(panel)
+
+ # Report the outcome to the user
+ rich.print("")
+ rich.print("[green]✓ Changes have been applied to your local filesystem[/green]")
diff --git a/src/codegen/cli/sdk/decorator.py b/src/codegen/cli/sdk/decorator.py
index 74068ca70..e4e93e956 100644
--- a/src/codegen/cli/sdk/decorator.py
+++ b/src/codegen/cli/sdk/decorator.py
@@ -36,6 +36,8 @@ def __call__(self, func: Callable[P, T]) -> Callable[P, T]:
def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
return func(*args, **kwargs)
+ # Set the codegen name on the wrapper function
+ wrapper.__codegen_name__ = self.name
self.func = wrapper
return wrapper
diff --git a/src/codegen/cli/utils/codemod_manager.py b/src/codegen/cli/utils/codemod_manager.py
index 14353c6df..9361ceda4 100644
--- a/src/codegen/cli/utils/codemod_manager.py
+++ b/src/codegen/cli/utils/codemod_manager.py
@@ -1,6 +1,8 @@
import builtins
from pathlib import Path
+import rich_click as click
+
from codegen.cli.utils.function_finder import DecoratedFunction, find_codegen_functions
@@ -25,6 +27,39 @@ class CodemodManager:
def get_valid_name(name: str) -> str:
return name.lower().replace(" ", "_").replace("-", "_")
+ @classmethod
+ def get_codemod(cls, name: str, start_path: Path | None = None) -> DecoratedFunction:
+ """Get and validate a codemod by name.
+
+ Args:
+ name: Name of the codemod to find
+ start_path: Directory to start searching from (default: current directory)
+
+ Returns:
+ The validated DecoratedFunction
+
+ Raises:
+ click.ClickException: If codemod can't be found or loaded
+ """
+ # First try to find the codemod
+ codemod = cls.get(name, start_path)
+ if not codemod:
+ # If not found, check if any codemods exist
+ all_codemods = cls.list(start_path)
+ if not all_codemods:
+ raise click.ClickException("No codemods found. Create one with:\n" + " codegen create my-codemod")
+ else:
+ available = "\n ".join(f"- {c.name}" for c in all_codemods)
+ raise click.ClickException(f"Codemod '{name}' not found. Available codemods:\n {available}")
+
+ # Verify we can import it
+ try:
+ # This will raise ValueError if function can't be imported
+ codemod.validate()
+ return codemod
+ except Exception as e:
+ raise click.ClickException(f"Error loading codemod '{name}': {e!s}")
+
@classmethod
def list(cls, start_path: Path | None = None) -> builtins.list[DecoratedFunction]:
"""List all codegen decorated functions in Python files under the given path.
@@ -81,48 +116,25 @@ def get_decorated(cls, start_path: Path | None = None) -> builtins.list[Decorate
if start_path is None:
start_path = Path.cwd()
- # Directories to skip
- SKIP_DIRS = {
- "__pycache__",
- "node_modules",
- ".git",
- ".hg",
- ".svn",
- ".tox",
- ".venv",
- "venv",
- "env",
- "build",
- "dist",
- "site-packages",
- ".pytest_cache",
- ".mypy_cache",
- ".ruff_cache",
- ".coverage",
- "htmlcov",
- ".codegen",
- }
+ # Look only in .codegen/codemods
+ codemods_dir = start_path / ".codegen" / "codemods"
+ if not codemods_dir.exists():
+ return []
all_functions = []
- if start_path.is_file():
- # If it's a file, just check that one
- if start_path.suffix == ".py" and _might_have_decorators(start_path):
+ seen_paths = set() # Track unique file paths
+
+ for path in codemods_dir.rglob("*.py"):
+ # Skip if we've already processed this file
+ if path in seen_paths:
+ continue
+ seen_paths.add(path)
+
+ if _might_have_decorators(path):
try:
- functions = find_codegen_functions(start_path)
+ functions = find_codegen_functions(path)
all_functions.extend(functions)
- except Exception as e:
+ except Exception:
pass # Skip files we can't parse
- else:
- # Walk the directory tree, skipping irrelevant directories
- for path in start_path.rglob("*.py"):
- # Skip if any parent directory is in SKIP_DIRS
- if any(part in SKIP_DIRS for part in path.parts):
- continue
-
- if _might_have_decorators(path):
- try:
- functions = find_codegen_functions(path)
- all_functions.extend(functions)
- except Exception as e:
- pass # Skip files we can't parse
+
return all_functions
diff --git a/src/codegen/cli/utils/function_finder.py b/src/codegen/cli/utils/function_finder.py
index 7d63b35bf..1874fbbe0 100644
--- a/src/codegen/cli/utils/function_finder.py
+++ b/src/codegen/cli/utils/function_finder.py
@@ -18,6 +18,60 @@ class DecoratedFunction:
parameters: list[tuple[str, str | None]] = dataclasses.field(default_factory=list)
arguments_type_schema: dict | None = None
+ def run(self, codebase) -> str | None:
+ """Import and run the actual function from its file.
+
+ Args:
+ codebase: The codebase to run the function on
+
+ Returns:
+ The result of running the function (usually a diff string)
+ """
+ if not self.filepath:
+ raise ValueError("Cannot run function without filepath")
+
+ # Import the module containing the function
+ spec = importlib.util.spec_from_file_location("module", self.filepath)
+ if not spec or not spec.loader:
+ raise ImportError(f"Could not load module from {self.filepath}")
+
+ module = importlib.util.module_from_spec(spec)
+ spec.loader.exec_module(module)
+
+ # Find the decorated function
+ for item_name in dir(module):
+ item = getattr(module, item_name)
+ if hasattr(item, "__codegen_name__") and item.__codegen_name__ == self.name:
+ # Found our function, run it
+ return item(codebase)
+
+ raise ValueError(f"Could not find function '{self.name}' in {self.filepath}")
+
+ def validate(self) -> None:
+ """Verify that this function can be imported and accessed.
+
+ Raises:
+ ValueError, ImportError: If the function can't be found or its module can't be loaded
+ """
+ if not self.filepath:
+ raise ValueError("Cannot validate function without filepath")
+
+ # Import the module containing the function
+ spec = importlib.util.spec_from_file_location("module", self.filepath)
+ if not spec or not spec.loader:
+ raise ImportError(f"Could not load module from {self.filepath}")
+
+ module = importlib.util.module_from_spec(spec)
+ spec.loader.exec_module(module)
+
+ # Find the decorated function
+ for item_name in dir(module):
+ item = getattr(module, item_name)
+ if hasattr(item, "__codegen_name__") and item.__codegen_name__ == self.name:
+ return # Found it!
+
+ raise ValueError(f"Could not find function '{self.name}' in {self.filepath}")
+
class CodegenFunctionVisitor(ast.NodeVisitor):
def __init__(self):
diff --git a/src/codegen/cli/workspace/decorators.py b/src/codegen/cli/workspace/decorators.py
index fbce7a1d1..33fa0d8a3 100644
--- a/src/codegen/cli/workspace/decorators.py
+++ b/src/codegen/cli/workspace/decorators.py
@@ -16,9 +16,11 @@ def requires_init(f: Callable) -> Callable:
@functools.wraps(f)
def wrapper(*args, **kwargs):
- session: CodegenSession | None = kwargs.get("session")
+ # Create a session if one wasn't provided
+ session = kwargs.get("session")
if not session:
- raise ValueError("@requires_init must be used after @requires_auth")
+ session = CodegenSession()
+ kwargs["session"] = session
if not session.codegen_dir.exists():
rich.print("Codegen not initialized. Running init command first...")
diff --git a/src/codegen/cli/workspace/initialize_workspace.py b/src/codegen/cli/workspace/initialize_workspace.py
index 198d28319..8c7839d20 100644
--- a/src/codegen/cli/workspace/initialize_workspace.py
+++ b/src/codegen/cli/workspace/initialize_workspace.py
@@ -77,6 +77,7 @@ def initialize_codegen(
EXAMPLES_FOLDER = REPO_PATH / EXAMPLES_DIR
CONFIG_PATH = CODEGEN_FOLDER / "config.toml"
JUPYTER_DIR = CODEGEN_FOLDER / "jupyter"
+ CODEMODS_DIR = CODEGEN_FOLDER / "codemods"
# If status is a string, create a new spinner
context = create_spinner(f" {status} folders...") if isinstance(status, str) else nullcontext()
@@ -88,6 +89,7 @@ def initialize_codegen(
CODEGEN_FOLDER.mkdir(parents=True, exist_ok=True)
PROMPTS_FOLDER.mkdir(parents=True, exist_ok=True)
JUPYTER_DIR.mkdir(parents=True, exist_ok=True)
+ CODEMODS_DIR.mkdir(parents=True, exist_ok=True)
if not repo:
rich.print("No git repository found. Please run this command in a git repository.")
@@ -140,7 +142,35 @@ def add_to_gitignore_if_not_present(gitignore: Path, line: str):
def modify_gitignore(codegen_folder: Path):
+ """Update .gitignore to track only specific Codegen files."""
gitignore_path = codegen_folder / ".gitignore"
- add_to_gitignore_if_not_present(gitignore_path, "prompts")
- add_to_gitignore_if_not_present(gitignore_path, "docs")
- add_to_gitignore_if_not_present(gitignore_path, "examples")
+
+ # Define what should be ignored (everything except config.toml and codemods)
+ ignore_patterns = [
+ "# Codegen",
+ "docs/",
+ "examples/",
+ "prompts/",
+ "jupyter/",
+ "",
+ "# Keep config.toml and codemods",
+ "!config.toml",
+ "!codemods/",
+ "!codemods/**",
+ ]
+
+ # Write or update .gitignore
+ if not gitignore_path.exists():
+ gitignore_path.write_text("\n".join(ignore_patterns))
+ else:
+ # Read existing content
+ content = gitignore_path.read_text()
+
+ # Check if our section already exists
+ if "# Codegen" not in content:
+ # Add a newline if the file doesn't end with one
+ if content and not content.endswith("\n"):
+ content += "\n"
+ # Add our patterns
+ content += "\n" + "\n".join(ignore_patterns) + "\n"
+ gitignore_path.write_text(content)