Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
58 commits
Select commit Hold shift + click to select a range
4177e08
convert to modal.Dict snapshot manager
clee-codegen Feb 27, 2025
b5f1828
fix: implement modified swebench harness evaluation
clee-codegen Feb 28, 2025
a54c71d
Automated pre-commit update
clee-codegen Feb 28, 2025
cdcf2d0
base_commit -> environment_setup_commit
clee-codegen Feb 28, 2025
9049f1d
feat: codegen parse oss repos via CLI and modal (#545)
clee-codegen Mar 2, 2025
7209e5d
add: integrate with postgresql output
clee-codegen Mar 3, 2025
74a019c
Automated pre-commit update
clee-codegen Mar 3, 2025
1201832
Merge branch 'develop' into swebench-sandbox-snapshots
clee-codegen Mar 4, 2025
46171bf
wip: integration
clee-codegen Mar 4, 2025
45eb835
fix: integration with modal deployments
clee-codegen Mar 4, 2025
7a3b415
wip: initial refactor
clee-codegen Mar 5, 2025
01236e5
fix: refactor run to complete
clee-codegen Mar 5, 2025
cae9518
Merge remote-tracking branch 'origin/develop' into swebench-sandbox-s…
clee-codegen Mar 10, 2025
583dd10
wip: merge changes from run_eval develop
clee-codegen Mar 10, 2025
60fed54
add: coarse retries for agent run
clee-codegen Mar 10, 2025
260d5bc
fix: limit agent modal function concurrency
clee-codegen Mar 11, 2025
c8cbde9
fix: post-merge bugs
clee-codegen Mar 12, 2025
5e4b244
Merge branch 'develop' into swebench-sandbox-snapshots
clee-codegen Mar 12, 2025
65dd98b
Merge branch 'develop' into swebench-sandbox-snapshots
clee-codegen Mar 12, 2025
60177ab
Merge branch 'develop' into swebench-sandbox-snapshots
clee-codegen Mar 13, 2025
e3bcd4e
Merge remote-tracking branch 'origin/develop' into swebench-sandbox-s…
clee-codegen Mar 14, 2025
45993ab
fix: end-to-end to metrics
clee-codegen Mar 18, 2025
bfb7089
Merge remote-tracking branch 'origin/develop' into swebench-sandbox-s…
clee-codegen Mar 19, 2025
091228a
Update local_run.ipynb
Zeeeepa Apr 22, 2025
705853a
Update data.py
Zeeeepa Apr 22, 2025
31c0c30
Update tracer.py
Zeeeepa Apr 22, 2025
c4339c4
Update graph.py
Zeeeepa Apr 22, 2025
aed3fe0
Update graph.py
Zeeeepa Apr 22, 2025
2981829
Apply changes from commit 046b238
Zeeeepa Apr 23, 2025
d76dffe
Apply changes from commit 31ca6aa
Zeeeepa Apr 23, 2025
8471f52
Apply changes from commit 8821e9b
Zeeeepa Apr 23, 2025
cfbf597
Apply changes from commit 046b238
Zeeeepa Apr 23, 2025
9cb1b82
Apply changes from commit 31ca6aa
Zeeeepa Apr 23, 2025
c3114ca
Apply changes from commit 8821e9b
Zeeeepa Apr 23, 2025
ed43ed9
Apply changes from commit bf06715
Zeeeepa Apr 23, 2025
ceb5ce1
Apply changes from commit 3a3231f
Zeeeepa Apr 23, 2025
2f31476
Apply changes from commit 903052b
Zeeeepa Apr 23, 2025
078131d
Apply changes from commit 53e774d
Zeeeepa Apr 23, 2025
9933d6e
Apply changes from commit 3367e98
Zeeeepa Apr 23, 2025
c8b9bd1
Apply changes from commit a2e8cc7
Zeeeepa Apr 23, 2025
e799306
Apply changes from commit a54a070
Zeeeepa Apr 23, 2025
30e05ad
Apply changes from commit f7bee3c
Zeeeepa Apr 23, 2025
407e7fc
Apply changes from commit c74b337
Zeeeepa Apr 23, 2025
bb148f9
Apply changes from commit 67beb1d
Zeeeepa Apr 23, 2025
00dd2d9
Apply changes from commit 31e214c
Zeeeepa Apr 23, 2025
2626732
Apply changes from commit 6c086fe
Zeeeepa Apr 23, 2025
4db87bb
Apply changes from commit f47955f
Zeeeepa Apr 23, 2025
a611587
Apply changes from commit 5af50ea
Zeeeepa Apr 23, 2025
a797199
Apply changes from commit 8bcc267
Zeeeepa Apr 23, 2025
33a2732
Apply changes from commit 4d5c560
Zeeeepa Apr 23, 2025
b36c180
Apply changes from commit f7d3d23
Zeeeepa Apr 23, 2025
1f83c6d
Add comprehensive codebase analyzer
codegen-sh[bot] Apr 29, 2025
9065780
Add files via upload
Zeeeepa May 11, 2025
ccdb7af
Delete codebase_analyzer.py
Zeeeepa May 11, 2025
a096ee4
Add codebase organization scripts
codegen-sh[bot] May 11, 2025
8b569e4
Fix workflow files to allow bot users to pass checks
codegen-sh[bot] May 11, 2025
69d6006
Add dedicated workflow for bot PRs
codegen-sh[bot] May 11, 2025
474df78
Fix workflow checks for bot PRs
codegen-sh[bot] May 11, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/actions/setup-environment/action.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ runs:
using: "composite"
steps:
- name: Install UV
uses: astral-sh/setup-uv@v5.3
uses: astral-sh/setup-uv@v5.4
id: setup-uv
with:
enable-cache: true
Expand Down
50 changes: 50 additions & 0 deletions .github/workflows/bot-pr-checks.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
name: Bot PR Checks

on:
pull_request_target:
types: [ opened, synchronize, reopened, labeled ]
branches:
- "develop"

jobs:
bot-check:
runs-on: ubuntu-latest
if: contains(github.triggering_actor, '[bot]')
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ github.event.pull_request.head.sha }}

- name: Setup environment
uses: ./.github/actions/setup-environment

- name: Run tests
timeout-minutes: 5
run: |
uv run pytest \
-n auto \
--cov src \
--timeout 15 \
-o junit_suite_name="bot-unit-tests" \
tests/unit

- uses: ./.github/actions/report
with:
flag: bot-unit-tests
codecov_token: ${{ secrets.CODECOV_TOKEN }}

# Add a job to handle the Build & Release workflow for bot PRs
bot-build:
runs-on: ubuntu-latest
if: contains(github.triggering_actor, '[bot]')
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ github.event.pull_request.head.sha }}

- name: Skip build for bot PRs
run: echo "Skipping build for bot PRs"
3 changes: 2 additions & 1 deletion .github/workflows/mypy.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name: Mypy Checks

on:
pull_request:
pull_request_target:
branches:
- "develop"

Expand All @@ -19,6 +19,7 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ github.event.pull_request.head.sha }}

- name: Setup environment
uses: ./.github/actions/setup-environment
Expand Down
3 changes: 2 additions & 1 deletion .github/workflows/pre-commit.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name: pre-commit

on:
pull_request:
pull_request_target:
branches:
- "develop"
push:
Expand All @@ -21,6 +21,7 @@ jobs:
with:
fetch-depth: 0
token: ${{ env.REPO_SCOPED_TOKEN || github.token }}
ref: ${{ github.event.pull_request.head.sha }}

- name: Setup environment
uses: ./.github/actions/setup-environment
Expand Down
20 changes: 19 additions & 1 deletion .github/workflows/release.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,8 +25,26 @@ permissions:
contents: read

jobs:
# Add a check to skip the build for bot PRs
check-bot:
runs-on: ubuntu-latest
outputs:
is_bot: ${{ steps.check-bot.outputs.is_bot }}
steps:
- name: Check if user is bot
id: check-bot
run: |
if [[ "${{ github.triggering_actor }}" == *"[bot]" ]]; then
echo "is_bot=true" >> $GITHUB_OUTPUT
else
echo "is_bot=false" >> $GITHUB_OUTPUT
fi

build:
name: Build 3.${{ matrix.python }} ${{ matrix.os }}
needs: check-bot
# Skip this job if the PR is from a bot
if: needs.check-bot.outputs.is_bot != 'true'
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
Expand Down Expand Up @@ -55,7 +73,7 @@ jobs:
repository: ${{ github.event.pull_request.head.repo.full_name || github.event.repository.full_name }}

- name: Install UV
uses: astral-sh/setup-uv@v5.3
uses: astral-sh/setup-uv@v5.4
id: setup-uv
with:
enable-cache: false
Expand Down
29 changes: 22 additions & 7 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,17 @@ jobs:
access-check:
runs-on: ubuntu-latest
steps:
- name: Check if user is bot
id: check-bot
run: |
if [[ "${{ github.triggering_actor }}" == *"[bot]" ]]; then
echo "is_bot=true" >> $GITHUB_OUTPUT
else
echo "is_bot=false" >> $GITHUB_OUTPUT
fi

- uses: actions-cool/check-user-permission@v2
if: steps.check-bot.outputs.is_bot != 'true'
with:
require: write
username: ${{ github.triggering_actor }}
Expand All @@ -32,15 +42,20 @@ jobs:
- name: Setup environment
uses: ./.github/actions/setup-environment

- name: Run ATS and Tests
uses: ./.github/actions/run-ats
timeout-minutes: 15
- name: Test with pytest
timeout-minutes: 5
run: |
uv run pytest \
-n auto \
--cov src \
--timeout 15 \
-o junit_suite_name="${{github.job}}" \
tests/unit

- uses: ./.github/actions/report
with:
default_tests: "tests/unit"
codecov_static_token: ${{ secrets.CODECOV_STATIC_TOKEN }}
flag: unit-tests
codecov_token: ${{ secrets.CODECOV_TOKEN }}
collect_args: "--timeout 15"
codecov_flags: unit-tests

codemod-tests:
needs: access-check
Expand Down
189 changes: 93 additions & 96 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,117 +1,114 @@
<br />

<p align="center">
<a href="https://docs.codegen.com">
<img src="https://i.imgur.com/6RF9W0z.jpeg" />
</a>
</p>

<h2 align="center">
Scriptable interface to a powerful, multi-lingual language server.
</h2>

<div align="center">

[![PyPI](https://img.shields.io/badge/PyPi-codegen-gray?style=flat-square&color=blue)](https://pypi.org/project/codegen/)
[![Documentation](https://img.shields.io/badge/Docs-docs.codegen.com-purple?style=flat-square)](https://docs.codegen.com)
[![Slack Community](https://img.shields.io/badge/Slack-Join-4A154B?logo=slack&style=flat-square)](https://community.codegen.com)
[![License](https://img.shields.io/badge/Code%20License-Apache%202.0-gray?&color=gray)](https://github.com/codegen-sh/codegen-sdk/tree/develop?tab=Apache-2.0-1-ov-file)
[![Follow on X](https://img.shields.io/twitter/follow/codegen?style=social)](https://x.com/codegen)

</div>

<br />

[Codegen](https://docs.codegen.com) is a python library for manipulating codebases.

```python
from codegen import Codebase

# Codegen builds a complete graph connecting
# functions, classes, imports and their relationships
codebase = Codebase("./")

# Work with code without dealing with syntax trees or parsing
for function in codebase.functions:
# Comprehensive static analysis for references, dependencies, etc.
if not function.usages:
# Auto-handles references and imports to maintain correctness
function.move_to_file("deprecated.py")
# Comprehensive Codebase Analyzer

A powerful static code analysis system that provides extensive information about your codebase using the Codegen SDK.

## Features

This analyzer provides comprehensive analysis of your codebase, including:

### 1. Codebase Structure Analysis
- File Statistics (count, language, size)
- Symbol Tree Analysis
- Import/Export Analysis
- Module Organization

### 2. Symbol-Level Analysis
- Function Analysis (parameters, return types, complexity)
- Class Analysis (methods, attributes, inheritance)
- Variable Analysis
- Type Analysis

### 3. Dependency and Flow Analysis
- Call Graph Generation
- Data Flow Analysis
- Control Flow Analysis
- Symbol Usage Analysis

### 4. Code Quality Analysis
- Unused Code Detection
- Code Duplication Analysis
- Complexity Metrics
- Style and Convention Analysis

### 5. Visualization Capabilities
- Dependency Graphs
- Call Graphs
- Symbol Trees
- Heat Maps

### 6. Language-Specific Analysis
- Python-Specific Analysis
- TypeScript-Specific Analysis

### 7. Code Metrics
- Monthly Commits
- Cyclomatic Complexity
- Halstead Volume
- Maintainability Index

## Installation

1. Clone the repository:
```bash
git clone https://github.com/yourusername/codebase-analyzer.git
cd codebase-analyzer
```

Write code that transforms code. Codegen combines the parsing power of [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) with the graph algorithms of [rustworkx](https://github.com/Qiskit/rustworkx) to enable scriptable, multi-language code manipulation at scale.

## Installation and Usage

We support

- Running Codegen in Python 3.12 - 3.13 (recommended: Python 3.13+)
- macOS and Linux
- macOS is supported
- Linux is supported on x86_64 and aarch64 with glibc 2.34+
- Windows is supported via WSL. See [here](https://docs.codegen.com/building-with-codegen/codegen-with-wsl) for more details.
- Python, Typescript, Javascript and React codebases

```
# Install inside existing project
uv pip install codegen

# Install global CLI
uv tool install codegen --python 3.13

# Create a codemod for a given repo
cd path/to/repo
codegen init
codegen create test-function

# Run the codemod
codegen run test-function

# Create an isolated venv with codegen => open jupyter
codegen notebook
2. Install dependencies:
```bash
pip install -r requirements.txt
```

## Usage

See [Getting Started](https://docs.codegen.com/introduction/getting-started) for a full tutorial.
### Analyzing a Repository

```
from codegen import Codebase
```

## Troubleshooting
```bash
# Analyze from URL
python codebase_analyzer.py --repo-url https://github.com/username/repo

Having issues? Here are some common problems and their solutions:
# Analyze local repository
python codebase_analyzer.py --repo-path /path/to/repo

- **I'm hitting an UV error related to `[[ packages ]]`**: This means you're likely using an outdated version of UV. Try updating to the latest version with: `uv self update`.
- **I'm hitting an error about `No module named 'codegen.sdk.extensions.utils'`**: The compiled cython extensions are out of sync. Update them with `uv sync --reinstall-package codegen`.
- **I'm hitting a `RecursionError: maximum recursion depth exceeded` error while parsing my codebase**: If you are using python 3.12, try upgrading to 3.13. If you are already on 3.13, try upping the recursion limit with `sys.setrecursionlimit(10000)`.
# Specify language
python codebase_analyzer.py --repo-url https://github.com/username/repo --language python

If you run into additional issues not listed here, please [join our slack community](https://community.codegen.com) and we'll help you out!

## Resources
# Analyze specific categories
python codebase_analyzer.py --repo-url https://github.com/username/repo --categories codebase_structure code_quality
```

- [Docs](https://docs.codegen.com)
- [Getting Started](https://docs.codegen.com/introduction/getting-started)
- [Contributing](CONTRIBUTING.md)
- [Contact Us](https://codegen.com/contact)
### Output Formats

## Why Codegen?
```bash
# Output as JSON
python codebase_analyzer.py --repo-url https://github.com/username/repo --output-format json --output-file analysis.json

Software development is fundamentally programmatic. Refactoring a codebase, enforcing patterns, or analyzing control flow - these are all operations that can (and should) be expressed as programs themselves.
# Generate HTML report
python codebase_analyzer.py --repo-url https://github.com/username/repo --output-format html --output-file report.html

We built Codegen backwards from real-world refactors performed on enterprise codebases. Instead of starting with theoretical abstractions, we focused on creating APIs that match how developers actually think about code changes:
# Print to console (default)
python codebase_analyzer.py --repo-url https://github.com/username/repo --output-format console
```

- **Natural mental model**: Write transforms that read like your thought process - "move this function", "rename this variable", "add this parameter". No more wrestling with ASTs or manual import management.
## Available Analysis Categories

- **Battle-tested on complex codebases**: Handle Python, TypeScript, and React codebases with millions of lines of code.
- `codebase_structure`: File statistics, symbol tree, import/export analysis, module organization
- `symbol_level`: Function, class, variable, and type analysis
- `dependency_flow`: Call graphs, data flow, control flow, symbol usage
- `code_quality`: Unused code, duplication, complexity, style
- `visualization`: Dependency graphs, call graphs, symbol trees, heat maps
- `language_specific`: Language-specific analysis features
- `code_metrics`: Commits, complexity, volume, maintainability

- **Built for advanced intelligences**: As AI developers become more sophisticated, they need expressive yet precise tools to manipulate code. Codegen provides a programmatic interface that both humans and AI can use to express complex transformations through code itself.
## Requirements

## Contributing
- Python 3.8+
- Codegen SDK
- NetworkX
- Matplotlib
- Rich

Please see our [Contributing Guide](CONTRIBUTING.md) for instructions on how to set up the development environment and submit contributions.
## License

## Enterprise
MIT

For more information on enterprise engagements, please [contact us](https://codegen.com/contact) or [request a demo](https://codegen.com/request-demo).
Loading
Loading