5 changes: 5 additions & 0 deletions .github/workflows/test.yml
@@ -15,10 +15,15 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions-cool/check-user-permission@v2
        if: github.triggering_actor != 'codegen-sh[bot]'
        with:
          require: write
          username: ${{ github.triggering_actor }}
          error-if-missing: true
      # Skip permission check for codegen-sh[bot]
      - name: Skip permission check for bot
        if: github.triggering_actor == 'codegen-sh[bot]'
        run: echo "Skipping permission check for codegen-sh[bot]"

  unit-tests:
    needs: access-check
183 changes: 183 additions & 0 deletions codegen-on-oss/codegen_on_oss/analysis/README.md
@@ -0,0 +1,183 @@
# Code Analysis Module with Error Context

This module provides code analysis for a codebase, with a focus on detecting errors and reporting the context around each one.

## Overview

The code analysis module consists of several components:

1. **CodeAnalyzer**: The main class that integrates all analysis components and provides a unified interface.
2. **ErrorContextAnalyzer**: A specialized class for detecting and analyzing errors in code.
3. **CodeError**: A class representing an error in code with detailed context information.
4. **API Endpoints**: FastAPI endpoints for accessing the analysis functionality.

## Features

### Code Structure Analysis

- Analyze codebase structure and dependencies
- Generate dependency graphs for files and symbols
- Analyze import relationships and detect circular imports
- Get detailed information about files, functions, classes, and symbols
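
Circular import detection amounts to finding a cycle in the file-level import graph. The following is a minimal sketch of that idea, assuming the graph is already available as a plain dict mapping each file to the files it imports. The `find_import_cycle` helper and the dict representation are hypothetical and are not taken from the module's actual implementation.

```python
from typing import Dict, List, Optional


def find_import_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one import cycle as a list of files, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the slice from dep
                # to the end of the path, closed with dep, is a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The depth-first search reports only the first cycle it encounters. Listing every cycle would need a different approach, such as computing strongly connected components.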

### Error Detection and Analysis

- Detect syntax errors, type errors, parameter errors, and the other categories listed under Error Types
- Analyze function parameters and return statements for errors
- Detect undefined variables and unused imports
- Find circular dependencies between symbols
- Provide detailed context information for errors

### API Endpoints

- `/analyze_repo`: Analyze a repository and return various metrics
- `/analyze_symbol`: Analyze a symbol and return detailed information
- `/analyze_file`: Analyze a file and return detailed information
- `/analyze_function`: Analyze a function and return detailed information
- `/analyze_errors`: Analyze errors in a repository, file, or function

## Error Types

The module can detect the following types of errors:

- **Syntax Errors**: Invalid syntax in code
- **Type Errors**: Type mismatches in expressions
- **Parameter Errors**: Incorrect function parameters
- **Call Errors**: Incorrect function calls
- **Undefined Variables**: Variables used without being defined
- **Unused Imports**: Imports that are not used in the code
- **Circular Imports**: Circular dependencies between files
- **Circular Dependencies**: Circular dependencies between symbols
- **Name Errors**: References to undefined names
- **Import Errors**: Problems with import statements
- **Attribute Errors**: References to undefined attributes

## Error Severity Levels

The module assigns severity levels to each error:

- **Critical**: Errors that will definitely cause the code to crash or fail
- **High**: Errors that are likely to cause problems in most execution paths
- **Medium**: Errors that may cause problems in some execution paths
- **Low**: Minor issues that are unlikely to cause problems but should be fixed
- **Info**: Informational messages about potential improvements
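
Because the levels are ordered, a consumer can filter results by a minimum severity. The sketch below assumes errors arrive as dicts with a lowercase `"severity"` string, as in the example output in this README. The `Severity` enum and the `at_least` helper are hypothetical names introduced for illustration.

```python
from enum import IntEnum
from typing import Dict, List


class Severity(IntEnum):
    """Hypothetical ordering of the five levels, lowest to highest."""
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def at_least(errors: List[Dict], threshold: Severity) -> List[Dict]:
    """Keep errors whose "severity" string ranks at or above the threshold."""
    return [e for e in errors if Severity[e["severity"].upper()] >= threshold]
```

For example, `at_least(errors, Severity.HIGH)` would keep only the `high` and `critical` entries.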

## Usage

### Using the CodeAnalyzer

```python
from codegen import Codebase
from codegen_on_oss.analysis.analysis import CodeAnalyzer

# Create a codebase from a repository
codebase = Codebase.from_repo("owner/repo")

# Create an analyzer
analyzer = CodeAnalyzer(codebase)

# Analyze errors in the codebase
errors = analyzer.analyze_errors()

# Get detailed error context for a function
function_errors = analyzer.get_function_error_context("function_name")

# Get detailed error context for a file
file_errors = analyzer.get_file_error_context("path/to/file.py")
```

### Using the API

```bash
# Analyze a repository
curl -X POST "http://localhost:8000/analyze_repo" \
  -H "Content-Type: application/json" \
  -d '{"repo_url": "owner/repo"}'

# Analyze errors in a function
curl -X POST "http://localhost:8000/analyze_function" \
  -H "Content-Type: application/json" \
  -d '{"repo_url": "owner/repo", "function_name": "function_name"}'

# Analyze errors in a file
curl -X POST "http://localhost:8000/analyze_file" \
  -H "Content-Type: application/json" \
  -d '{"repo_url": "owner/repo", "file_path": "path/to/file.py"}'
```
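
The same requests can be issued from Python using only the standard library. This is a sketch that mirrors the curl calls above: the endpoint names and payload keys come from those examples, while the `build_analysis_request` helper is a hypothetical name and the response shape is not assumed.

```python
import json
import urllib.request


def build_analysis_request(
    endpoint: str,
    payload: dict,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build a JSON POST request for one of the analysis endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{endpoint.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request requires the server to be running locally:
#
#   req = build_analysis_request(
#       "analyze_function",
#       {"repo_url": "owner/repo", "function_name": "function_name"},
#   )
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```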

## Error Context Example

Here's an example of the error context information provided for a function:

```json
{
  "function_name": "calculate_total",
  "file_path": "app/utils.py",
  "errors": [
    {
      "error_type": "parameter_error",
      "message": "Function 'calculate_discount' called with 1 argument but expects 2",
      "line_number": 15,
      "severity": "high",
      "context_lines": {
        "13": "def calculate_total(items):",
        "14": "    total = sum(item.price for item in items)",
        "15": "    discount = calculate_discount(total)",
        "16": "    return total - discount",
        "17": ""
      },
      "suggested_fix": "Update call to provide 2 arguments: calculate_discount(total, discount_percent)"
    }
  ],
  "callers": [
    {"name": "process_order"}
  ],
  "callees": [
    {"name": "calculate_discount"}
  ],
  "parameters": [
    {
      "name": "items",
      "type": "List[Item]",
      "default": null
    }
  ],
  "return_info": {
    "type": "float",
    "statements": ["total - discount"]
  }
}
```
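
A consumer of this structure might flatten it into one line per error, for example for CI logs. The sketch below relies only on the keys shown in the example above. The `summarize_errors` helper is a hypothetical name, not part of the module.

```python
from typing import Dict, List


def summarize_errors(context: Dict) -> List[str]:
    """Format each error as 'file:line [severity] message'."""
    lines = []
    for err in context.get("errors", []):
        lines.append(
            f'{context["file_path"]}:{err["line_number"]} '
            f'[{err["severity"]}] {err["message"]}'
        )
    return lines
```

The input would typically come from `json.loads` applied to an API response body.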

## Implementation Details

### ErrorContextAnalyzer

The `ErrorContextAnalyzer` class is responsible for detecting and analyzing errors in code. It combines several techniques:

- **AST Analysis**: Parsing the code into an abstract syntax tree to detect syntax errors and undefined variables
- **Graph Analysis**: Building dependency graphs to detect circular imports and dependencies
- **Pattern Matching**: Using regular expressions to detect potential type errors and other issues
- **Static Analysis**: Analyzing function parameters, return statements, and variable usage
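
To make the AST-based technique concrete, here is a minimal sketch built on Python's standard `ast` module. It reports a syntax error if parsing fails and otherwise flags imports whose bound name is never referenced. The `check_source` function and the `"syntax_error"` / `"unused_import"` labels are assumptions for illustration and do not reflect how `ErrorContextAnalyzer` is actually written.

```python
import ast
from typing import Dict, List, Set, Tuple


def check_source(source: str) -> List[Tuple[str, int, str]]:
    """Return (error_type, line_number, message) tuples for one source string."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Parsing stops at the first syntax error, so only one is reported.
        return [("syntax_error", exc.lineno or 0, exc.msg)]

    imported: Dict[str, int] = {}  # bound name -> line of the import statement
    used: Set[str] = set()         # every bare name referenced in the module
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds "a"; "import a.b as c" binds "c".
                imported[alias.asname or alias.name.split(".")[0]] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports bind unknown names; skip them
                    imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)

    return [
        ("unused_import", lineno, f"'{name}' is imported but never used")
        for name, lineno in sorted(imported.items(), key=lambda item: item[1])
        if name not in used
    ]
```

This sketch ignores scoping and names re-exported through `__all__`, so it can produce both false positives and false negatives on real code.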

### CodeError

The `CodeError` class represents an error in code with detailed context information. It includes:

- **Error Type**: The type of error (syntax, type, parameter, etc.)
- **Message**: A descriptive message explaining the error
- **Location**: The file path and line number where the error occurs
- **Severity**: The severity of the error (critical, high, medium, low, info)
- **Context Lines**: The lines of code surrounding the error
- **Suggested Fix**: A suggested fix for the error
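
The fields listed above map naturally onto a dataclass. The following is a sketch of one possible shape, not the module's actual class: the `CodeErrorSketch` name, the field names, and the defaults are assumptions chosen to line up with the JSON example earlier in this README.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class CodeErrorSketch:
    """Hypothetical container mirroring the fields described for CodeError."""
    error_type: str
    message: str
    file_path: str
    line_number: int
    severity: str = "medium"  # assumed default; the real default is not documented here
    context_lines: Dict[int, str] = field(default_factory=dict)
    suggested_fix: Optional[str] = None

    def to_dict(self) -> dict:
        """Convert to a JSON-friendly dict."""
        data = asdict(self)
        # JSON object keys must be strings, so line numbers become "13", "14", ...
        data["context_lines"] = {str(k): v for k, v in self.context_lines.items()}
        return data
```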

## Running the API Server

To run the API server locally:

```bash
cd codegen-on-oss
python -m codegen_on_oss.analysis.analysis
```

The server will be available at `http://localhost:8000`.