Commit a6c9ab7

docs: contributing.md first pass (#14)
1 parent 7c52cba commit a6c9ab7

1 file changed: +168 -0 lines changed

STRUCTURE.md

Lines changed: 168 additions & 0 deletions

# Structuring Codegen Examples

This guide explains how to structure examples for the Codegen library. A well-structured example helps both humans and AI understand the code's purpose and how to use it effectively.

## Core Principles

1. **Single Responsibility**: Each example should demonstrate one clear use case
2. **Self-Contained**: Examples should work independently with minimal setup
3. **Clear Structure**: Follow a consistent file organization pattern
4. **Good Documentation**: Include README.md with clear explanations and examples

## Standard File Structure

```
example-name/
├── README.md           # Documentation and usage examples
├── run.py              # Main implementation
├── guide.md            # (Optional) Additional technical details
└── repo-before/        # (Optional) Sample code for transformation
```
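
A layout like the one above could be scaffolded with a few lines of standard-library Python. This is only a sketch: `scaffold_example`, its parameters, and the placeholder file contents are hypothetical and not part of Codegen.

```python
from pathlib import Path


def scaffold_example(root: Path, name: str, with_guide: bool = False,
                     with_sample_repo: bool = False) -> Path:
    """Create the standard example layout under root/name and return its path.

    Hypothetical helper for illustration; existing files are left untouched.
    """
    example_dir = root / name
    example_dir.mkdir(parents=True, exist_ok=True)

    # Required files: documentation and the main implementation.
    placeholders = {
        "README.md": f"# {name}\n",
        "run.py": '"""Main implementation."""\n',
    }
    # Optional file, mirroring the tree above.
    if with_guide:
        placeholders["guide.md"] = "# Technical details\n"

    for filename, content in placeholders.items():
        path = example_dir / filename
        if not path.exists():  # never overwrite work in progress
            path.write_text(content, encoding="utf-8")

    # Optional directory holding sample code for the transformation.
    if with_sample_repo:
        (example_dir / "repo-before").mkdir(exist_ok=True)

    return example_dir
```

Calling `scaffold_example(Path("."), "my-example", with_guide=True)` would create `my-example/` with `README.md`, `run.py`, and `guide.md`.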

## Code Organization in `run.py`

Your `run.py` should follow this structure, demonstrated well in the `generate_training_data` example:

1. **Imports at the top**
   ```python
   import codegen
   from codegen import Codebase
   from codegen.sdk.core import Function
   # ... other imports
   ```

2. **Utility functions with clear docstrings**
   ```python
   def hop_through_imports(imp: Import) -> Symbol | ExternalModule:
       """Finds the root symbol for an import"""
       # Implementation...
   ```

3. **Main Codegen function with decorator**
   ```python
   @codegen.function("your-function-name")
   def run(codebase: Codebase):
       """Clear docstring explaining what the function does.

       Include:
       1. Purpose of the function
       2. Key steps or transformations
       3. Expected output
       """
       # Implementation...
   ```

4. **Entry point at bottom**
   ```python
   if __name__ == "__main__":
       # Initialize codebase
       codebase = Codebase.from_repo("fastapi/fastapi")
       # Run transformation
       run(codebase)
       # Save/display results
   ```

## Working with Codebases

Prefer using public repositories for examples when possible. However, sometimes you need a specific code structure to demonstrate a concept clearly. Here's how to handle both cases:

```python
from codegen import Codebase

# Preferred: Use a well-known public repo that demonstrates the concept well
codebase = Codebase.from_repo("fastapi/fastapi")

# Alternative: Create a minimal example repo when you need specific code structure
# 1. Create an input_repo/ directory in your example
# 2. Add minimal code that clearly demonstrates the transformation
codebase = Codebase("./input_repo")
```

For example:
```
example-name/
├── README.md
├── run.py
└── input_repo/         # Your minimal example code
    ├── app.py
    └── utils.py
```

To choose between these approaches, ask:
1. Can you find a public repo that clearly shows the concept?
2. Is the transformation specific enough that a custom example would be clearer?
3. Would a minimal example be more educational than a complex real-world one?
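
If an example should run in either setup, the entry point could pick its source at runtime. The sketch below is an assumption-laden illustration: `choose_source` is a hypothetical helper, and it only decides which source to use rather than constructing a `Codebase`.

```python
from pathlib import Path


def choose_source(example_dir: Path, public_repo: str) -> tuple[str, str]:
    """Pick the code source for an example (hypothetical helper).

    Returns ("local", path) when example_dir contains a non-empty
    input_repo/ directory, otherwise ("remote", public_repo).
    """
    input_repo = example_dir / "input_repo"
    # An empty input_repo/ has nothing to demonstrate, so fall back.
    if input_repo.is_dir() and any(input_repo.iterdir()):
        return ("local", str(input_repo))
    return ("remote", public_repo)
```

A `"local"` result would then feed `Codebase(path)` and a `"remote"` result `Codebase.from_repo(name)`, the two calls shown earlier in this section.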

## Best Practices

1. **Function Decorator**
   - Always use `@codegen.function()` with a descriptive name
   - Name should match the example's purpose

2. **Utility Functions**
   - Break down complex logic into smaller, focused functions
   - Each utility should demonstrate one clear concept
   - Include type hints and docstrings

3. **Main Function**
   - Name it `run()` for consistency
   - Include comprehensive docstring explaining the transformation
   - Return meaningful data that can be used programmatically

4. **Entry Point**
   - Include a `__name__ == "__main__"` block
   - Show both initialization and execution
   - Add progress messages for better UX

5. **Error Handling**
   - Include appropriate error handling for common cases
   - Provide clear error messages
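
The practices above can be sketched together in one small script. So that it runs without Codegen installed, this sketch makes two loud assumptions: `StubCodebase` is a hypothetical stand-in (not the Codegen `Codebase` API), and the `@codegen.function(...)` decorator is shown only as a comment.

```python
import sys
from dataclasses import dataclass, field


@dataclass
class StubCodebase:
    """Hypothetical stand-in for a codebase; NOT the Codegen API."""
    function_names: list = field(default_factory=list)


def count_private(names: list) -> int:
    """Count names that start with an underscore."""
    return sum(1 for name in names if name.startswith("_"))


# A real example would decorate this with
# @codegen.function("count-private-functions")
def run(codebase: StubCodebase) -> dict:
    """Summarize how many functions are private.

    1. Purpose: report private vs. total function counts.
    2. Steps: read the names, count the underscore-prefixed ones.
    3. Output: a dict with "total" and "private" keys.
    """
    if not codebase.function_names:
        # Clear message for a common failure case.
        raise ValueError("codebase contains no functions to analyze")
    return {
        "total": len(codebase.function_names),
        "private": count_private(codebase.function_names),
    }


def main() -> int:
    """Initialize, run, and report; return a process exit code."""
    print("Initializing codebase...")
    codebase = StubCodebase(["run", "_helper", "main"])
    try:
        print("Running analysis...")
        result = run(codebase)
    except ValueError as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    print(f"Done: {result}")
    return 0


if __name__ == "__main__":
    main()  # a real script might pass this value to sys.exit()
```

Returning a dict from `run()` keeps the result usable by other code, while the printing and error reporting stay in `main()`.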

## Example Reference Implementation

The `generate_training_data` example demonstrates these principles well:

```python
# Focused utility function
def get_function_context(function) -> dict:
    """Get the implementation, dependencies, and usages of a function."""
    # Clear, focused implementation...

# Main transformation with decorator
@codegen.function("generate-training-data")
def run(codebase: Codebase):
    """Generate training data using a node2vec-like approach...

    This codemod:
    1. Finds all functions...
    2. For each function...
    3. Outputs structured JSON...
    """
    # Clear implementation with good structure...

# Clean entry point
if __name__ == "__main__":
    print("Initializing codebase...")
    codebase = Codebase.from_repo("fastapi/fastapi")
    run(codebase)
    # ... rest of execution
```

## Documentation Requirements

Every example should include:

1. **README.md**
   - Clear explanation of purpose
   - Explanation of key syntax and what the program does
   - Code examples showing the transformation (before/after)
   - If using `input_repo/`, an explanation of its structure and contents
   - Output format (if applicable)
   - Setup and running instructions
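
Some of these requirements can be checked mechanically. The sketch below covers only the two that are easy to detect from the file itself; `check_readme` and its messages are hypothetical, and the remaining items still need a human reviewer.

```python
from pathlib import Path

FENCE = "`" * 3  # Markdown code-fence marker


def check_readme(example_dir: Path) -> list:
    """Return a list of README problems for an example (hypothetical helper)."""
    readme = example_dir / "README.md"
    if not readme.is_file():
        return ["README.md is missing"]

    problems = []
    text = readme.read_text(encoding="utf-8")

    # Before/after code examples should appear as fenced blocks.
    if FENCE not in text:
        problems.append("README.md has no fenced code examples")

    # If the example ships its own input_repo/, the README should mention it.
    if (example_dir / "input_repo").is_dir() and "input_repo" not in text:
        problems.append("README.md does not explain input_repo/")

    return problems
```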

## Testing Your Example

Before submitting:

1. Test with a fresh environment
2. Verify all dependencies are listed
3. Ensure the example runs with minimal setup
4. Check that documentation is clear and accurate

Remember: Your example might be used by both humans and AI to understand Codegen's capabilities. Clear structure and documentation help everyone use your code effectively.
