Commit 5c32360

jemeza-codegen and codegen-bot authored
CG-10508: Docs explain differences between SourceFile and File types (#138)
# Motivation

SourceFile is currently not mentioned in the Building with Codegen Section

# Content

Add content explaining the differences between SourceFile and File

- [x] I have added tests for my changes (N/A)
- [x] I have updated the documentation or added new documentation as needed
- [x] I have read and agree to the [Contributor License Agreement](../CLA.md)

---------

Co-authored-by: codegen-bot <[email protected]>
1 parent 97f98ee commit 5c32360
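
The File/SourceFile relationship this commit documents can be sketched in plain Python. This is a standalone illustration, not the codegen-sdk implementation: the `File` and `SourceFile` classes below are hypothetical stand-ins, and `classify_file` is an assumed helper name that mirrors only the extension rule stated in the updated docs (files ending in `.py, .js, .ts, .jsx, .tsx` are returned as SourceFile objects, everything else as File objects).

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Extensions the updated docs list as being returned as SourceFile objects.
SOURCE_EXTENSIONS = {".py", ".js", ".ts", ".jsx", ".tsx"}


@dataclass
class File:
    """Hypothetical stand-in for a generic file (README.md, package.json, ...)."""

    filepath: str

    @property
    def extension(self) -> str:
        # File extension including the leading dot, e.g. ".json"
        return PurePosixPath(self.filepath).suffix


@dataclass
class SourceFile(File):
    """Hypothetical stand-in for a parseable source file; a subclass of File."""


def classify_file(filepath: str) -> File:
    """Return a SourceFile for known source extensions, else a plain File."""
    if PurePosixPath(filepath).suffix in SOURCE_EXTENSIONS:
        return SourceFile(filepath)
    return File(filepath)


py_file = classify_file("path/to/file.py")
mdx_file = classify_file("path/to/file.mdx")

print(isinstance(py_file, SourceFile))   # True
print(isinstance(mdx_file, SourceFile))  # False
print(isinstance(py_file, File))         # True: every SourceFile is also a File
```

Because `SourceFile` subclasses `File`, an `isinstance(f, File)` check cannot tell the two apart; checking against `SourceFile`, as the docs in this diff do, is what separates code from non-code files.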

5 files changed

+101
-53
lines changed

docs/building-with-codegen/files-and-directories.mdx

Lines changed: 97 additions & 49 deletions
Original file line number | Diff line number | Diff line change
@@ -5,12 +5,15 @@ icon: "folder-tree"
55
iconType: "solid"
66
---
77

8-
Codegen provides two primary abstractions for working with your codebase's file structure:
8+
Codegen provides three primary abstractions for working with your codebase's file structure:
99

10-
- [File](/api-reference/core/File)
11-
- [Directory](/api-reference/core/Directory)
10+
- [File](/api-reference/core/File) - Represents a file in the codebase (e.g. README.md, package.json, etc.)
11+
- [SourceFile](/api-reference/core/SourceFile) - Represents a source code file (e.g. Python, TypeScript, React, etc.)
12+
- [Directory](/api-reference/core/Directory) - Represents a directory in the codebase
1213

13-
Both of these expose a rich API for accessing and manipulating their contents.
14+
<Info>
15+
[SourceFile](/api-reference/core/SourceFile) is a subclass of [File](/api-reference/core/File) that provides additional functionality for source code files.
16+
</Info>
1417

1518
This guide explains how to effectively use these classes to manage your codebase.
1619

@@ -31,8 +34,10 @@ for file in codebase.files:
3134

3235
# Check if a file exists
3336
exists = codebase.has_file("path/to/file.py")
37+
3438
```
3539

40+
3641
These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories.
3742

3843
```python
@@ -50,61 +55,58 @@ dir = file.directory
5055
exists = codebase.has_directory("path/to/dir")
5156
```
5257

53-
## Working with Non-Code Files (README, JSON, etc.)
58+
## Differences between SourceFile and File
5459

55-
By default, Codegen focuses on source code files (Python, TypeScript, etc). However, you can access all files in your codebase, including documentation, configuration, and other non-code files like README.md, package.json, or .env:
60+
- [File](/api-reference/core/File) - a general purpose class that represents any file in the codebase including non-code files like README.md, .env, .json, image files, etc.
61+
- [SourceFile](/api-reference/core/SourceFile) - a subclass of [File](/api-reference/core/File) that provides additional functionality for source code files written in languages supported by the [codegen-sdk](/introduction/overview) (Python, TypeScript, JavaScript, React).
5662

57-
```python
58-
# Get all files in the codebase (including README, docs, config files)
59-
files = codebase.files(extensions="*")
63+
The majority of intended use cases involve using exclusively [SourceFile](/api-reference/core/SourceFile) objects as these contain code that can be parsed and manipulated by the [codegen-sdk](/introduction/overview). However, there may be cases where it will be necessary to work with non-code files. In these cases, the [File](/api-reference/core/File) class can be used.
6064

61-
# Print files that are not source code (documentation, config, etc)
62-
for file in files:
63-
if not file.filepath.endswith(('.py', '.ts', '.js')):
64-
print(f"📄 Non-code file: {file.filepath}")
65-
```
66-
67-
You can also filter for specific file types:
65+
By default, the `codebase.files` property will only return [SourceFile](/api-reference/core/SourceFile) objects. To include non-code files the `extensions='*'` argument must be used.
6866

6967
```python
70-
# Get only markdown documentation files
71-
docs = codebase.files(extensions=[".md", ".mdx"])
68+
# Get all source files in the codebase
69+
source_files = codebase.files
7270

73-
# Get configuration files
74-
config_files = codebase.files(extensions=[".json", ".yaml", ".toml"])
71+
# Get all files in the codebase (including non-code files)
72+
all_files = codebase.files(extensions="*")
7573
```
7674

77-
These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories.
7875

79-
## Raw Content and Metadata
76+
When getting a file with `codebase.get_file`, files ending in `.py, .js, .ts, .jsx, .tsx` are returned as [SourceFile](/api-reference/core/SourceFile) objects while other files are returned as [File](/api-reference/core/File) objects.
77+
78+
Furthermore, you can use the `isinstance` function to check if a file is a [SourceFile](/api-reference/core/SourceFile):
8079

8180
```python
82-
# Grab raw file string content
83-
content = file.content # For text files
84-
print('Length:', len(content))
85-
print('# of functions:', len(file.functions))
81+
py_file = codebase.get_file("path/to/file.py")
82+
if isinstance(py_file, SourceFile):
83+
print(f"File {py_file.filepath} is a source file")
8684

87-
# Access file metadata
88-
name = file.name # Base name without extension
89-
extension = file.extension # File extension with dot
90-
filepath = file.filepath # Full relative path
91-
dir = file.directory # Parent directory
85+
# prints: `File path/to/file.py is a source file`
9286

93-
# Access directory metadata
94-
name = dir.name # Base name without extension
95-
path = dir.path # Full relative path from repository root
96-
parent = dir.parent # Parent directory
87+
mdx_file = codebase.get_file("path/to/file.mdx")
88+
if not isinstance(mdx_file, SourceFile):
89+
print(f"File {mdx_file.filepath} is a non-code file")
90+
91+
# prints: `File path/to/file.mdx is a non-code file`
9792
```
9893

94+
<Note>
95+
Currently, the codebase object can only parse source code files of one language at a time. This means that if you want to work with both Python and TypeScript files, you will need to create two separate codebase objects.
96+
</Note>
97+
9998
## Accessing Code
10099

101-
Files and Directories provide several APIs for accessing and iterating over their code.
100+
[SourceFiles](/api-reference/core/SourceFile) and [Directories](/api-reference/core/Directory) provide several APIs for accessing and iterating over their code.
102101

103102
See, for example:
104103

105-
- `.functions` ([File](/api-reference/core/File#functions) / [Directory](/api-reference/core/Directory#functions)) - All [Functions](../api-reference/core/Function) in the file/directory
106-
- `.classes` ([File](/api-reference/core/File#classes) / [Directory](/api-reference/core/Directory#classes)) - All [Classes](../api-reference/core/Class) in the file/directory
107-
- `.imports` ([File](/api-reference/core/File#imports) / [Directory](/api-reference/core/Directory#imports)) - All [Imports](../api-reference/core/Import) in the file/directory
104+
- `.functions` ([SourceFile](/api-reference/core/SourceFile#functions) / [Directory](/api-reference/core/Directory#functions)) - All [Functions](/api-reference/core/Function) in the file/directory
105+
- `.classes` ([SourceFile](/api-reference/core/SourceFile#classes) / [Directory](/api-reference/core/Directory#classes)) - All [Classes](/api-reference/core/Class) in the file/directory
106+
- `.imports` ([SourceFile](/api-reference/core/SourceFile#imports) / [Directory](/api-reference/core/Directory#imports)) - All [Imports](/api-reference/core/Import) in the file/directory
107+
- `.get_function(...)` ([SourceFile](/api-reference/core/SourceFile#get-function) / [Directory](/api-reference/core/Directory#get-function)) - Get a specific function by name
108+
- `.get_class(...)` ([SourceFile](/api-reference/core/SourceFile#get-class) / [Directory](/api-reference/core/Directory#get-class)) - Get a specific class by name
109+
- `.get_global_var(...)` ([SourceFile](/api-reference/core/SourceFile#get-global-var) / [Directory](/api-reference/core/Directory#get-global-var)) - Get a specific global variable by name
108110

109111

110112
```python
@@ -142,22 +144,68 @@ if main_function:
142144
print(f"Local var: {var.name} = {var.value}")
143145
```
144146

147+
## Working with Non-Code Files (README, JSON, etc.)
148+
149+
By default, Codegen focuses on source code files (Python, TypeScript, etc). However, you can access all files in your codebase, including documentation, configuration, and other non-code [files](/api-reference/core/File) like README.md, package.json, or .env:
150+
151+
```python
152+
# Get all files in the codebase (including README, docs, config files)
153+
files = codebase.files(extensions="*")
154+
155+
# Print files that are not source code (documentation, config, etc)
156+
for file in files:
157+
if not file.filepath.endswith(('.py', '.ts', '.js')):
158+
print(f"📄 Non-code file: {file.filepath}")
159+
```
160+
161+
You can also filter for specific file types:
162+
163+
```python
164+
# Get only markdown documentation files
165+
docs = codebase.files(extensions=[".md", ".mdx"])
166+
167+
# Get configuration files
168+
config_files = codebase.files(extensions=[".json", ".yaml", ".toml"])
169+
```
170+
171+
These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories.
172+
173+
## Raw Content and Metadata
174+
175+
```python
176+
# Grab raw file string content
177+
content = file.content # For text files
178+
print('Length:', len(content))
179+
print('# of functions:', len(file.functions))
180+
181+
# Access file metadata
182+
name = file.name # Base name without extension
183+
extension = file.extension # File extension with dot
184+
filepath = file.filepath # Full relative path
185+
dir = file.directory # Parent directory
186+
187+
# Access directory metadata
188+
name = dir.name # Base name without extension
189+
path = dir.path # Full relative path from repository root
190+
parent = dir.parent # Parent directory
191+
```
192+
145193
## Editing Files Directly
146194

147-
Files themselves are [`Editable`](../api-reference/core/Editable.mdx) objects, just like Functions and Classes.
195+
Files themselves are [`Editable`](/api-reference/core/Editable.mdx) objects, just like Functions and Classes.
148196

149197
<Tip>
150198
Learn more about the [Editable API](/building-with-codegen/the-editable-api).
151199
</Tip>
152200

153201
This means they expose many useful operations, including:
154202

155-
- [`File.search`](../api-reference/core/File#search) - Search for all functions named "main"
156-
- [`File.edit`](../api-reference/core/Editable#edit) - Edit the file
157-
- [`File.replace`](../api-reference/core/File#replace) - Replace all instances of a string with another string
158-
- [`File.insert_before`](../api-reference/core/File#insert-before) - Insert text before a specific string
159-
- [`File.insert_after`](../api-reference/core/File#insert-after) - Insert text after a specific string
160-
- [`File.remove`](../api-reference/core/File#remove) - Remove a specific string
203+
- [`File.search`](/api-reference/core/File#search) - Search for all functions named "main"
204+
- [`File.edit`](/api-reference/core/File#edit) - Edit the file
205+
- [`File.replace`](/api-reference/core/File#replace) - Replace all instances of a string with another string
206+
- [`File.insert_before`](/api-reference/core/File#insert-before) - Insert text before a specific string
207+
- [`File.insert_after`](/api-reference/core/File#insert-after) - Insert text after a specific string
208+
- [`File.remove`](/api-reference/core/File#remove) - Remove a specific string
161209

162210
```python
163211
# Get a file
@@ -183,7 +231,7 @@ file.insert_after("def end():\npass")
183231
file.remove()
184232
```
185233

186-
You can frequently do bulk modifictions via the [`.edit(...)`](../api-reference/core/Editable#edit) method or [`.replace(...)`](../api-reference/core/File#replace) method.
234+
You can frequently do bulk modifictions via the [`.edit(...)`](/api-reference/core/Editable#edit) method or [`.replace(...)`](/api-reference/core/File#replace) method.
187235

188236
<Note>
189237
Most useful operations will have bespoke APIs that handle edge cases, update
@@ -192,7 +240,7 @@ You can frequently do bulk modifictions via the [`.edit(...)`](../api-reference/
192240

193241
## Moving and Renaming Files
194242

195-
Files can be manipulated through methods like [`File.update_filepath()`](../api-reference/core/File#update-filepath), [`File.rename()`](../api-reference/core/File#rename), and [`File.remove()`](../api-reference/core/File#remove):
243+
Files can be manipulated through methods like [`File.update_filepath()`](/api-reference/core/File#update-filepath), [`File.rename()`](/api-reference/core/File#rename), and [`File.remove()`](/api-reference/core/File#remove):
196244

197245
```python
198246
# Move/rename a file
@@ -216,7 +264,7 @@ for file in codebase.files:
216264

217265
## Directories
218266

219-
[`Directories`](/api-reference/core/Directory) expose a similar API to the [File](../api-reference/core/File.mdx) class, with the addition of the `subdirectories` property.
267+
[`Directories`](/api-reference/core/Directory) expose a similar API to the [File](/api-reference/core/File.mdx) class, with the addition of the `subdirectories` property.
220268

221269
```python
222270
# Get a directory

docs/building-with-codegen/the-editable-api.mdx

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ Every Editable provides:
1717
- [source](../api-reference/core/Editable#source) - the text content of the Editable
1818
- [extended_source](../api-reference/core/Editable#extended_source) - includes relevant content like decorators, comments, etc.
1919
- Information about the file that contains the Editable:
20-
- [file](../api-reference/core/Editable#file) - the [File](../api-reference/core/File) that contains this Editable
20+
- [file](../api-reference/core/Editable#file) - the [SourceFile](../api-reference/core/SourceFile) that contains this Editable
2121
- Relationship tracking
2222
- [parent_class](../api-reference/core/Editable#parent-class) - the [Class](../api-reference/core/Class) that contains this Editable
2323
- [parent_function](../api-reference/core/Editable#parent-function) - the [Function](../api-reference/core/Function) that contains this Editable

docs/tutorials/flask-to-fastapi.mdx

Lines changed: 1 addition & 1 deletion
@@ -190,7 +190,7 @@ python run.py
190190

191191
The script will:
192192

193-
1. Process all Python files in your codebase
193+
1. Process all Python [files](/api-reference/python/PyFile) in your codebase
194194
2. Apply the transformations in the correct order
195195
3. Maintain your code's functionality while updating to FastAPI patterns
196196

docs/tutorials/python2-to-python3.mdx

Lines changed: 1 addition & 1 deletion
@@ -229,7 +229,7 @@ python run.py
229229

230230
The script will:
231231

232-
1. Process all Python files in your codebase
232+
1. Process all Python [files](/api-reference/python/PyFile) in your codebase
233233
2. Apply the transformations in the correct order
234234
3. Maintain your code's functionality while updating to Python 3 syntax
235235

docs/tutorials/training-data.mdx

Lines changed: 1 addition & 1 deletion
@@ -171,7 +171,7 @@ This will:
171171

172172
<Tip>
173173
You can use any Git repository as your source codebase by passing the repo URL
174-
to [Codebase.from_repo(...)](/api-reference/core/codebase#from-repo).
174+
to [Codebase.from_repo(...)](/api-reference/core/Codebase#from-repo).
175175
</Tip>
176176

177177
## Using the Training Data
