diff --git a/docs/building-with-codegen/files-and-directories.mdx b/docs/building-with-codegen/files-and-directories.mdx index 632b5138b..c03eb1674 100644 --- a/docs/building-with-codegen/files-and-directories.mdx +++ b/docs/building-with-codegen/files-and-directories.mdx @@ -5,12 +5,15 @@ icon: "folder-tree" iconType: "solid" --- -Codegen provides two primary abstractions for working with your codebase's file structure: +Codegen provides three primary abstractions for working with your codebase's file structure: -- [File](/api-reference/core/File) -- [Directory](/api-reference/core/Directory) +- [File](/api-reference/core/File) - Represents a file in the codebase (e.g. README.md, package.json, etc.) +- [SourceFile](/api-reference/core/SourceFile) - Represents a source code file (e.g. Python, TypeScript, React, etc.) +- [Directory](/api-reference/core/Directory) - Represents a directory in the codebase -Both of these expose a rich API for accessing and manipulating their contents. + + [SourceFile](/api-reference/core/SourceFile) is a subclass of [File](/api-reference/core/File) that provides additional functionality for source code files. + This guide explains how to effectively use these classes to manage your codebase. @@ -31,8 +34,10 @@ for file in codebase.files: # Check if a file exists exists = codebase.has_file("path/to/file.py") + ``` + These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories. ```python @@ -50,61 +55,58 @@ dir = file.directory exists = codebase.has_directory("path/to/dir") ``` -## Working with Non-Code Files (README, JSON, etc.) +## Differences between SourceFile and File -By default, Codegen focuses on source code files (Python, TypeScript, etc). 
However, you can access all files in your codebase, including documentation, configuration, and other non-code files like README.md, package.json, or .env: +- [File](/api-reference/core/File) - a general purpose class that represents any file in the codebase including non-code files like README.md, .env, .json, image files, etc. +- [SourceFile](/api-reference/core/SourceFile) - a subclass of [File](/api-reference/core/File) that provides additional functionality for source code files written in languages supported by the [codegen-sdk](/introduction/overview) (Python, TypeScript, JavaScript, React). -```python -# Get all files in the codebase (including README, docs, config files) -files = codebase.files(extensions="*") +The majority of intended use cases involve using exclusively [SourceFile](/api-reference/core/SourceFile) objects as these contain code that can be parsed and manipulated by the [codegen-sdk](/introduction/overview). However, there may be cases where it will be necessary to work with non-code files. In these cases, the [File](/api-reference/core/File) class can be used. -# Print files that are not source code (documentation, config, etc) -for file in files: - if not file.filepath.endswith(('.py', '.ts', '.js')): - print(f"📄 Non-code file: {file.filepath}") -``` - -You can also filter for specific file types: +By default, the `codebase.files` property will only return [SourceFile](/api-reference/core/SourceFile) objects. To include non-code files the `extensions='*'` argument must be used. 
```python -# Get only markdown documentation files -docs = codebase.files(extensions=[".md", ".mdx"]) +# Get all source files in the codebase +source_files = codebase.files -# Get configuration files -config_files = codebase.files(extensions=[".json", ".yaml", ".toml"]) +# Get all files in the codebase (including non-code files) +all_files = codebase.files(extensions="*") ``` -These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories. -## Raw Content and Metadata +When getting a file with `codebase.get_file`, files ending in `.py, .js, .ts, .jsx, .tsx` are returned as [SourceFile](/api-reference/core/SourceFile) objects while other files are returned as [File](/api-reference/core/File) objects. + +Furthermore, you can use the `isinstance` function to check if a file is a [SourceFile](/api-reference/core/SourceFile): ```python -# Grab raw file string content -content = file.content # For text files -print('Length:', len(content)) -print('# of functions:', len(file.functions)) +py_file = codebase.get_file("path/to/file.py") +if isinstance(py_file, SourceFile): + print(f"File {py_file.filepath} is a source file") -# Access file metadata -name = file.name # Base name without extension -extension = file.extension # File extension with dot -filepath = file.filepath # Full relative path -dir = file.directory # Parent directory +# prints: `File path/to/file.py is a source file` -# Access directory metadata -name = dir.name # Base name without extension -path = dir.path # Full relative path from repository root -parent = dir.parent # Parent directory +mdx_file = codebase.get_file("path/to/file.mdx") +if not isinstance(mdx_file, SourceFile): + print(f"File {mdx_file.filepath} is a non-code file") + +# prints: `File path/to/file.mdx is a non-code file` ``` + + Currently, the codebase object can only parse source code files of one language at a time. 
This means that if you want to work with both Python and TypeScript files, you will need to create two separate codebase objects. + + ## Accessing Code -Files and Directories provide several APIs for accessing and iterating over their code. +[SourceFiles](/api-reference/core/SourceFile) and [Directories](/api-reference/core/Directory) provide several APIs for accessing and iterating over their code. See, for example: -- `.functions` ([File](/api-reference/core/File#functions) / [Directory](/api-reference/core/Directory#functions)) - All [Functions](../api-reference/core/Function) in the file/directory -- `.classes` ([File](/api-reference/core/File#classes) / [Directory](/api-reference/core/Directory#classes)) - All [Classes](../api-reference/core/Class) in the file/directory -- `.imports` ([File](/api-reference/core/File#imports) / [Directory](/api-reference/core/Directory#imports)) - All [Imports](../api-reference/core/Import) in the file/directory +- `.functions` ([SourceFile](/api-reference/core/SourceFile#functions) / [Directory](/api-reference/core/Directory#functions)) - All [Functions](/api-reference/core/Function) in the file/directory +- `.classes` ([SourceFile](/api-reference/core/SourceFile#classes) / [Directory](/api-reference/core/Directory#classes)) - All [Classes](/api-reference/core/Class) in the file/directory +- `.imports` ([SourceFile](/api-reference/core/SourceFile#imports) / [Directory](/api-reference/core/Directory#imports)) - All [Imports](/api-reference/core/Import) in the file/directory +- `.get_function(...)` ([SourceFile](/api-reference/core/SourceFile#get-function) / [Directory](/api-reference/core/Directory#get-function)) - Get a specific function by name +- `.get_class(...)` ([SourceFile](/api-reference/core/SourceFile#get-class) / [Directory](/api-reference/core/Directory#get-class)) - Get a specific class by name +- `.get_global_var(...)` ([SourceFile](/api-reference/core/SourceFile#get-global-var) / 
[Directory](/api-reference/core/Directory#get-global-var)) - Get a specific global variable by name ```python @@ -142,9 +144,55 @@ if main_function: print(f"Local var: {var.name} = {var.value}") ``` +## Working with Non-Code Files (README, JSON, etc.) + +By default, Codegen focuses on source code files (Python, TypeScript, etc). However, you can access all files in your codebase, including documentation, configuration, and other non-code [files](/api-reference/core/File) like README.md, package.json, or .env: + +```python +# Get all files in the codebase (including README, docs, config files) +files = codebase.files(extensions="*") + +# Print files that are not source code (documentation, config, etc) +for file in files: + if not file.filepath.endswith(('.py', '.ts', '.js')): + print(f"📄 Non-code file: {file.filepath}") +``` + +You can also filter for specific file types: + +```python +# Get only markdown documentation files +docs = codebase.files(extensions=[".md", ".mdx"]) + +# Get configuration files +config_files = codebase.files(extensions=[".json", ".yaml", ".toml"]) +``` + +These APIs are similar for [Directory](/api-reference/core/Directory), which provides similar methods for accessing files and subdirectories. + +## Raw Content and Metadata + +```python +# Grab raw file string content +content = file.content # For text files +print('Length:', len(content)) +print('# of functions:', len(file.functions)) + +# Access file metadata +name = file.name # Base name without extension +extension = file.extension # File extension with dot +filepath = file.filepath # Full relative path +dir = file.directory # Parent directory + +# Access directory metadata +name = dir.name # Base name without extension +path = dir.path # Full relative path from repository root +parent = dir.parent # Parent directory +``` + ## Editing Files Directly -Files themselves are [`Editable`](../api-reference/core/Editable.mdx) objects, just like Functions and Classes. 
+Files themselves are [`Editable`](/api-reference/core/Editable.mdx) objects, just like Functions and Classes. Learn more about the [Editable API](/building-with-codegen/the-editable-api). @@ -152,12 +200,12 @@ Files themselves are [`Editable`](../api-reference/core/Editable.mdx) objects, j This means they expose many useful operations, including: -- [`File.search`](../api-reference/core/File#search) - Search for all functions named "main" -- [`File.edit`](../api-reference/core/Editable#edit) - Edit the file -- [`File.replace`](../api-reference/core/File#replace) - Replace all instances of a string with another string -- [`File.insert_before`](../api-reference/core/File#insert-before) - Insert text before a specific string -- [`File.insert_after`](../api-reference/core/File#insert-after) - Insert text after a specific string -- [`File.remove`](../api-reference/core/File#remove) - Remove a specific string +- [`File.search`](/api-reference/core/File#search) - Search for all functions named "main" +- [`File.edit`](/api-reference/core/File#edit) - Edit the file +- [`File.replace`](/api-reference/core/File#replace) - Replace all instances of a string with another string +- [`File.insert_before`](/api-reference/core/File#insert-before) - Insert text before a specific string +- [`File.insert_after`](/api-reference/core/File#insert-after) - Insert text after a specific string +- [`File.remove`](/api-reference/core/File#remove) - Remove a specific string ```python # Get a file @@ -183,7 +231,7 @@ file.insert_after("def end():\npass") file.remove() ``` -You can frequently do bulk modifictions via the [`.edit(...)`](../api-reference/core/Editable#edit) method or [`.replace(...)`](../api-reference/core/File#replace) method. +You can frequently do bulk modifications via the [`.edit(...)`](/api-reference/core/Editable#edit) method or [`.replace(...)`](/api-reference/core/File#replace) method. 
Most useful operations will have bespoke APIs that handle edge cases, update @@ -192,7 +240,7 @@ You can frequently do bulk modifictions via the [`.edit(...)`](../api-reference/ ## Moving and Renaming Files -Files can be manipulated through methods like [`File.update_filepath()`](../api-reference/core/File#update-filepath), [`File.rename()`](../api-reference/core/File#rename), and [`File.remove()`](../api-reference/core/File#remove): +Files can be manipulated through methods like [`File.update_filepath()`](/api-reference/core/File#update-filepath), [`File.rename()`](/api-reference/core/File#rename), and [`File.remove()`](/api-reference/core/File#remove): ```python # Move/rename a file @@ -216,7 +264,7 @@ for file in codebase.files: ## Directories -[`Directories`](/api-reference/core/Directory) expose a similar API to the [File](../api-reference/core/File.mdx) class, with the addition of the `subdirectories` property. +[`Directories`](/api-reference/core/Directory) expose a similar API to the [File](/api-reference/core/File.mdx) class, with the addition of the `subdirectories` property. ```python # Get a directory diff --git a/docs/building-with-codegen/the-editable-api.mdx b/docs/building-with-codegen/the-editable-api.mdx index 78124b692..37236c430 100644 --- a/docs/building-with-codegen/the-editable-api.mdx +++ b/docs/building-with-codegen/the-editable-api.mdx @@ -17,7 +17,7 @@ Every Editable provides: - [source](../api-reference/core/Editable#source) - the text content of the Editable - [extended_source](../api-reference/core/Editable#extended_source) - includes relevant content like decorators, comments, etc. 
- Information about the file that contains the Editable: - - [file](../api-reference/core/Editable#file) - the [File](../api-reference/core/File) that contains this Editable + - [file](../api-reference/core/Editable#file) - the [SourceFile](../api-reference/core/SourceFile) that contains this Editable - Relationship tracking - [parent_class](../api-reference/core/Editable#parent-class) - the [Class](../api-reference/core/Class) that contains this Editable - [parent_function](../api-reference/core/Editable#parent-function) - the [Function](../api-reference/core/Function) that contains this Editable diff --git a/docs/tutorials/flask-to-fastapi.mdx b/docs/tutorials/flask-to-fastapi.mdx index 314c92314..ba4fb3fce 100644 --- a/docs/tutorials/flask-to-fastapi.mdx +++ b/docs/tutorials/flask-to-fastapi.mdx @@ -190,7 +190,7 @@ python run.py The script will: -1. Process all Python files in your codebase +1. Process all Python [files](/api-reference/python/PyFile) in your codebase 2. Apply the transformations in the correct order 3. Maintain your code's functionality while updating to FastAPI patterns diff --git a/docs/tutorials/python2-to-python3.mdx b/docs/tutorials/python2-to-python3.mdx index 50abd5361..c72227d14 100644 --- a/docs/tutorials/python2-to-python3.mdx +++ b/docs/tutorials/python2-to-python3.mdx @@ -229,7 +229,7 @@ python run.py The script will: -1. Process all Python files in your codebase +1. Process all Python [files](/api-reference/python/PyFile) in your codebase 2. Apply the transformations in the correct order 3. Maintain your code's functionality while updating to Python 3 syntax diff --git a/docs/tutorials/training-data.mdx b/docs/tutorials/training-data.mdx index 0f4608693..ced20867f 100644 --- a/docs/tutorials/training-data.mdx +++ b/docs/tutorials/training-data.mdx @@ -171,7 +171,7 @@ This will: You can use any Git repository as your source codebase by passing the repo URL - to [Codebase.from_repo(...)](/api-reference/core/codebase#from-repo). 
+ to [Codebase.from_repo(...)](/api-reference/core/Codebase#from-repo). ## Using the Training Data