A utility for scanning directory structures and generating reports in various formats, specifically designed to prepare multi-file code projects for Large Language Models (LLMs).
It helps developers get multi-file projects ready for processing by LLMs by:
- Creating a clear text representation of directory structures
- Combining multiple source files into a single document with proper path references
- Generating detailed metadata about files to provide context
- Producing outputs in formats easily digestible by language models
LLMs typically work best with consolidated information rather than separate files. This scanner creates a unified context that preserves file relationships and structure, allowing models to better understand the project architecture.
- Recursive directory scanning
- SHA-256 file hash calculation
- Markdown reports with visual directory structure representation
- Detailed YAML reports
- Unified Markdown file with the content of all text files
- Directory and file filtering
- Timestamped report file names
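The recursive scanning and filtering features listed above can be illustrated with a short sketch. This is not the scanner's actual code: the function name `iter_files` and its parameters are hypothetical, and the sketch assumes directories are excluded by name and extensions are matched case-insensitively.

```python
import os
from pathlib import Path
from typing import Iterable, Iterator


def iter_files(
    root: str,
    exclude_dirs: Iterable[str] = (),
    exclude_exts: Iterable[str] = (),
) -> Iterator[Path]:
    """Yield files under root, skipping excluded directory names and extensions.

    Hypothetical helper for illustration; not part of directory_scanner.py.
    """
    skip_dirs = set(exclude_dirs)
    # Normalise extensions so that "jpg" and ".jpg" are treated the same.
    skip_exts = {"." + ext.lower().lstrip(".") for ext in exclude_exts}

    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix.lower() not in skip_exts:
                yield path
```

Pruning `dirnames` in place means excluded directories such as `node_modules` are never traversed, which matters more for speed than filtering their files afterwards.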
Python 3.6 or higher is required.
- Clone the repository:

      git clone https://github.com/username/directory-scanner.git
      cd directory-scanner

- Install dependencies:

      pip install pyyaml

Run the scanner on a directory:

    python directory_scanner.py /path/to/directory

- --exclude - exclude specified directories from scanning:

      python directory_scanner.py /path/to/directory --exclude node_modules .git __pycache__

- --no-content - disable generation of the file containing the content of all text files:

      python directory_scanner.py /path/to/directory --no-content

- --exclude-ext - exclude files with specified extensions:

      python directory_scanner.py /path/to/directory --exclude-ext jpg png gif

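For readers who want to see how such a command line could be parsed, here is a minimal `argparse` sketch. The flag names come from the options above; everything else (the `build_parser` name, help texts, and the use of `nargs="*"`) is an assumption and may differ from the real script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Scan a directory and generate structure and content reports."
    )
    parser.add_argument("directory", help="directory to scan")
    # nargs="*" lets several names follow one flag: --exclude node_modules .git
    parser.add_argument("--exclude", nargs="*", default=[], metavar="DIR",
                        help="directory names to skip")
    parser.add_argument("--no-content", action="store_true",
                        help="do not generate the combined content file")
    parser.add_argument("--exclude-ext", nargs="*", default=[], metavar="EXT",
                        help="file extensions to skip")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["/path/to/directory", "--exclude", "node_modules", ".git",
     "--exclude-ext", "jpg", "png"]
)
print(args.directory, args.exclude, args.no_content, args.exclude_ext)
```

With multi-value flags like these, the directory argument should come first, as in the examples above; otherwise the flag would also consume the path.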
All reports are saved in the data directory within the current working directory. File names follow the format:
- directory_name_YYYY_MM_DD_HH_MM_SS_structure.md - directory structure in Markdown format
- directory_name_YYYY_MM_DD_HH_MM_SS_structure.yaml - detailed structure in YAML format
- directory_name_YYYY_MM_DD_HH_MM_SS_content.md - content of all text files
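The naming scheme above can be reproduced with `datetime.strftime`. The sketch below is an illustration under assumptions: the helper `report_paths` and its dictionary keys are hypothetical, and the output directory is taken to be `data` relative to the working directory, as described above.

```python
from datetime import datetime
from pathlib import Path


def report_paths(scanned_dir: str, now: datetime, out_dir: str = "data") -> dict:
    """Return the three report paths for one scan (hypothetical helper)."""
    # resolve() turns "." or a trailing slash into a usable directory name.
    name = Path(scanned_dir).resolve().name
    stamp = now.strftime("%Y_%m_%d_%H_%M_%S")
    base = f"{name}_{stamp}"
    return {
        "structure_md": Path(out_dir) / f"{base}_structure.md",
        "structure_yaml": Path(out_dir) / f"{base}_structure.yaml",
        "content_md": Path(out_dir) / f"{base}_content.md",
    }


paths = report_paths("/path/to/project", datetime.now())
for label, path in paths.items():
    print(label, path)
```

Passing the timestamp in as an argument keeps all three file names in sync for a single run.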
The Markdown structure report (_structure.md) presents the directory structure as a tree drawn with ASCII characters.
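One common way to draw such a tree is a recursive walk that carries a prefix string. The sketch below is an assumption about the general technique, not the scanner's own renderer: `render_tree` is a hypothetical name, and the exact branch characters and ordering (directories first) are illustrative choices.

```python
from pathlib import Path
from typing import List


def render_tree(root: Path, prefix: str = "") -> List[str]:
    """Return the lines of an ASCII tree for root (illustrative sketch)."""
    lines: List[str] = []
    # Directories first, then files, each group sorted by name.
    entries = sorted(root.iterdir(), key=lambda p: (p.is_file(), p.name.lower()))
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        branch = "`-- " if last else "|-- "
        suffix = "/" if entry.is_dir() else ""
        lines.append(prefix + branch + entry.name + suffix)
        # Skip symlinked directories to avoid following cycles.
        if entry.is_dir() and not entry.is_symlink():
            child_prefix = prefix + ("    " if last else "|   ")
            lines.extend(render_tree(entry, child_prefix))
    return lines
```

Calling `"\n".join(render_tree(Path("/path/to/project")))` would give a block of text suitable for embedding in a Markdown report.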
The YAML report (_structure.yaml) contains detailed information about each file:
- Relative path
- File name
- Last modification date
- SHA-256 hash of the content
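A per-file record with these four fields could be assembled as follows. The key names (`path`, `name`, `modified`, `sha256`), the date format, and both helper functions are assumptions made for illustration; the real report may use different names and layout.

```python
import hashlib
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_record(path: Path, root: Path) -> dict:
    """Collect the metadata fields listed above (hypothetical key names)."""
    modified = datetime.fromtimestamp(path.stat().st_mtime)
    return {
        "path": path.relative_to(root).as_posix(),
        "name": path.name,
        "modified": modified.isoformat(timespec="seconds"),
        "sha256": sha256_of(path),
    }
```

A list of such dictionaries can be serialized with PyYAML's `yaml.safe_dump`, which fits the `pyyaml` dependency the project installs.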
The content file (_content.md) includes the content of all text files, formatted as follows:
- Header with the file path
- Content in a code block with syntax highlighting
- Removal of excessive spaces and empty lines
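The three formatting steps above might look like this in code. The details are assumptions: the `##` heading level, the `LANGUAGES` mapping, and the reading of "excessive spaces and empty lines" as trailing whitespace plus runs of blank lines are all illustrative, and `tidy` and `format_file` are hypothetical names.

```python
import re
from pathlib import Path

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3

# Hypothetical extension-to-language mapping used for syntax highlighting.
LANGUAGES = {".py": "python", ".js": "javascript", ".md": "markdown", ".yaml": "yaml"}


def tidy(text: str) -> str:
    """Strip trailing whitespace and collapse runs of blank lines to one."""
    lines = [line.rstrip() for line in text.splitlines()]
    collapsed = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return collapsed.strip("\n")


def format_file(rel_path: str, text: str) -> str:
    """Render one file as a header followed by a fenced code block."""
    lang = LANGUAGES.get(Path(rel_path).suffix.lower(), "")
    return f"## {rel_path}\n\n{FENCE}{lang}\n{tidy(text)}\n{FENCE}\n"


print(format_file("src/a.py", "x = 1   \n\n\n\ny = 2\n"))
```

Files with an unknown extension get a plain fence with no language tag, so the output stays valid Markdown.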
Example invocations:

    python directory_scanner.py /path/to/project --exclude node_modules .git build dist
    python directory_scanner.py /path/to/project --exclude node_modules .git __pycache__ .venv
    python directory_scanner.py /path/to/documentation --exclude-ext pdf docx

Yuri - wku@ukr.net