Commit cfc8452

added tests and ci/cd
1 parent 35c135f commit cfc8452

8 files changed, with 852 additions and 1 deletion

.github/workflows/check.yml

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+name: Code Checks
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+
+concurrency:
+  group:
+    ${{ github.workflow }}-${{ github.ref_name }}-${{
+    github.event.pull_request.number || github.sha }}
+  cancel-in-progress: true
+
+jobs:
+  lint:
+    runs-on: "ubuntu-latest"
+
+    steps:
+      - uses: "actions/checkout@v4"
+
+      - uses: astral-sh/ruff-action@v3
+      - run: ruff check
+      - run: ruff format --check
+
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.11", "3.12", "3.13"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+
+      - name: Install the project
+        run: uv sync --all-extras --dev
+
+      - name: Run tests
+        run: uv run pytest

.github/workflows/publish.yml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+name: Publish
+
+on:
+  release:
+    types: ["published"]
+
+jobs:
+  run:
+    name: "Build and publish release"
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          enable-cache: true
+          cache-dependency-glob: uv.lock
+
+      - name: Set up Python
+        run: uv sync --all-extras --dev
+
+      - name: Build
+        run: uv build
+
+      - name: Publish
+        run: uv publish --token ${{ secrets.PYPI_TOKEN }}

README.md

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+# docs2llm
+
+A command-line tool to extract documentation from local directories and GitHub repositories, formatting it for use as context with Large Language Models (LLMs).
+
+## Purpose
+
+docs2llm helps you capture documentation from codebases to use as context for AI assistants and large language models. It searches for documentation files (markdown, text, etc.), processes them, and creates a single consolidated file that can be used as reference material for LLMs.
+
+## Features
+
+- Extract documentation from local directories or GitHub repositories
+- Automatically identify and process common documentation files
+- Prioritize README files and important documentation
+- Support for multiple file formats (Markdown, RST, TXT)
+- Format output for optimal LLM context
+- Control scan depth to manage output size
+- Clone specific branches from Git repositories
+- Detailed logging with configurable verbosity
+
+## Installation
+
+```bash
+# Install from PyPI
+pip install docs2llm
+
+```
+
+## Usage
+
+### Command Line Interface
+
+```bash
+# Extract docs from a local directory
+docs2llm /path/to/project --output context.txt
+
+# Extract docs from a GitHub repository
+docs2llm --git owner/repo --output context.txt
+
+# Specify a branch
+docs2llm --git owner/repo --branch develop
+
+# Control scan depth
+docs2llm /path/to/project --max-depth 2
+
+# Enable verbose logging
+docs2llm /path/to/project -v
+
+# Write logs to a file
+docs2llm /path/to/project --log-file extraction.log
+```
+
+### Options
+
+- `PATH`: Local directory containing documentation files
+- `--git`: GitHub repository URL or owner/repo format
+- `--output`: Output file name (default: llm_context.txt)
+- `--max-depth`: Maximum directory depth to search (default: 3)
+- `--branch`: Specific branch to clone (only used with --git)
+- `--verbose`, `-v`: Enable verbose logging
+- `--log-file`: Log to this file in addition to console
+
+### Python API
+
+```python
+from docs2llm import extract_documentation
+
+# Extract from local directory
+success = extract_documentation(
+    local_path="/path/to/project",
+    output_file="context.txt",
+    max_depth=3,
+    verbose=True
+)
+
+# Extract from GitHub repository
+success = extract_documentation(
+    git_repo="owner/repo",
+    output_file="context.txt",
+    branch="main",
+    verbose=True
+)
+```
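
Note on the README's scan-depth and README-priority features: the diff does not show how docs2llm implements them, so the following is only a minimal sketch of one way a depth-limited scan could work. The function name `find_doc_files`, the suffix set, and the depth convention (the root counts as depth 0) are assumptions made for illustration, not the project's actual code.

```python
import os
from pathlib import Path

# Assumed from the README's "Markdown, RST, TXT" feature line.
DOC_SUFFIXES = {".md", ".rst", ".txt"}


def find_doc_files(root, max_depth=3):
    """Collect doc files at most max_depth directory levels below root.

    Hypothetical helper: README files sort first, then shallower paths,
    then alphabetical order.
    """
    root = Path(root)
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)
        if depth >= max_depth:
            dirnames[:] = []  # prune in place so os.walk stops descending
        for name in filenames:
            if Path(name).suffix.lower() in DOC_SUFFIXES:
                found.append(Path(dirpath) / name)
    found.sort(
        key=lambda p: (
            not p.name.lower().startswith("readme"),
            len(p.relative_to(root).parts),
            str(p),
        )
    )
    return found
```

Pruning `dirnames` in place avoids walking subtrees that would be discarded anyway, which matters for large repositories.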

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -24,5 +24,6 @@ build-backend = "hatchling.build"
 
 [dependency-groups]
 dev = [
+    "pytest>=8.3.5",
     "ruff>=0.11.5",
 ]

test.log

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+2025-04-14 14:57:06,943 - INFO - Logging to file: test.log
+2025-04-14 14:57:06,943 - DEBUG - Logging initialized in DEBUG mode
+2025-04-14 14:57:06,944 - INFO - Documentation Extractor started
+2025-04-14 14:57:06,944 - DEBUG - Created temporary directory: /var/folders/_b/8q27qz850hq967ptd74hy3mw0000gn/T/tmpl22fi5zv
+2025-04-14 14:57:06,945 - INFO - Cloning repository https://github.com/owner/repo.git to temporary directory
+2025-04-14 14:57:07,305 - ERROR - Git clone failed: Cloning into '/var/folders/_b/8q27qz850hq967ptd74hy3mw0000gn/T/tmpl22fi5zv'...
+remote: Repository not found.
+fatal: repository 'https://github.com/owner/repo.git/' not found
+
+2025-04-14 14:57:07,307 - ERROR - Error during clone: Failed to clone repository: Cloning into '/var/folders/_b/8q27qz850hq967ptd74hy3mw0000gn/T/tmpl22fi5zv'...
+remote: Repository not found.
+fatal: repository 'https://github.com/owner/repo.git/' not found
+
+2025-04-14 14:57:07,308 - ERROR - Repository cloning failed. Exiting.
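
Note on the log above: it shows a clone of `https://github.com/owner/repo.git`, and the README says `--git` accepts either a URL or the `owner/repo` shorthand. How docs2llm performs that expansion is not part of this diff. As a rough sketch only, with the helper name `to_clone_url` being hypothetical, the normalization might look like this:

```python
def to_clone_url(spec):
    """Expand 'owner/repo' to an HTTPS GitHub clone URL; pass URLs through.

    Hypothetical helper for illustration, not taken from docs2llm.
    """
    if "://" in spec or spec.startswith("git@"):
        return spec  # already a full URL or an SSH remote
    owner, sep, repo = spec.strip("/").partition("/")
    if not sep or not owner or not repo or "/" in repo:
        raise ValueError(f"expected 'owner/repo' or a URL, got {spec!r}")
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return f"https://github.com/{owner}/{repo}.git"
```

Rejecting malformed input before invoking `git clone` would surface a clearer error than the "Repository not found" message git reports.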

tests/test_cli.py

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
+import os
+import tempfile
+import pytest
+from click.testing import CliRunner
+from unittest.mock import patch
+from docs2llm.cli import main
+
+
+@pytest.fixture
+def runner():
+    """Provides a Click test runner for CLI testing."""
+    return CliRunner()
+
+
+@pytest.fixture
+def temp_dir():
+    """Creates a temporary directory that is cleaned up after the test."""
+    with tempfile.TemporaryDirectory() as td:
+        yield td
+
+
+def test_cli_help(runner):
+    """Test the CLI help functionality."""
+    result = runner.invoke(main, ["--help"])
+    assert result.exit_code == 0
+    assert "Generate LLM context from documentation" in result.output
+    assert "--git" in result.output
+    assert "--output" in result.output
+    assert "--max-depth" in result.output
+
+
+def test_cli_missing_inputs(runner):
+    """Test that the CLI shows an error when no inputs are provided."""
+    result = runner.invoke(main, [])
+    assert result.exit_code == 1
+    assert "Error: Either a local path or --git option must be provided" in result.output
+
+
+def test_cli_conflicting_inputs(runner):
+    """Test that the CLI shows an error when both local path and git repo are provided."""
+    result = runner.invoke(main, ["local/path", "--git", "https://github.com/owner/repo.git"])
+    assert result.exit_code == 1
+    assert "Error: Cannot specify both a local path and --git" in result.output
+
+
+@patch("docs2llm.cli.extract_documentation")
+def test_cli_local_path(mock_extract, runner, temp_dir):
+    """Test CLI with local path input."""
+    # Configure the mock to return True (success)
+    mock_extract.return_value = True
+
+    # Create a test directory
+    test_dir = os.path.join(temp_dir, "test_docs")
+    os.makedirs(test_dir)
+
+    # Execute the CLI command
+    result = runner.invoke(main, [test_dir, "--output", "test_output.txt"])
+
+    # Verify CLI behavior
+    assert result.exit_code == 0
+
+    # Verify extract_documentation was called with correct arguments
+    mock_extract.assert_called_once_with(
+        local_path=test_dir,
+        git_repo=None,
+        output_file="test_output.txt",
+        max_depth=3,
+        branch=None,
+        verbose=False,
+        log_file=None
+    )
+
+
+@patch("docs2llm.cli.extract_documentation")
+def test_cli_git_repo(mock_extract, runner):
+    """Test CLI with git repository input."""
+    # Configure the mock to return True (success)
+    mock_extract.return_value = True
+
+    # Test URL
+    test_repo = "https://github.com/owner/repo.git"
+
+    # Execute the CLI command
+    result = runner.invoke(main, [
+        "--git", test_repo,
+        "--output", "git_output.txt",
+        "--branch", "main",
+        "--verbose"
+    ])
+
+    # Verify CLI behavior
+    assert result.exit_code == 0
+
+    # Verify extract_documentation was called with correct arguments
+    mock_extract.assert_called_once_with(
+        local_path=None,
+        git_repo=test_repo,
+        output_file="git_output.txt",
+        max_depth=3,
+        branch="main",
+        verbose=True,
+        log_file=None
+    )
+
+
+@patch("docs2llm.cli.extract_documentation")
+def test_cli_with_all_options(mock_extract, runner):
+    """Test CLI with all available options."""
+    # Configure the mock to return True (success)
+    mock_extract.return_value = True
+
+    # Execute the CLI command with all options
+    result = runner.invoke(main, [
+        "--git", "https://github.com/owner/repo.git",
+        "--output", "full_options.txt",
+        "--max-depth", "5",
+        "--branch", "develop",
+        "--verbose",
+        "--log-file", "test.log"
+    ])
+
+    # Verify CLI behavior
+    assert result.exit_code == 0
+
+    # Verify extract_documentation was called with correct arguments
+    mock_extract.assert_called_once_with(
+        local_path=None,
+        git_repo="https://github.com/owner/repo.git",
+        output_file="full_options.txt",
+        max_depth=5,
+        branch="develop",
+        verbose=True,
+        log_file="test.log"
+    )
+
+
+@patch("docs2llm.cli.extract_documentation")
+def test_cli_failure_case(mock_extract, runner):
+    """Test CLI when extraction fails."""
+    # Configure the mock to return False (failure)
+    mock_extract.return_value = False
+
+    # Execute the CLI command
+    result = runner.invoke(main, ["nonexistent/path"])
+
+    # Verify CLI returns error code
+    assert result.exit_code == 1
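
Note on the tests above: `docs2llm/cli.py` is not part of this diff, so the control flow these tests pin down can only be inferred. The sketch below is a standard-library stand-in for that flow, not the real Click command. The function name `run_cli` and its `extract` parameter are hypothetical. The two error strings and the exit codes are copied from the assertions.

```python
def run_cli(path, git, extract):
    """Return (exit_code, message) for the input combinations the tests cover.

    `extract` stands in for extract_documentation and must return True on
    success and False on failure, as the mocks in the tests do.
    """
    if path and git:
        return 1, "Error: Cannot specify both a local path and --git"
    if not path and not git:
        return 1, "Error: Either a local path or --git option must be provided"
    ok = extract(local_path=path, git_repo=git)
    return (0, "") if ok else (1, "")
```

Because validation runs before extraction, `extract` is never called for invalid input. This matches `test_cli_missing_inputs` and `test_cli_conflicting_inputs`, which need no mock.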
